This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug dynamic-link/21871] _dl_runtime_resolve_avx_opt is slower than _dl_runtime_resolve_avx_slow


https://sourceware.org/bugzilla/show_bug.cgi?id=21871

--- Comment #7 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, release/2.26/master has been updated
       via  799859f6635d68487ea2472bd79d96a7639a1ab1 (commit)
      from  a4e5aa1a443cfad09bc98f9bb527995371a53a88 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=799859f6635d68487ea2472bd79d96a7639a1ab1

commit 799859f6635d68487ea2472bd79d96a7639a1ab1
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Aug 6 10:44:30 2017 -0700

    x86-64: Use _dl_runtime_resolve_opt only with AVX512F [BZ #21871]

    On AVX machines with XGETBV (ECX == 1) like Skylake processors,

    (gdb) disass _dl_runtime_resolve_avx_opt
    Dump of assembler code for function _dl_runtime_resolve_avx_opt:
       0x0000000000015890 <+0>:  push   %rax
       0x0000000000015891 <+1>:  push   %rcx
       0x0000000000015892 <+2>:  push   %rdx
       0x0000000000015893 <+3>:  mov    $0x1,%ecx
       0x0000000000015898 <+8>:  xgetbv
       0x000000000001589b <+11>: mov    %eax,%r11d
       0x000000000001589e <+14>: pop    %rdx
       0x000000000001589f <+15>: pop    %rcx
       0x00000000000158a0 <+16>: pop    %rax
       0x00000000000158a1 <+17>: and    $0x4,%r11d
       0x00000000000158a5 <+21>: bnd je 0x16200 <_dl_runtime_resolve_sse_vex>
    End of assembler dump.

    is slower than:

    (gdb) disass _dl_runtime_resolve_avx_slow
    Dump of assembler code for function _dl_runtime_resolve_avx_slow:
       0x0000000000015850 <+0>:  vorpd  %ymm0,%ymm1,%ymm8
       0x0000000000015854 <+4>:  vorpd  %ymm2,%ymm3,%ymm9
       0x0000000000015858 <+8>:  vorpd  %ymm4,%ymm5,%ymm10
       0x000000000001585c <+12>: vorpd  %ymm6,%ymm7,%ymm11
       0x0000000000015860 <+16>: vorpd  %ymm8,%ymm9,%ymm9
       0x0000000000015865 <+21>: vorpd  %ymm10,%ymm11,%ymm10
       0x000000000001586a <+26>: vpcmpeqd %xmm8,%xmm8,%xmm8
       0x000000000001586f <+31>: vorpd  %ymm9,%ymm10,%ymm10
       0x0000000000015874 <+36>: vptest %ymm10,%ymm8
       0x0000000000015879 <+41>: bnd jae 0x158b0 <_dl_runtime_resolve_avx>
       0x000000000001587c <+44>: vzeroupper
       0x000000000001587f <+47>: bnd jmpq 0x16200 <_dl_runtime_resolve_sse_vex>
    End of assembler dump.
    (gdb)

    since xgetbv takes many more cycles than single-cycle operations like
    vorpd/vpcmpeqd/vptest.  _dl_runtime_resolve_opt should be used only with
    AVX512, where AVX512 instructions lead to lower CPU frequency on Skylake
    server.

        [BZ #21871]
        * sysdeps/x86/cpu-features.c (init_cpu_features): Set
        bit_arch_Use_dl_runtime_resolve_opt only with AVX512F.

    (cherry picked from commit d2cf37c0a2a375cf2fde69f1afbcc49e45368fc4)

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                  |    6 ++++++
 sysdeps/x86/cpu-features.c |    7 +++++--
 2 files changed, 11 insertions(+), 2 deletions(-)
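
The policy change described in the log above can be modelled with a
short C sketch. This is only an illustration under stated assumptions,
not the code in sysdeps/x86/cpu-features.c: struct cpu_caps, its fields,
and both function names are hypothetical, and the "before" rule is
inferred from the commit message's statement that AVX machines with
XGETBV (ECX == 1) were using the XGETBV-based resolver.

```c
#include <stdbool.h>

/* Hypothetical model of the CPU properties named in the commit message.
   These fields are illustrative; they are not glibc's cpu_features.  */
struct cpu_caps
{
  bool avx_usable;      /* AVX state is enabled and usable.  */
  bool avx512f_usable;  /* AVX512F state is enabled and usable.  */
  bool xgetbv_ecx1;     /* XGETBV accepts ECX == 1.  */
};

/* Assumed policy before the change: any AVX machine with
   XGETBV (ECX == 1) was eligible for the XGETBV-based check.  */
static bool
use_resolve_opt_before (const struct cpu_caps *c)
{
  return c->avx_usable && c->xgetbv_ecx1;
}

/* Policy after the change: the XGETBV-based check is eligible only
   when AVX512F is usable as well.  */
static bool
use_resolve_opt_after (const struct cpu_caps *c)
{
  return c->avx512f_usable && c->xgetbv_ecx1;
}
```

Under this model, a machine with AVX and XGETBV (ECX == 1) but without
AVX512F stops being eligible for the XGETBV-based path, while a machine
with AVX512F remains eligible.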

