This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


[Bug string/19776] Improve sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S


https://sourceware.org/bugzilla/show_bug.cgi?id=19776

--- Comment #11 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, hjl/erms/hybrid has been created
        at  942d5a67c652603257c4edcf9ee5d05951a454cb (commit)

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=942d5a67c652603257c4edcf9ee5d05951a454cb

commit 942d5a67c652603257c4edcf9ee5d05951a454cb
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Mar 22 09:19:06 2016 -0700

    Use Hybrid_ERMS in mempcpy.S

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=64b4df537063bb843ad07495ec4de0670a8a15fb

commit 64b4df537063bb843ad07495ec4de0670a8a15fb
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Mar 22 09:02:01 2016 -0700

    Use Hybrid_ERMS in memcpy.S

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6680feac496d18c433f4355b81c0f789848965ff

commit 6680feac496d18c433f4355b81c0f789848965ff
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 12:36:03 2016 -0700

    Add Hybrid_ERMS and use it in memset.S

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fe2e2e385789f2ff297bfbc73cb11af8b43b8345

commit fe2e2e385789f2ff297bfbc73cb11af8b43b8345
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 10:34:07 2016 -0700

    Add __memset_avx2_erms and __memset_chk_avx2_erms

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5540fbe17391c4d495ac23559c0f33b08394173d

commit 5540fbe17391c4d495ac23559c0f33b08394173d
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 10:27:58 2016 -0700

    Add avx_unaligned_erms versions of memcpy/mempcpy

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8523b446015dfcbd9c976194f8bed75534472243

commit 8523b446015dfcbd9c976194f8bed75534472243
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 10:07:48 2016 -0700

    Remove mempcpy-*.S

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=35aeba6cd4e06022d08f5e37ecbf94a37c9880f4

commit 35aeba6cd4e06022d08f5e37ecbf94a37c9880f4
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 13:37:31 2016 -0800

    Merge memcpy with mempcpy

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c69e56fd19790341bc1cdf43adb14a7d033b9e16

commit c69e56fd19790341bc1cdf43adb14a7d033b9e16
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 09:22:56 2016 -0700

    Add __memset_sse2_erms and __memset_chk_sse2_erms

        * sysdeps/x86_64/memset.S (__memset_chk_sse2_erms): New
        function.
        (__memset_sse2_erms): Likewise.
        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Test __memset_chk_sse2_erms and
        __memset_sse2_erms.
        * sysdeps/x86_64/sysdep.h (REP_STOSB_THRESHOLD): New.
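
These _erms entry points switch to a single rep stosb once the length passes the REP_STOSB_THRESHOLD cutoff added to sysdep.h. A minimal C sketch of the idea, with a made-up cutoff value and a plain loop standing in for the real SSE2 path:

    #include <stddef.h>

    /* Made-up stand-in for REP_STOSB_THRESHOLD in sysdeps/x86_64/sysdep.h;
       the real cutoff is tuned, not hard-coded like this.  */
    #define REP_STOSB_CUTOFF 2048

    static void *
    memset_erms_sketch (void *dst, int c, size_t n)
    {
      unsigned char *d = dst;
      if (n >= REP_STOSB_CUTOFF)
        {
          /* On ERMS hardware one "rep stosb" fills large buffers well.  */
          __asm__ volatile ("rep stosb"
                            : "+D" (d), "+c" (n)
                            : "a" (c)
                            : "memory");
        }
      else
        {
          /* Small fills: plain byte loop here; the real code uses SSE2 stores.  */
          while (n--)
            *d++ = (unsigned char) c;
        }
      return dst;
    }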

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1e4466336442ac6b5f4537bcd3f641ab8899d47e

commit 1e4466336442ac6b5f4537bcd3f641ab8899d47e
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Mar 18 08:32:05 2016 -0700

    Add sse2_unaligned_erms versions of memcpy/mempcpy

        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Test __memcpy_chk_sse2_unaligned_erms,
        __memcpy_sse2_unaligned_erms, __mempcpy_chk_sse2_unaligned_erms
        and __mempcpy_sse2_unaligned_erms.
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
        (__mempcpy_chk_sse2_unaligned_erms): New function.
        (__mempcpy_sse2_unaligned_erms): Likewise.
        (__memcpy_chk_sse2_unaligned_erms): Likewise.
        (__memcpy_sse2_unaligned_erms): Likewise.
        * sysdeps/x86_64/sysdep.h (REP_MOVSB_THRESHOLD): New.
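
The _unaligned_erms variants follow the same shape for copies: once the size crosses the REP_MOVSB_THRESHOLD cutoff added to sysdep.h, a single rep movsb replaces the vectorized loop. A minimal C sketch of that dispatch, again with a made-up threshold and a byte loop standing in for the unaligned SSE2 path:

    #include <stddef.h>

    /* Made-up stand-in for REP_MOVSB_THRESHOLD in sysdeps/x86_64/sysdep.h.  */
    #define REP_MOVSB_CUTOFF 2048

    static void *
    memcpy_erms_sketch (void *dst, const void *src, size_t n)
    {
      if (n >= REP_MOVSB_CUTOFF)
        {
          /* On ERMS hardware one "rep movsb" handles large copies well.  */
          void *d = dst;
          __asm__ volatile ("rep movsb"
                            : "+D" (d), "+S" (src), "+c" (n)
                            :
                            : "memory");
        }
      else
        {
          /* Small copies: plain byte loop here; the real code uses
             unaligned SSE2/AVX loads and stores.  */
          unsigned char *d = dst;
          const unsigned char *s = src;
          while (n--)
            *d++ = *s++;
        }
      return dst;
    }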

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a6358872399eb796c326572b15a37e504173888b

commit a6358872399eb796c326572b15a37e504173888b
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 7 05:47:26 2016 -0800

    Enable __memcpy_chk_sse2_unaligned

    Check Fast_Unaligned_Load for __memcpy_chk_sse2_unaligned. The new
    selection order is:

    1. __memcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
    2. __memcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
    3. __memcpy_chk_sse2 if SSSE3 isn't available.
    4. __memcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
    5. __memcpy_chk_ssse3

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Check
        Fast_Unaligned_Load to enable __memcpy_chk_sse2_unaligned.
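
The five-step order above (and the matching mempcpy_chk/mempcpy orders in the commits below) is just a chain of feature-bit tests walked at IFUNC resolution time. A minimal, self-contained C sketch of that chain, with plain struct fields standing in for glibc's real cpu-features bits:

    #include <stdio.h>

    /* Hypothetical feature flags standing in for the cpu-features bits
       named in the commit message.  */
    struct cpu
    {
      int avx_fast_unaligned_load;
      int fast_unaligned_load;
      int ssse3;
      int fast_copy_backward;
    };

    static const char *
    select_memcpy_chk (const struct cpu *c)
    {
      if (c->avx_fast_unaligned_load)
        return "__memcpy_chk_avx_unaligned";
      if (c->fast_unaligned_load)
        return "__memcpy_chk_sse2_unaligned";
      if (!c->ssse3)
        return "__memcpy_chk_sse2";
      if (c->fast_copy_backward)
        return "__memcpy_chk_ssse3_back";
      return "__memcpy_chk_ssse3";
    }

    int
    main (void)
    {
      struct cpu c = { 0, 1, 1, 0 };   /* e.g. only Fast_Unaligned_Load set */
      puts (select_memcpy_chk (&c));   /* -> __memcpy_chk_sse2_unaligned   */
      return 0;
    }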

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=acae6b0a49cc462a67bccb7e11a74ac720d98427

commit acae6b0a49cc462a67bccb7e11a74ac720d98427
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 7 05:44:58 2016 -0800

    Enable __mempcpy_chk_sse2_unaligned

    Check Fast_Unaligned_Load for __mempcpy_chk_sse2_unaligned. The new
    selection order is:

    1. __mempcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
    2. __mempcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
    3. __mempcpy_chk_sse2 if SSSE3 isn't available.
    4. __mempcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
    5. __mempcpy_chk_ssse3

        [BZ #19776]
        * sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Check
        Fast_Unaligned_Load to enable __mempcpy_chk_sse2_unaligned.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=cf9b1255b58424eaa9b36a8c6d173fa1dba030c7

commit cf9b1255b58424eaa9b36a8c6d173fa1dba030c7
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 7 05:42:46 2016 -0800

    Enable __mempcpy_sse2_unaligned

    Check Fast_Unaligned_Load for __mempcpy_sse2_unaligned.  The new
    selection order is:

    1. __mempcpy_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
    2. __mempcpy_sse2_unaligned if Fast_Unaligned_Load bit is set.
    3. __mempcpy_sse2 if SSSE3 isn't available.
    4. __mempcpy_ssse3_back if Fast_Copy_Backward bit is set.
    5. __mempcpy_ssse3

        [BZ #19776]
        * sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Check
        Fast_Unaligned_Load to enable __mempcpy_sse2_unaligned.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5751e670acd3102f74b0fa5e5537e32d7e0b59be

commit 5751e670acd3102f74b0fa5e5537e32d7e0b59be
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 17:06:41 2016 -0800

    Add entry points for __mempcpy_sse2_unaligned and _chk functions

    Add entry points for __mempcpy_chk_sse2_unaligned,
    __mempcpy_sse2_unaligned and __memcpy_chk_sse2_unaligned.

        [BZ #19776]
        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Test __memcpy_chk_sse2_unaligned,
        __mempcpy_chk_sse2_unaligned and __mempcpy_sse2_unaligned.
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
        (__mempcpy_chk_sse2_unaligned): New.
        (__mempcpy_sse2_unaligned): Likewise.
        (__memcpy_chk_sse2_unaligned): Likewise.
        (L(start)): New label.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=63564757b905fc0f874c387fb905aa3757bb3605

commit 63564757b905fc0f874c387fb905aa3757bb3605
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 16:52:53 2016 -0800

    Remove L(overlapping) from memcpy-sse2-unaligned.S

    Since memcpy doesn't need to check overlapping source and destination,
    we can remove L(overlapping).

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
        (L(overlapping)): Removed.
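
This follows from the C library contract: memcpy is only defined for non-overlapping buffers, and callers who need overlap must use memmove, so the overlap path was unreachable for any conforming caller. A small standalone illustration of that contract:

    #include <string.h>
    #include <stdio.h>

    int
    main (void)
    {
      char buf[16] = "abcdefgh";

      /* Overlapping regions: only memmove is defined here.  */
      memmove (buf + 2, buf, 6);        /* buf becomes "ababcdef"            */

      /* memcpy (buf + 2, buf, 6) with the same arguments would be undefined
         behavior, so the implementation never needs an L(overlapping) path.  */

      char dst[16];
      memcpy (dst, buf, 9);             /* disjoint buffers: always fine     */
      printf ("%s %s\n", buf, dst);
      return 0;
    }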

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=30e33b2eb15afcbd7039fbfa3202166c0f8af1d1

commit 30e33b2eb15afcbd7039fbfa3202166c0f8af1d1
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 13:46:54 2016 -0800

    Don't use RAX as scratch register

    To prepare sharing code with mempcpy, don't use RAX as scratch register
    so that RAX can be set to the return value at entrance.

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Don't use
        RAX as scratch register.
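
The constraint comes from the differing return values once memcpy and mempcpy share one body: memcpy returns dest while mempcpy returns dest + n, and on x86-64 both come back in RAX, so the register has to stay live with the result. A small illustration of the two contracts (mempcpy is a GNU extension, hence _GNU_SOURCE):

    #define _GNU_SOURCE
    #include <string.h>
    #include <assert.h>

    int
    main (void)
    {
      char src[8] = "1234567";
      char dst[8];

      /* memcpy returns the destination pointer...  */
      assert (memcpy (dst, src, 8) == dst);

      /* ...while mempcpy returns one past the last byte written, which is
         why shared code wants RAX holding the result from entry.  */
      assert (mempcpy (dst, src, 8) == dst + 8);
      return 0;
    }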

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=b4f5ff76b6b7974a8c8baea8e70088a1458c96a2

commit b4f5ff76b6b7974a8c8baea8e70088a1458c96a2
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sun Mar 6 14:16:32 2016 -0800

    Remove dead code from memcpy-sse2-unaligned.S

    The function starts with

    ENTRY(__memcpy_sse2_unaligned)
       movq  %rsi, %rax
       leaq  (%rdx,%rdx), %rcx
       subq  %rdi, %rax
       subq  %rdx, %rax
       cmpq  %rcx, %rax
       jb L(overlapping)

    When that branch is taken,

       cmpq  %rsi, %rdi
       jae   .L3

    will never be taken.  We can remove the dead code.

        [BZ #19776]
        * sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S (.L3): Removed.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=67160949c9f01fd3f258c04f2eb5cee67739c503

commit 67160949c9f01fd3f258c04f2eb5cee67739c503
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Apr 11 08:51:16 2014 -0700

    Test 32-bit ERMS memcpy/memset

        * sysdeps/i386/i686/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Add __bcopy_erms, __bzero_erms,
        __memmove_chk_erms, __memmove_erms, __memset_chk_erms,
        __memset_erms, __memcpy_chk_erms, __memcpy_erms,
        __mempcpy_chk_erms and __mempcpy_erms.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a4d669225600cf717543df1627bb47fa1c08fbe8

commit a4d669225600cf717543df1627bb47fa1c08fbe8
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Apr 11 08:25:17 2014 -0700

    Test 64-bit ERMS memcpy/memset

        * sysdeps/x86_64/multiarch/ifunc-impl-list.c
        (__libc_ifunc_impl_list): Add __memmove_chk_erms,
        __memmove_erms, __memset_erms, __memset_chk_erms,
        __memcpy_chk_erms, __memcpy_erms, __mempcpy_chk_erms and
        __mempcpy_erms.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=74aea35c82d147f0270bebefae89a66dfb191b1f

commit 74aea35c82d147f0270bebefae89a66dfb191b1f
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Sep 21 15:21:28 2011 -0700

    Add 32-bit ERMS memcpy/memset

        * sysdeps/i386/i686/multiarch/Makefile (sysdep_routines): Add
        bcopy-erms, memcpy-erms, memmove-erms, mempcpy-erms, bzero-erms
        and memset-erms.
        * sysdeps/i386/i686/multiarch/bcopy-erms.S: New file.
        * sysdeps/i386/i686/multiarch/bzero-erms.S: Likewise.
        * sysdeps/i386/i686/multiarch/memcpy-erms.S: Likewise.
        * sysdeps/i386/i686/multiarch/memmove-erms.S: Likewise.
        * sysdeps/i386/i686/multiarch/mempcpy-erms.S: Likewise.
        * sysdeps/i386/i686/multiarch/memset-erms.S: Likewise.
        * sysdeps/i386/i686/multiarch/ifunc-defines.sym: Add
        COMMON_CPUID_INDEX_7.
        * sysdeps/i386/i686/multiarch/bcopy.S: Enable ERMS optimization
        for Fast_ERMS.
        * sysdeps/i386/i686/multiarch/bzero.S: Likewise.
        * sysdeps/i386/i686/multiarch/memcpy.S: Likewise.
        * sysdeps/i386/i686/multiarch/memcpy_chk.S: Likewise.
        * sysdeps/i386/i686/multiarch/memmove.S: Likewise.
        * sysdeps/i386/i686/multiarch/memmove_chk.S: Likewise.
        * sysdeps/i386/i686/multiarch/mempcpy.S: Likewise.
        * sysdeps/i386/i686/multiarch/mempcpy_chk.S: Likewise.
        * sysdeps/i386/i686/multiarch/memset.S: Likewise.
        * sysdeps/i386/i686/multiarch/memset_chk.S: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a81f1a5e58f9360afd079093ea6770bb9e570a2a

commit a81f1a5e58f9360afd079093ea6770bb9e570a2a
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Sep 15 16:16:10 2011 -0700

    Add 64-bit ERMS memcpy and memset

        * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
        memcpy-erms, mempcpy-erms, memmove-erms and memset-erms.
        * sysdeps/x86_64/multiarch/memcpy-erms.S: New.
        * sysdeps/x86_64/multiarch/memmove-erms.S: Likewise.
        * sysdeps/x86_64/multiarch/mempcpy-erms.S: Likewise.
        * sysdeps/x86_64/multiarch/memset-erms.S: Likewise.
        * sysdeps/x86_64/multiarch/memcpy.S: Enable ERMS optimization
        for Fast_ERMS.
        * sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
        * sysdeps/x86_64/multiarch/memmove.c: Likewise.
        * sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
        * sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
        * sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
        * sysdeps/x86_64/multiarch/memset.S: Likewise.
        * sysdeps/x86_64/multiarch/memset_chk.S: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=487dc028667e43ed55f407fda3c07fc31ecd1554

commit 487dc028667e43ed55f407fda3c07fc31ecd1554
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Sep 15 15:47:01 2011 -0700

    Initial ERMS support

        * sysdeps/x86/cpu-features.h (bit_arch_Fast_ERMS): New.
        (bit_cpu_ERMS): Likewise.
        (index_cpu_ERMS): Likewise.
        (index_arch_Fast_ERMS): Likewise.
        (reg_ERMS): Likewise.
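
The new cpu bit corresponds to the ERMS (Enhanced REP MOVSB/STOSB) CPUID flag, reported in leaf 7, sub-leaf 0, EBX bit 9. A standalone detection sketch, separate from glibc's cpu-features plumbing:

    #include <cpuid.h>
    #include <stdio.h>

    /* ERMS is CPUID.(EAX=7, ECX=0):EBX bit 9.  This is a plain check,
       not glibc's index_cpu_ERMS / reg_ERMS machinery.  */
    int
    main (void)
    {
      unsigned int eax, ebx, ecx, edx;

      if (!__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
        {
          puts ("CPUID leaf 7 not supported");
          return 1;
        }
      puts ((ebx & (1u << 9)) ? "ERMS supported" : "ERMS not supported");
      return 0;
    }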

-----------------------------------------------------------------------

-- 
You are receiving this mail because:
You are on the CC list for the bug.
