This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug string/19776] Improve sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
- From: "cvs-commit at gcc dot gnu.org" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Tue, 22 Mar 2016 16:26:11 +0000
- Subject: [Bug string/19776] Improve sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
- Auto-submitted: auto-generated
- References: <bug-19776-131 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=19776
--- Comment #11 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".
The branch, hjl/erms/hybrid has been created
at 942d5a67c652603257c4edcf9ee5d05951a454cb (commit)
- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=942d5a67c652603257c4edcf9ee5d05951a454cb
commit 942d5a67c652603257c4edcf9ee5d05951a454cb
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Mar 22 09:19:06 2016 -0700
Use Hybrid_ERMS in mempcpy.S
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=64b4df537063bb843ad07495ec4de0670a8a15fb
commit 64b4df537063bb843ad07495ec4de0670a8a15fb
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Mar 22 09:02:01 2016 -0700
Use Hybrid_ERMS in memcpy.S
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6680feac496d18c433f4355b81c0f789848965ff
commit 6680feac496d18c433f4355b81c0f789848965ff
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Mar 18 12:36:03 2016 -0700
Add Hybrid_ERMS and use it in memset.S
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=fe2e2e385789f2ff297bfbc73cb11af8b43b8345
commit fe2e2e385789f2ff297bfbc73cb11af8b43b8345
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Mar 18 10:34:07 2016 -0700
Add __memset_avx2_erms and __memset_chk_avx2_erms
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5540fbe17391c4d495ac23559c0f33b08394173d
commit 5540fbe17391c4d495ac23559c0f33b08394173d
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Mar 18 10:27:58 2016 -0700
Add avx_unaligned_erms versions of memcpy/mempcpy
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8523b446015dfcbd9c976194f8bed75534472243
commit 8523b446015dfcbd9c976194f8bed75534472243
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Mar 18 10:07:48 2016 -0700
Remove mempcpy-*.S
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=35aeba6cd4e06022d08f5e37ecbf94a37c9880f4
commit 35aeba6cd4e06022d08f5e37ecbf94a37c9880f4
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Mar 6 13:37:31 2016 -0800
Merge memcpy with mempcpy
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c69e56fd19790341bc1cdf43adb14a7d033b9e16
commit c69e56fd19790341bc1cdf43adb14a7d033b9e16
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Mar 18 09:22:56 2016 -0700
Add __memset_sse2_erms and __memset_chk_sse2_erms
* sysdeps/x86_64/memset.S (__memset_chk_sse2_erms): New
function.
(__memset_sse2_erms): Likewise.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Test __memset_chk_sse2_erms and
__memset_sse2_erms.
* sysdeps/x86_64/sysdep.h (REP_STOSB_THRESHOLD): New.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1e4466336442ac6b5f4537bcd3f641ab8899d47e
commit 1e4466336442ac6b5f4537bcd3f641ab8899d47e
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Mar 18 08:32:05 2016 -0700
Add sse2_unaligned_erms versions of memcpy/mempcpy
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Test __memcpy_chk_sse2_unaligned_erms,
__memcpy_sse2_unaligned_erms, __mempcpy_chk_sse2_unaligned_erms
and __mempcpy_sse2_unaligned_erms.
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
(__mempcpy_chk_sse2_unaligned_erms): New function.
(__mempcpy_sse2_unaligned_erms): Likewise.
(__memcpy_chk_sse2_unaligned_erms): Likewise.
(__memcpy_sse2_unaligned_erms): Likewise.
* sysdeps/x86_64/sysdep.h (REP_MOVSB_THRESHOLD): New.
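The log only records that REP_MOVSB_THRESHOLD is new. As a rough illustration of how such a threshold is presumably used, the C sketch below switches to `rep movsb` once a copy reaches a given size and uses an ordinary copy below it. This is not the glibc assembly: the value 2048 and the names `SKETCH_REP_MOVSB_THRESHOLD`, `sketch_use_rep_movsb` and `sketch_hybrid_memcpy` are assumptions made for the example, and the inline assembly assumes GCC or Clang on x86-64 (other targets fall back to `memcpy`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical value; the commit log does not state the real threshold. */
#define SKETCH_REP_MOVSB_THRESHOLD 2048

/* Returns 1 if this sketch would pick `rep movsb` for an n-byte copy. */
int sketch_use_rep_movsb(size_t n)
{
    return n >= SKETCH_REP_MOVSB_THRESHOLD;
}

/* Copies n bytes from src to dst and returns the original dst. */
void *sketch_hybrid_memcpy(void *dst, const void *src, size_t n)
{
    void *ret = dst;

    assert(n == 0 || (dst != NULL && src != NULL));
#if defined(__x86_64__) && defined(__GNUC__)
    if (sketch_use_rep_movsb(n)) {
        /* rep movsb copies RCX bytes from [RSI] to [RDI]; the ABI
           guarantees the direction flag is clear on function entry. */
        __asm__ volatile("rep movsb"
                         : "+D"(dst), "+S"(src), "+c"(n)
                         :
                         : "memory");
        return ret;
    }
#endif
    /* Small copies, and all copies on other targets, take this path. */
    memcpy(dst, src, n);
    return ret;
}
```

A real implementation would pick the threshold from measurements on the target CPU; the constant here only marks where that decision sits.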
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a6358872399eb796c326572b15a37e504173888b
commit a6358872399eb796c326572b15a37e504173888b
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Mar 7 05:47:26 2016 -0800
Enable __memcpy_chk_sse2_unaligned
Check Fast_Unaligned_Load for __memcpy_chk_sse2_unaligned. The new
selection order is:
1. __memcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
2. __memcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
3. __memcpy_chk_sse2 if SSSE3 isn't available.
4. __memcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
5. __memcpy_chk_ssse3
[BZ #19776]
* sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Check
Fast_Unaligned_Load to enable __memcpy_chk_sse2_unaligned.
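The five-step selection order listed in this commit message can be read as a chain of feature tests. The C sketch below restates it that way. The real selector is IFUNC assembly in the multiarch directory; the struct, enum and function names here are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CPU feature bits named in the log. */
struct sketch_cpu_features {
    bool avx_fast_unaligned_load;
    bool fast_unaligned_load;
    bool ssse3;
    bool fast_copy_backward;
};

/* One enumerator per implementation in the listed order. */
enum sketch_impl {
    SKETCH_AVX_UNALIGNED,   /* __memcpy_chk_avx_unaligned  */
    SKETCH_SSE2_UNALIGNED,  /* __memcpy_chk_sse2_unaligned */
    SKETCH_SSE2,            /* __memcpy_chk_sse2           */
    SKETCH_SSSE3_BACK,      /* __memcpy_chk_ssse3_back     */
    SKETCH_SSSE3            /* __memcpy_chk_ssse3          */
};

/* Applies the tests in the order given in the commit message. */
enum sketch_impl sketch_select_memcpy_chk(const struct sketch_cpu_features *f)
{
    assert(f != NULL);
    if (f->avx_fast_unaligned_load)
        return SKETCH_AVX_UNALIGNED;
    if (f->fast_unaligned_load)
        return SKETCH_SSE2_UNALIGNED;
    if (!f->ssse3)
        return SKETCH_SSE2;
    if (f->fast_copy_backward)
        return SKETCH_SSSE3_BACK;
    return SKETCH_SSSE3;
}
```

Because the tests run in order, a CPU with Fast_Unaligned_Load set gets the sse2_unaligned variant even when SSSE3 is also available, which is the behavior change this commit describes.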
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=acae6b0a49cc462a67bccb7e11a74ac720d98427
commit acae6b0a49cc462a67bccb7e11a74ac720d98427
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Mar 7 05:44:58 2016 -0800
Enable __mempcpy_chk_sse2_unaligned
Check Fast_Unaligned_Load for __mempcpy_chk_sse2_unaligned. The new
selection order is:
1. __mempcpy_chk_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
2. __mempcpy_chk_sse2_unaligned if Fast_Unaligned_Load bit is set.
3. __mempcpy_chk_sse2 if SSSE3 isn't available.
4. __mempcpy_chk_ssse3_back if Fast_Copy_Backward bit is set.
5. __mempcpy_chk_ssse3
[BZ #19776]
* sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Check
Fast_Unaligned_Load to enable __mempcpy_chk_sse2_unaligned.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=cf9b1255b58424eaa9b36a8c6d173fa1dba030c7
commit cf9b1255b58424eaa9b36a8c6d173fa1dba030c7
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Mon Mar 7 05:42:46 2016 -0800
Enable __mempcpy_sse2_unaligned
Check Fast_Unaligned_Load for __mempcpy_sse2_unaligned. The new
selection order is:
1. __mempcpy_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
2. __mempcpy_sse2_unaligned if Fast_Unaligned_Load bit is set.
3. __mempcpy_sse2 if SSSE3 isn't available.
4. __mempcpy_ssse3_back if Fast_Copy_Backward bit is set.
5. __mempcpy_ssse3
[BZ #19776]
* sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Check
Fast_Unaligned_Load to enable __mempcpy_sse2_unaligned.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5751e670acd3102f74b0fa5e5537e32d7e0b59be
commit 5751e670acd3102f74b0fa5e5537e32d7e0b59be
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Mar 6 17:06:41 2016 -0800
Add entry points for __mempcpy_sse2_unaligned and _chk functions
Add entry points for __mempcpy_chk_sse2_unaligned,
__mempcpy_sse2_unaligned and __memcpy_chk_sse2_unaligned.
[BZ #19776]
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Test __memcpy_chk_sse2_unaligned,
__mempcpy_chk_sse2_unaligned and __mempcpy_sse2_unaligned.
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
(__mempcpy_chk_sse2_unaligned): New.
(__mempcpy_sse2_unaligned): Likewise.
(__memcpy_chk_sse2_unaligned): Likewise.
(L(start)): New label.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=63564757b905fc0f874c387fb905aa3757bb3605
commit 63564757b905fc0f874c387fb905aa3757bb3605
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Mar 6 16:52:53 2016 -0800
Remove L(overlapping) from memcpy-sse2-unaligned.S
Since memcpy doesn't need to check for overlapping source and
destination, we can remove L(overlapping).
[BZ #19776]
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S
(L(overlapping)): Removed.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=30e33b2eb15afcbd7039fbfa3202166c0f8af1d1
commit 30e33b2eb15afcbd7039fbfa3202166c0f8af1d1
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Mar 6 13:46:54 2016 -0800
Don't use RAX as scratch register
To prepare for sharing code with mempcpy, don't use RAX as a scratch
register, so that RAX can be set to the return value on entry.
[BZ #19776]
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Don't use
RAX as scratch register.
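The reason this helps is that memcpy returns the destination pointer while mempcpy returns the destination plus the length, so the two differ only in the value left in RAX. The C sketch below shows the same idea: each entry point fixes its return value first and then runs a shared body that never touches it. The function names are hypothetical and the byte loop stands in for the real SSE2 code.

```c
#include <assert.h>
#include <stddef.h>

/* Shared copy body. It has no access to the caller's return value,
   mirroring an assembly body that leaves RAX alone. */
static void sketch_copy_body(unsigned char *dst, const unsigned char *src,
                             size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* memcpy-style entry: the return value is the destination. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    void *ret = dst;                        /* fixed on entry */
    sketch_copy_body(dst, src, n);
    return ret;
}

/* mempcpy-style entry: the return value is one past the last byte written. */
void *sketch_mempcpy(void *dst, const void *src, size_t n)
{
    void *ret = (unsigned char *)dst + n;   /* fixed on entry */
    sketch_copy_body(dst, src, n);
    return ret;
}
```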
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=b4f5ff76b6b7974a8c8baea8e70088a1458c96a2
commit b4f5ff76b6b7974a8c8baea8e70088a1458c96a2
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Mar 6 14:16:32 2016 -0800
Remove dead code from memcpy-sse2-unaligned.S
The function starts with
    ENTRY(__memcpy_sse2_unaligned)
        movq  %rsi, %rax
        leaq  (%rdx,%rdx), %rcx
        subq  %rdi, %rax
        subq  %rdx, %rax
        cmpq  %rcx, %rax
        jb    L(overlapping)
When that branch is taken, the branch in
        cmpq  %rsi, %rdi
        jae   .L3
will never be taken. We can remove the dead code.
[BZ #19776]
* sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S (.L3): Removed.
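The dead-code claim follows from the arithmetic: the first branch is taken only when src - dst - n, computed modulo 2^64, is below 2n, which for realistic sizes means src is at least n bytes above dst, so dst >= src cannot hold. The C sketch below models both tests with 64-bit unsigned arithmetic and counts, over a small range, how often both would be taken. The function names are hypothetical, and the check is a brute-force illustration under the assumption that sizes stay far below 2^64, not a proof.

```c
#include <assert.h>
#include <stdint.h>

/* Models the entry sequence: RAX = src - dst - n, RCX = 2 * n, and
   `jb` as an unsigned RAX < RCX test. */
int sketch_overlapping_taken(uint64_t dst, uint64_t src, uint64_t n)
{
    uint64_t rax = src - dst - n;   /* wraps modulo 2^64, like subq */
    uint64_t rcx = n + n;           /* leaq (%rdx,%rdx), %rcx */
    return rax < rcx;
}

/* Models `cmpq %rsi, %rdi; jae .L3`: taken when dst >= src (unsigned). */
int sketch_l3_taken(uint64_t dst, uint64_t src)
{
    return dst >= src;
}

/* Counts (dst, src, n) triples below limit where both branches are taken. */
unsigned sketch_count_both_taken(uint64_t limit)
{
    unsigned hits = 0;

    assert(limit <= 256);   /* keep the exhaustive loop small */
    for (uint64_t dst = 0; dst < limit; dst++)
        for (uint64_t src = 0; src < limit; src++)
            for (uint64_t n = 0; n < limit; n++)
                if (sketch_overlapping_taken(dst, src, n)
                    && sketch_l3_taken(dst, src))
                    hits++;
    return hits;
}
```

Calling `sketch_count_both_taken(48)` finds no such triple, which matches the commit's statement that the `.L3` path is unreachable from L(overlapping).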
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=67160949c9f01fd3f258c04f2eb5cee67739c503
commit 67160949c9f01fd3f258c04f2eb5cee67739c503
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Apr 11 08:51:16 2014 -0700
Test 32-bit ERMS memcpy/memset
* sysdeps/i386/i686/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add __bcopy_erms, __bzero_erms,
__memmove_chk_erms, __memmove_erms, __memset_chk_erms,
__memset_erms, __memcpy_chk_erms, __memcpy_erms,
__mempcpy_chk_erms and __mempcpy_erms.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a4d669225600cf717543df1627bb47fa1c08fbe8
commit a4d669225600cf717543df1627bb47fa1c08fbe8
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Apr 11 08:25:17 2014 -0700
Test 64-bit ERMS memcpy/memset
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add __memmove_chk_erms,
__memmove_erms, __memset_erms, __memset_chk_erms,
__memcpy_chk_erms, __memcpy_erms, __mempcpy_chk_erms and
__mempcpy_erms.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=74aea35c82d147f0270bebefae89a66dfb191b1f
commit 74aea35c82d147f0270bebefae89a66dfb191b1f
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Wed Sep 21 15:21:28 2011 -0700
Add 32-bit ERMS memcpy/memset
* sysdeps/i386/i686/multiarch/Makefile (sysdep_routines): Add
bcopy-erms, memcpy-erms, memmove-erms, mempcpy-erms, bzero-erms
and memset-erms.
* sysdeps/i386/i686/multiarch/bcopy-erms.S: New file.
* sysdeps/i386/i686/multiarch/bzero-erms.S: Likewise.
* sysdeps/i386/i686/multiarch/memcpy-erms.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove-erms.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy-erms.S: Likewise.
* sysdeps/i386/i686/multiarch/memset-erms.S: Likewise.
* sysdeps/i386/i686/multiarch/ifunc-defines.sym: Add
COMMON_CPUID_INDEX_7.
* sysdeps/i386/i686/multiarch/bcopy.S: Enable ERMS optimization
for Fast_ERMS.
* sysdeps/i386/i686/multiarch/bzero.S: Likewise.
* sysdeps/i386/i686/multiarch/memcpy.S: Likewise.
* sysdeps/i386/i686/multiarch/memcpy_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove.S: Likewise.
* sysdeps/i386/i686/multiarch/memmove_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy.S: Likewise.
* sysdeps/i386/i686/multiarch/mempcpy_chk.S: Likewise.
* sysdeps/i386/i686/multiarch/memset.S: Likewise.
* sysdeps/i386/i686/multiarch/memset_chk.S: Likewise.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a81f1a5e58f9360afd079093ea6770bb9e570a2a
commit a81f1a5e58f9360afd079093ea6770bb9e570a2a
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Thu Sep 15 16:16:10 2011 -0700
Add 64-bit ERMS memcpy and memset
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
memcpy-erms, mempcpy-erms, memmove-erms and memset-erms.
* sysdeps/x86_64/multiarch/memcpy-erms.S: New.
* sysdeps/x86_64/multiarch/memmove-erms.S: Likewise.
* sysdeps/x86_64/multiarch/mempcpy-erms.S: Likewise.
* sysdeps/x86_64/multiarch/memset-erms.S: Likewise.
* sysdeps/x86_64/multiarch/memcpy.S: Enable ERMS optimization
for Fast_ERMS.
* sysdeps/x86_64/multiarch/memcpy_chk.S: Likewise.
* sysdeps/x86_64/multiarch/memmove.c: Likewise.
* sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
* sysdeps/x86_64/multiarch/mempcpy.S: Likewise.
* sysdeps/x86_64/multiarch/mempcpy_chk.S: Likewise.
* sysdeps/x86_64/multiarch/memset.S: Likewise.
* sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=487dc028667e43ed55f407fda3c07fc31ecd1554
commit 487dc028667e43ed55f407fda3c07fc31ecd1554
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Thu Sep 15 15:47:01 2011 -0700
Initial ERMS support
* sysdeps/x86/cpu-features.h (bit_arch_Fast_ERMS): New.
(bit_cpu_ERMS): Likewise.
(index_cpu_ERMS): Likewise.
(index_arch_Fast_ERMS): Likewise.
(reg_ERMS): Likewise.
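For context on what these macros describe: ERMS (Enhanced REP MOVSB/STOSB) is reported in bit 9 of EBX for CPUID leaf 7, subleaf 0. The C sketch below queries that bit from user code. It assumes GCC or Clang, whose `<cpuid.h>` provides `__get_cpuid_count`; the `sketch_` names are hypothetical and this is not how glibc's cpu-features code is written.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.(EAX=7,ECX=0):EBX bit 9 reports Enhanced REP MOVSB/STOSB. */
#define SKETCH_ERMS_EBX_BIT 9u

/* Returns 1 if the given bit of a 32-bit register value is set. */
int sketch_bit_is_set(uint32_t reg, unsigned bit)
{
    assert(bit < 32);   /* shifting by 32 or more would be undefined */
    return (int)((reg >> bit) & 1u);
}

/* Returns 1 if the running CPU reports ERMS, 0 otherwise
   (including on targets that are not x86). */
int sketch_cpu_has_erms(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid_count returns 0 when the leaf is not supported. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return sketch_bit_is_set(ebx, SKETCH_ERMS_EBX_BIT);
#else
    return 0;
#endif
}
```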
-----------------------------------------------------------------------