This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
PING: [PATCH] X86-64: Remove the previous SSE2/AVX2 memsets
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: GNU C Library <libc-alpha at sourceware dot org>
- Date: Tue, 12 Apr 2016 08:34:58 -0700
On Thu, Mar 31, 2016 at 12:42 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
> Since the new SSE2/AVX2 memsets are faster than the previous ones, we
> can remove the previous SSE2/AVX2 memsets and replace them with the
> new ones.
>
> IFUNC selection is unchanged where the SSE2 and AVX2 memsets weren't
> used before. Where the SSE2 or AVX2 memset was used, the new SSE2 or
> AVX2 memset optimized with Enhanced REP STOSB will be used on
> processors with ERMS.
>
> Tested on Penryn, Westmere, Ivy Bridge and Haswell with and without
> --disable-multi-arch. OK for master?
>
> H.J.
> ---
> [BZ #19881]
> * sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Folded
> into ...
> * sysdeps/x86_64/memset.S: This.
> (__bzero): Removed.
> (__memset_tail): Likewise.
> (__memset_chk): Likewise.
> (memset): Likewise.
> (MEMSET_CHK_SYMBOL): New. Define only if MEMSET_SYMBOL isn't
> defined.
> (MEMSET_SYMBOL): Define only if MEMSET_SYMBOL isn't defined.
> * sysdeps/x86_64/multiarch/memset-avx2.S: Removed.
> (__memset_zero_constant_len_parameter): Check SHARED instead of
> PIC.
> * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
> memset-avx2 and memset-sse2-unaligned-erms.
> * sysdeps/x86_64/multiarch/ifunc-impl-list.c
> (__libc_ifunc_impl_list): Remove __memset_chk_sse2,
> __memset_chk_avx2, __memset_sse2 and __memset_avx2_unaligned.
> * sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: Skip
> if not in libc.
> * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
> Likewise.
> * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> (MEMSET_CHK_SYMBOL): New. Define if not defined.
> (__bzero): Check VEC_SIZE == 16 instead of USE_MULTIARCH.
> Replace MEMSET_SYMBOL with MEMSET_CHK_SYMBOL on __memset_chk
> symbols.
> Properly check USE_MULTIARCH on __memset symbols.
> * sysdeps/x86_64/multiarch/memset.S (memset): Replace
> __memset_sse2 and __memset_avx2 with __memset_sse2_unaligned
> and __memset_avx2_unaligned. Use __memset_sse2_unaligned_erms
> or __memset_avx2_unaligned_erms if processor has ERMS.
> (memset): Removed.
> (__memset_chk): Likewise.
> (MEMSET_SYMBOL): New.
> (libc_hidden_builtin_def): Replace __memset_sse2 with
> __memset_sse2_unaligned.
> * sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk): Replace
> __memset_chk_sse2 and __memset_chk_avx2 with
> __memset_chk_sse2_unaligned and __memset_chk_avx2_unaligned.
> Use __memset_chk_sse2_unaligned_erms or
> __memset_chk_avx2_unaligned_erms if processor has ERMS.
> ---
>  sysdeps/x86_64/memset.S                            | 121 +++------------
>  sysdeps/x86_64/multiarch/Makefile                  |   3 +-
>  sysdeps/x86_64/multiarch/ifunc-impl-list.c         |   9 --
>  .../x86_64/multiarch/memset-avx2-unaligned-erms.S  |  18 ++-
>  sysdeps/x86_64/multiarch/memset-avx2.S             | 168 ---------------------
>  .../multiarch/memset-avx512-unaligned-erms.S       |   2 +-
>  .../x86_64/multiarch/memset-sse2-unaligned-erms.S  |  16 --
>  .../x86_64/multiarch/memset-vec-unaligned-erms.S   |  32 ++--
>  sysdeps/x86_64/multiarch/memset.S                  |  26 ++--
>  sysdeps/x86_64/multiarch/memset_chk.S              |  14 +-
> 10 files changed, 76 insertions(+), 333 deletions(-)
> delete mode 100644 sysdeps/x86_64/multiarch/memset-avx2.S
> delete mode 100644 sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
>
PING.
--
H.J.
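
For readers following the ChangeLog entry for sysdeps/x86_64/multiarch/memset.S, the selection it describes can be sketched in C roughly as follows. This is only an illustration, not the patch itself: the real selector is written in assembly, and the struct cpu_flags type, its field names, and the helper functions are hypothetical. The sketch also assumes the AVX2 variants are chosen whenever AVX2 is usable, and it leaves out the AVX-512 path and any other preferences the real selector may apply.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical feature flags.  glibc keeps this state in its own
   internal CPU feature data, which is not reproduced here.  */
struct cpu_flags
{
  bool avx2_usable;   /* Assumed condition for picking the AVX2 variants.  */
  bool erms;          /* Processor reports Enhanced REP MOVSB/STOSB.  */
};

/* Return the name of the memset variant that the ChangeLog entry
   describes for the given flags.  */
const char *
select_memset_name (const struct cpu_flags *f)
{
  if (f->avx2_usable)
    return f->erms ? "__memset_avx2_unaligned_erms"
                   : "__memset_avx2_unaligned";
  return f->erms ? "__memset_sse2_unaligned_erms"
                 : "__memset_sse2_unaligned";
}

/* Usage example: walk through all four flag combinations.  */
void
check_selection (void)
{
  struct cpu_flags sse2 = { false, false };
  struct cpu_flags sse2_erms = { false, true };
  struct cpu_flags avx2 = { true, false };
  struct cpu_flags avx2_erms = { true, true };

  assert (strcmp (select_memset_name (&sse2),
                  "__memset_sse2_unaligned") == 0);
  assert (strcmp (select_memset_name (&sse2_erms),
                  "__memset_sse2_unaligned_erms") == 0);
  assert (strcmp (select_memset_name (&avx2),
                  "__memset_avx2_unaligned") == 0);
  assert (strcmp (select_memset_name (&avx2_erms),
                  "__memset_avx2_unaligned_erms") == 0);
}
```

Returning a name instead of a function pointer keeps the sketch free of the assembly implementations; a real IFUNC resolver would return the address of the chosen routine.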