This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: PING: [PATCH] X86-64: Remove the previous SSE2/AVX2 memsets
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: GNU C Library <libc-alpha at sourceware dot org>
- Date: Thu, 14 Apr 2016 05:03:10 -0700
- Subject: Re: PING: [PATCH] X86-64: Remove the previous SSE2/AVX2 memsets
- References: <CAMe9rOrNQV-ALszNj1kJaEYsDeqs3KPFC3WMcVs8sO7zcZrVrg at mail dot gmail dot com>
On Tue, Apr 12, 2016 at 8:34 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Mar 31, 2016 at 12:42 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>> Since the new SSE2/AVX2 memsets are faster than the previous ones, we
>> can remove the previous SSE2/AVX2 memsets and replace them with the
>> new ones.
>>
>> No change in IFUNC selection if SSE2 and AVX2 memsets weren't used
>> before. If SSE2 or AVX2 memset was used, the new SSE2 or AVX2 memset
>> optimized with Enhanced REP STOSB will be used for processors with
>> ERMS.
>>
>> Tested on Penryn, Westmere, Ivy Bridge and Haswell with and without
>> --disable-multi-arch. OK for master?
>>
>> H.J.
>> ---
>> [BZ #19881]
>> * sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Folded
>> into ...
>> * sysdeps/x86_64/memset.S: This.
>> (__bzero): Removed.
>> (__memset_tail): Likewise.
>> (__memset_chk): Likewise.
>> (memset): Likewise.
>> (MEMSET_CHK_SYMBOL): New. Define only if MEMSET_SYMBOL isn't
>> defined.
>> (MEMSET_SYMBOL): Define only if MEMSET_SYMBOL isn't defined.
>> * sysdeps/x86_64/multiarch/memset-avx2.S: Removed.
>> (__memset_zero_constant_len_parameter): Check SHARED instead of
>> PIC.
>> * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
>> memset-avx2 and memset-sse2-unaligned-erms.
>> * sysdeps/x86_64/multiarch/ifunc-impl-list.c
>> (__libc_ifunc_impl_list): Remove __memset_chk_sse2,
>> __memset_chk_avx2, __memset_sse2 and __memset_avx2_unaligned.
>> * sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: Skip
>> if not in libc.
>> * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
>> Likewise.
>> * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
>> (MEMSET_CHK_SYMBOL): New. Define if not defined.
>> (__bzero): Check VEC_SIZE == 16 instead of USE_MULTIARCH.
>> Replace MEMSET_SYMBOL with MEMSET_CHK_SYMBOL on __memset_chk
>> symbols.
>> Properly check USE_MULTIARCH on __memset symbols.
>> * sysdeps/x86_64/multiarch/memset.S (memset): Replace
>> __memset_sse2 and __memset_avx2 with __memset_sse2_unaligned
>> and __memset_avx2_unaligned. Use __memset_sse2_unaligned_erms
>> or __memset_avx2_unaligned_erms if processor has ERMS.
>> (memset): Removed.
>> (__memset_chk): Likewise.
>> (MEMSET_SYMBOL): New.
>> (libc_hidden_builtin_def): Replace __memset_sse2 with
>> __memset_sse2_unaligned.
>> * sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk): Replace
>> __memset_chk_sse2 and __memset_chk_avx2 with
>> __memset_chk_sse2_unaligned and __memset_chk_avx2_unaligned_erms.
>> Use __memset_chk_sse2_unaligned_erms or
>> __memset_chk_avx2_unaligned_erms if processor has ERMS.
>> ---
>>  sysdeps/x86_64/memset.S                           | 121 +++------------
>>  sysdeps/x86_64/multiarch/Makefile                 |   3 +-
>>  sysdeps/x86_64/multiarch/ifunc-impl-list.c        |   9 --
>>  .../x86_64/multiarch/memset-avx2-unaligned-erms.S |  18 ++-
>>  sysdeps/x86_64/multiarch/memset-avx2.S            | 168 ---------------------
>>  .../multiarch/memset-avx512-unaligned-erms.S      |   2 +-
>>  .../x86_64/multiarch/memset-sse2-unaligned-erms.S |  16 --
>>  .../x86_64/multiarch/memset-vec-unaligned-erms.S  |  32 ++--
>>  sysdeps/x86_64/multiarch/memset.S                 |  26 ++--
>>  sysdeps/x86_64/multiarch/memset_chk.S             |  14 +-
>>  10 files changed, 76 insertions(+), 333 deletions(-)
>>  delete mode 100644 sysdeps/x86_64/multiarch/memset-avx2.S
>>  delete mode 100644 sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
>>
>
> PING.
>
Any comments or feedback? Objections?
--
H.J.
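
For readers skimming the quoted ChangeLog, the SSE2/AVX2 part of the memset selection it describes can be sketched roughly as below. This is only an illustration in C, not the actual code in sysdeps/x86_64/multiarch/memset.S: the struct cpu_flags type, its fields, and the select_memset and check_selection functions are hypothetical names made up for this sketch, and the AVX512 variants and any other conditions the real selector checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical feature flags for this sketch only; glibc keeps the
   real information in its own internal CPU feature data.  */
struct cpu_flags
{
  bool avx2_usable;   /* assumption: AVX2 variant may be selected */
  bool erms;          /* processor has Enhanced REP MOVSB/STOSB */
};

/* Return the name of the variant the quoted description would pick.
   Only the SSE2/AVX2 cases from the ChangeLog are modelled.  */
const char *
select_memset (struct cpu_flags f)
{
  if (f.avx2_usable)
    return f.erms ? "__memset_avx2_unaligned_erms"
                  : "__memset_avx2_unaligned";
  return f.erms ? "__memset_sse2_unaligned_erms"
                : "__memset_sse2_unaligned";
}

/* Small self-check of the mapping above.  */
void
check_selection (void)
{
  struct cpu_flags sse2_only = { false, false };
  struct cpu_flags avx2_erms = { true, true };

  assert (strcmp (select_memset (sse2_only),
                  "__memset_sse2_unaligned") == 0);
  assert (strcmp (select_memset (avx2_erms),
                  "__memset_avx2_unaligned_erms") == 0);
}
```

Calling check_selection () from a test program exercises two of the four combinations. The variant names are the ones listed in the ChangeLog; everything else is an assumption of the sketch.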