This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: PING: [PATCH] X86-64: Remove the previous SSE2/AVX2 memsets


On Tue, Apr 12, 2016 at 8:34 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Mar 31, 2016 at 12:42 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>> Since the new SSE2/AVX2 memsets are faster than the previous ones, we
>> can remove the previous SSE2/AVX2 memsets and replace them with the
>> new ones.
>>
>> IFUNC selection is unchanged if the SSE2 and AVX2 memsets weren't
>> used before.  If the SSE2 or AVX2 memset was used, the new SSE2 or
>> AVX2 memset optimized with Enhanced REP STOSB (ERMS) will be used on
>> processors with ERMS.
>>
>> Tested on Penryn, Westmere, Ivy Bridge and Haswell with and without
>> --disable-multi-arch.  OK for master?
>>
>> H.J.
>> ---
>>         [BZ #19881]
>>         * sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Folded
>>         into ...
>>         * sysdeps/x86_64/memset.S: This.
>>         (__bzero): Removed.
>>         (__memset_tail): Likewise.
>>         (__memset_chk): Likewise.
>>         (memset): Likewise.
>>         (MEMSET_CHK_SYMBOL): New.  Define only if MEMSET_SYMBOL isn't
>>         defined.
>>         (MEMSET_SYMBOL): Define only if MEMSET_SYMBOL isn't defined.
>>         * sysdeps/x86_64/multiarch/memset-avx2.S: Removed.
>>         (__memset_zero_constant_len_parameter): Check SHARED instead of
>>         PIC.
>>         * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
>>         memset-avx2 and memset-sse2-unaligned-erms.
>>         * sysdeps/x86_64/multiarch/ifunc-impl-list.c
>>         (__libc_ifunc_impl_list): Remove __memset_chk_sse2,
>>         __memset_chk_avx2, __memset_sse2 and __memset_avx2_unaligned.
>>         * sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: Skip
>>         if not in libc.
>>         * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
>>         Likewise.
>>         * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
>>         (MEMSET_CHK_SYMBOL): New.  Define if not defined.
>>         (__bzero): Check VEC_SIZE == 16 instead of USE_MULTIARCH.
>>         Replace MEMSET_SYMBOL with MEMSET_CHK_SYMBOL on __memset_chk
>>         symbols.
>>         Properly check USE_MULTIARCH on __memset symbols.
>>         * sysdeps/x86_64/multiarch/memset.S (memset): Replace
>>         __memset_sse2 and __memset_avx2 with __memset_sse2_unaligned
>>         and __memset_avx2_unaligned.  Use __memset_sse2_unaligned_erms
>>         or __memset_avx2_unaligned_erms if processor has ERMS.
>>         (memset): Removed.
>>         (__memset_chk): Likewise.
>>         (MEMSET_SYMBOL): New.
>>         (libc_hidden_builtin_def): Replace __memset_sse2 with
>>         __memset_sse2_unaligned.
>>         * sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk): Replace
>>         __memset_chk_sse2 and __memset_chk_avx2 with
>>         __memset_chk_sse2_unaligned and __memset_chk_avx2_unaligned.
>>         Use __memset_chk_sse2_unaligned_erms or
>>         __memset_chk_avx2_unaligned_erms if processor has ERMS.
>> ---
>>  sysdeps/x86_64/memset.S                            | 121 +++------------
>>  sysdeps/x86_64/multiarch/Makefile                  |   3 +-
>>  sysdeps/x86_64/multiarch/ifunc-impl-list.c         |   9 --
>>  .../x86_64/multiarch/memset-avx2-unaligned-erms.S  |  18 ++-
>>  sysdeps/x86_64/multiarch/memset-avx2.S             | 168 ---------------------
>>  .../multiarch/memset-avx512-unaligned-erms.S       |   2 +-
>>  .../x86_64/multiarch/memset-sse2-unaligned-erms.S  |  16 --
>>  .../x86_64/multiarch/memset-vec-unaligned-erms.S   |  32 ++--
>>  sysdeps/x86_64/multiarch/memset.S                  |  26 ++--
>>  sysdeps/x86_64/multiarch/memset_chk.S              |  14 +-
>>  10 files changed, 76 insertions(+), 333 deletions(-)
>>  delete mode 100644 sysdeps/x86_64/multiarch/memset-avx2.S
>>  delete mode 100644 sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
>>
>
> PING.
>

Any comments, feedback, or objections?
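
For anyone skimming the ChangeLog, the memset selection described in the
quoted sysdeps/x86_64/multiarch/memset.S entry can be sketched in C
roughly as below.  This is only an illustration, not the assembly in the
patch: struct cpu_flags and select_memset are hypothetical names, the
"AVX2 usable" test is an assumption about how the AVX2 variant is chosen,
and the AVX512 variants are left out on purpose.

```c
#include <stdbool.h>

/* Hypothetical feature flags for illustration only; the real selector
   is written in assembly and reads glibc's internal CPU feature data.  */
struct cpu_flags
{
  bool avx2_usable;   /* Assumed condition for picking an AVX2 variant.  */
  bool erms;          /* Processor supports Enhanced REP STOSB.  */
};

/* Return the name of the memset variant for the given flags, covering
   only the SSE2/AVX2 cases named in the ChangeLog.  */
static const char *
select_memset (const struct cpu_flags *f)
{
  if (f->avx2_usable)
    return (f->erms
            ? "__memset_avx2_unaligned_erms"
            : "__memset_avx2_unaligned");

  return (f->erms
          ? "__memset_sse2_unaligned_erms"
          : "__memset_sse2_unaligned");
}
```

Under these assumptions, a processor with ERMS but without usable AVX2
would get __memset_sse2_unaligned_erms, and one with neither would get
__memset_sse2_unaligned.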


-- 
H.J.

