This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



PING: [PATCH] X86-64: Remove the previous SSE2/AVX2 memsets


On Thu, Mar 31, 2016 at 12:42 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
> Since the new SSE2/AVX2 memsets are faster than the previous ones, we
> can remove the previous SSE2/AVX2 memsets and replace them with the
> new ones.
>
> No change in IFUNC selection if SSE2 and AVX2 memsets weren't used
> before.  If SSE2 or AVX2 memset was used, the new SSE2 or AVX2 memset
> optimized with Enhanced REP STOSB will be used for processors with
> ERMS.
>
> Tested on Penryn, Westmere, Ivy Bridge and Haswell with and without
> --disable-multi-arch.  OK for master?
>
> H.J.
> ---
>         [BZ #19881]
>         * sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S: Folded
>         into ...
>         * sysdeps/x86_64/memset.S: This.
>         (__bzero): Removed.
>         (__memset_tail): Likewise.
>         (__memset_chk): Likewise.
>         (memset): Likewise.
>         (MEMSET_CHK_SYMBOL): New. Define only if MEMSET_SYMBOL isn't
>         defined.
>         (MEMSET_SYMBOL): Define only if MEMSET_SYMBOL isn't defined.
>         * sysdeps/x86_64/multiarch/memset-avx2.S: Removed.
>         (__memset_zero_constant_len_parameter): Check SHARED instead of
>         PIC.
>         * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove
>         memset-avx2 and memset-sse2-unaligned-erms.
>         * sysdeps/x86_64/multiarch/ifunc-impl-list.c
>         (__libc_ifunc_impl_list): Remove __memset_chk_sse2,
>         __memset_chk_avx2, __memset_sse2 and __memset_avx2_unaligned.
>         * sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S: Skip
>         if not in libc.
>         * sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S:
>         Likewise.
>         * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
>         (MEMSET_CHK_SYMBOL): New.  Define if not defined.
>         (__bzero): Check VEC_SIZE == 16 instead of USE_MULTIARCH.
>         Replace MEMSET_SYMBOL with MEMSET_CHK_SYMBOL on __memset_chk
>         symbols.
>         Properly check USE_MULTIARCH on __memset symbols.
>         * sysdeps/x86_64/multiarch/memset.S (memset): Replace
>         __memset_sse2 and __memset_avx2 with __memset_sse2_unaligned
>         and __memset_avx2_unaligned.  Use __memset_sse2_unaligned_erms
>         or __memset_avx2_unaligned_erms if processor has ERMS.
>         (memset): Removed.
>         (__memset_chk): Likewise.
>         (MEMSET_SYMBOL): New.
>         (libc_hidden_builtin_def): Replace __memset_sse2 with
>         __memset_sse2_unaligned.
>         * sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk): Replace
>         __memset_chk_sse2 and __memset_chk_avx2 with
>         __memset_chk_sse2_unaligned and __memset_chk_avx2_unaligned_erms.
>         Use __memset_chk_sse2_unaligned_erms or
>         __memset_chk_avx2_unaligned_erms if processor has ERMS.
> ---
>  sysdeps/x86_64/memset.S                            | 121 +++------------
>  sysdeps/x86_64/multiarch/Makefile                  |   3 +-
>  sysdeps/x86_64/multiarch/ifunc-impl-list.c         |   9 --
>  .../x86_64/multiarch/memset-avx2-unaligned-erms.S  |  18 ++-
>  sysdeps/x86_64/multiarch/memset-avx2.S             | 168 ---------------------
>  .../multiarch/memset-avx512-unaligned-erms.S       |   2 +-
>  .../x86_64/multiarch/memset-sse2-unaligned-erms.S  |  16 --
>  .../x86_64/multiarch/memset-vec-unaligned-erms.S   |  32 ++--
>  sysdeps/x86_64/multiarch/memset.S                  |  26 ++--
>  sysdeps/x86_64/multiarch/memset_chk.S              |  14 +-
>  10 files changed, 76 insertions(+), 333 deletions(-)
>  delete mode 100644 sysdeps/x86_64/multiarch/memset-avx2.S
>  delete mode 100644 sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
>
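
The selection described in the quoted patch (unaligned SSE2 or AVX2 memset, with the ERMS variant when the processor has Enhanced REP STOSB) can be sketched in plain C. This is only an illustrative model, not the glibc selector: the struct cpu_summary type, its two fields and select_memset_name are hypothetical names, and it assumes AVX2 usability is the only switch between the SSE2 and AVX2 variants. The AVX512 variants and any other tuning preferences are left out. Only the returned symbol names are taken from the ChangeLog above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical feature summary.  glibc reads these bits from its own
   CPU feature data, which is not modeled here.  */
struct cpu_summary
{
  bool avx2_usable;  /* Assumption: the AVX2 variant may be chosen.  */
  bool has_erms;     /* Enhanced REP STOSB is available.  */
};

/* Return the name of the memset variant this sketch would pick.
   The names come from the ChangeLog; the decision order is assumed.  */
const char *
select_memset_name (const struct cpu_summary *cpu)
{
  if (cpu->avx2_usable)
    return (cpu->has_erms
            ? "__memset_avx2_unaligned_erms"
            : "__memset_avx2_unaligned");

  return (cpu->has_erms
          ? "__memset_sse2_unaligned_erms"
          : "__memset_sse2_unaligned");
}

/* Small self-check covering one ERMS and one non-ERMS processor.  */
void
selection_self_check (void)
{
  struct cpu_summary avx2_erms = { .avx2_usable = true, .has_erms = true };
  struct cpu_summary sse2_only = { .avx2_usable = false, .has_erms = false };

  assert (strcmp (select_memset_name (&avx2_erms),
                  "__memset_avx2_unaligned_erms") == 0);
  assert (strcmp (select_memset_name (&sse2_only),
                  "__memset_sse2_unaligned") == 0);
}
```

In this model the ERMS bit only changes which flavor of a given vector width is returned, which matches the statement that IFUNC selection is otherwise unchanged.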

PING.


-- 
H.J.

