This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: 2.25 freeze status
On Friday 27 January 2017 11:27 PM, H.J. Lu wrote:
> I am testing this patch for
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=21081
>
> I'd like to check it in before code freeze.
>
>
> 0001-Add-VZEROUPPER-to-memset-vec-unaligned-erms.S-BZ-210.patch
>
>
> From 9097edb85e04c137f226f3d371afff34a4ab17b7 Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" <hjl.tools@gmail.com>
> Date: Tue, 24 Jan 2017 15:58:49 -0800
> Subject: [PATCH] Add VZEROUPPER to memset-vec-unaligned-erms.S [BZ #21081]
>
> Since memset-vec-unaligned-erms.S has VDUP_TO_VEC0_AND_SET_RETURN at
> function entry, memset optimized for AVX2 and AVX512 will always use
> ymm/zmm registers.  VZEROUPPER should be placed before ret in
>
> L(stosb):
> 	movq	%rdx, %rcx
> 	movzbl	%sil, %eax
> 	movq	%rdi, %rdx
> 	rep stosb
> 	movq	%rdx, %rax
> 	ret
>
> since it can be reached from
>
> L(stosb_more_2x_vec):
> 	cmpq	$REP_STOSB_THRESHOLD, %rdx
> 	ja	L(stosb)
>
> [BZ #21081]
> * sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> (L(stosb)): Add VZEROUPPER before ret.
> ---
> sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> index ff214f0..704eed9 100644
> --- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> +++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
> @@ -110,6 +110,8 @@ ENTRY (__memset_erms)
>  ENTRY (MEMSET_SYMBOL (__memset, erms))
>  # endif
>  L(stosb):
> +	/* Issue vzeroupper before rep stosb. */
> +	VZEROUPPER
>  	movq	%rdx, %rcx
>  	movzbl	%sil, %eax
>  	movq	%rdi, %rdx
>
Looks good to me.
Siddhesh
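
For readers following the thread, here is a rough C model of the control
flow the patch is about. It is only a sketch, not the glibc code: the names
sketch_memset, sketch_memset_avx2 and SKETCH_REP_STOSB_THRESHOLD are
hypothetical, the value 2048 is an arbitrary assumption, and in C the
compiler (not the programmer) decides when ymm registers are actually
written, so this mirrors the structure of the assembly rather than its
exact register state. The point it tries to show is that the byte
broadcast happens at function entry, so the large-size "rep stosb" path
also needs a vzeroupper before it returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>

/* Hypothetical cut-over size; glibc tunes REP_STOSB_THRESHOLD itself. */
#define SKETCH_REP_STOSB_THRESHOLD 2048

__attribute__((target("avx2")))
static void *sketch_memset_avx2(void *dst, int c, size_t n)
{
    /* Analogue of VDUP_TO_VEC0_AND_SET_RETURN: broadcast the fill byte
       into a 256-bit register at function entry. */
    __m256i v = _mm256_set1_epi8((char)c);
    unsigned char *p = dst;

    if (n > SKETCH_REP_STOSB_THRESHOLD) {
        /* Analogue of L(stosb): clear the upper ymm halves first, then
           fill with rep stosb (rdi = destination, rcx = count, al = byte). */
        void *d = dst;
        size_t cnt = n;
        _mm256_zeroupper();
        __asm__ volatile("rep stosb"
                         : "+D"(d), "+c"(cnt)
                         : "a"(c)
                         : "memory");
        return dst;
    }

    /* Small sizes: plain unaligned 32-byte stores plus a byte tail. */
    while (n >= 32) {
        _mm256_storeu_si256((__m256i *)p, v);
        p += 32;
        n -= 32;
    }
    while (n--)
        *p++ = (unsigned char)c;

    _mm256_zeroupper();
    return dst;
}
#endif

/* Dispatcher: use the AVX2 sketch when the CPU supports it, otherwise
   fall back to the C library's memset. */
void *sketch_memset(void *dst, int c, size_t n)
{
    assert(dst != NULL);
#if defined(__x86_64__) && defined(__GNUC__)
    if (__builtin_cpu_supports("avx2"))
        return sketch_memset_avx2(dst, c, n);
#endif
    return memset(dst, c, n);
}
```

Calling sketch_memset (buf, 0, 4096) takes the rep stosb branch under the
assumed threshold, while a 100-byte call stays on the vector stores. GCC
and Clang may insert their own vzeroupper at function exit, so the explicit
calls here mainly document where the hand-written assembly has to do it.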