This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH x86_64] Fix for wrong selector in x86_64/multiarch/memcpy.S BZ #18880
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: "Pawar, Amit" <Amit dot Pawar at amd dot com>
- Cc: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>
- Date: Fri, 4 Mar 2016 08:07:22 -0800
- Subject: Re: [PATCH x86_64] Fix for wrong selector in x86_64/multiarch/memcpy.S BZ #18880
- References: <SN1PR12MB07331EDDADC49420ACBB5AEA97BD0 at SN1PR12MB0733 dot namprd12 dot prod dot outlook dot com> <CAMe9rOq2Oob93Y0=8EFE-kzRQQ4zgy_58RXjhe6quk30ELGqzw at mail dot gmail dot com> <SN1PR12MB0733B721F25699F6A280FA9F97BD0 at SN1PR12MB0733 dot namprd12 dot prod dot outlook dot com>
On Thu, Mar 3, 2016 at 9:04 AM, Pawar, Amit <Amit.Pawar@amd.com> wrote:
>>Change looks good. If you can't commit it yourself, please improve commit
>>log:
>>
>>1. Don't add your ChangeLog entry in ChangeLog directly since other people may change ChangeLog.
>>2. In the ChangeLog entry, describe what you did, like check Fast_Unaligned_Load instead of Slow_BSF and check Fast_Copy_Backward for __memcpy_ssse3_back.
>
> As per your suggestion, I have fixed the patch with an improved commit log and am also providing a separate ChangeLog patch. If it is OK, please commit it; otherwise let me know what changes are required.
>
> Thanks,
> Amit Pawar
>
>
This is the patch I am going to check in.
--
H.J.
From 2b4fee345d53eb8fc81461f2aefae74e9f3604ae Mon Sep 17 00:00:00 2001
From: Amit Pawar <Amit.Pawar@amd.com>
Date: Thu, 3 Mar 2016 22:24:21 +0530
Subject: [PATCH] x86-64: Fix memcpy IFUNC selection
Check Fast_Unaligned_Load, instead of Slow_BSF, and also check for
Fast_Copy_Backward to enable __memcpy_ssse3_back. The existing
selection order is replaced with the following order:
1. __memcpy_avx_unaligned if AVX_Fast_Unaligned_Load bit is set.
2. __memcpy_sse2_unaligned if Fast_Unaligned_Load bit is set.
3. __memcpy_sse2 if SSSE3 isn't available.
4. __memcpy_ssse3_back if Fast_Copy_Backward bit is set.
5. __memcpy_ssse3
[BZ #18880]
* sysdeps/x86_64/multiarch/memcpy.S: Check Fast_Unaligned_Load
instead of Slow_BSF and also check for Fast_Copy_Backward to
enable __memcpy_ssse3_back.
---
sysdeps/x86_64/multiarch/memcpy.S | 27 ++++++++++++++-------------
1 file changed, 14 insertions(+), 13 deletions(-)
diff --git a/sysdeps/x86_64/multiarch/memcpy.S b/sysdeps/x86_64/multiarch/memcpy.S
index 64a1bcd..8882590 100644
--- a/sysdeps/x86_64/multiarch/memcpy.S
+++ b/sysdeps/x86_64/multiarch/memcpy.S
@@ -35,22 +35,23 @@ ENTRY(__new_memcpy)
jz 1f
HAS_ARCH_FEATURE (Prefer_No_VZEROUPPER)
jz 1f
- leaq __memcpy_avx512_no_vzeroupper(%rip), %rax
+ lea __memcpy_avx512_no_vzeroupper(%rip), %RAX_LP
ret
#endif
-1: leaq __memcpy_avx_unaligned(%rip), %rax
+1: lea __memcpy_avx_unaligned(%rip), %RAX_LP
HAS_ARCH_FEATURE (AVX_Fast_Unaligned_Load)
- jz 2f
- ret
-2: leaq __memcpy_sse2(%rip), %rax
- HAS_ARCH_FEATURE (Slow_BSF)
- jnz 3f
- leaq __memcpy_sse2_unaligned(%rip), %rax
- ret
-3: HAS_CPU_FEATURE (SSSE3)
- jz 4f
- leaq __memcpy_ssse3(%rip), %rax
-4: ret
+ jnz 2f
+ lea __memcpy_sse2_unaligned(%rip), %RAX_LP
+ HAS_ARCH_FEATURE (Fast_Unaligned_Load)
+ jnz 2f
+ lea __memcpy_sse2(%rip), %RAX_LP
+ HAS_CPU_FEATURE (SSSE3)
+ jz 2f
+ lea __memcpy_ssse3_back(%rip), %RAX_LP
+ HAS_ARCH_FEATURE (Fast_Copy_Backward)
+ jnz 2f
+ lea __memcpy_ssse3(%rip), %RAX_LP
+2: ret
END(__new_memcpy)
# undef ENTRY
--
2.5.0
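
For readers tracing the new control flow, the following is a minimal C
sketch of the selection order the patched assembly implements (the
AVX512 branch at the top of the function is omitted). The feature
predicates and their names are hypothetical stand-ins for glibc's
HAS_CPU_FEATURE/HAS_ARCH_FEATURE macros, and the __memcpy_* routines
are only declared, not defined; the authoritative selector is the
assembly in the patch above.

#include <stddef.h>

/* Hypothetical stand-ins for glibc's HAS_CPU_FEATURE and
   HAS_ARCH_FEATURE checks, which test bits in cpu_features.  */
extern int avx_fast_unaligned_load_p (void);
extern int fast_unaligned_load_p (void);
extern int ssse3_p (void);
extern int fast_copy_backward_p (void);

/* The candidate implementations; in glibc these are assembly
   routines.  Declared here only so the sketch is self-contained.  */
extern void *__memcpy_avx_unaligned (void *, const void *, size_t);
extern void *__memcpy_sse2_unaligned (void *, const void *, size_t);
extern void *__memcpy_sse2 (void *, const void *, size_t);
extern void *__memcpy_ssse3_back (void *, const void *, size_t);
extern void *__memcpy_ssse3 (void *, const void *, size_t);

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Mirrors the five-step order from the commit message.  */
static memcpy_fn
select_memcpy (void)
{
  if (avx_fast_unaligned_load_p ())  /* 1. AVX_Fast_Unaligned_Load.  */
    return __memcpy_avx_unaligned;
  if (fast_unaligned_load_p ())      /* 2. Fast_Unaligned_Load.  */
    return __memcpy_sse2_unaligned;
  if (!ssse3_p ())                   /* 3. No SSSE3 available.  */
    return __memcpy_sse2;
  if (fast_copy_backward_p ())       /* 4. Fast_Copy_Backward.  */
    return __memcpy_ssse3_back;
  return __memcpy_ssse3;             /* 5. Default SSSE3 variant.  */
}

Step 2 is the actual fix: the old code picked __memcpy_sse2_unaligned
only when Slow_BSF was clear, rather than testing Fast_Unaligned_Load,
and it had no path that selected __memcpy_ssse3_back at all.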