This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[PATCH] faster string operations for Bulldozer.
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: libc-alpha at sourceware dot org
- Date: Wed, 26 Sep 2012 19:15:42 +0200
- Subject: [PATCH] faster string operations for Bulldozer.
Hello, when I added an fx10 machine to my benchmarks, I noticed that
the SSE4_2 variants are selected for strlen etc.
The difference between the SSE4_2 and pminub variants is even bigger
there than on Ivy Bridge: the speed of pminub is almost identical on
both, while pcmpistri is 40% slower on fx10 than on Ivy Bridge.
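
To illustrate why setting the preference bit changes which variant runs, here is a
minimal sketch of feature-bit-driven selection. It is not glibc's actual selector
(that one is written in assembly); the names BIT_SSE4_2, BIT_PREFER_PMINUB,
select_strlen and the three stand-in functions are hypothetical and only loosely
modeled on the identifiers in the patch. The assumption shown is that the
preference bit is tested first, so it overrides SSE4_2 support.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits, loosely modeled on the names in the patch. */
enum {
    BIT_SSE4_2        = 1u << 0,  /* CPU supports SSE4.2 */
    BIT_PREFER_PMINUB = 1u << 1   /* pminub-based string ops are preferred */
};

typedef size_t (*strlen_fn)(const char *);

/* Stand-ins for the real assembly variants; each simply defers to strlen. */
static size_t strlen_sse2_pminub(const char *s) { return strlen(s); }
static size_t strlen_sse42(const char *s)       { return strlen(s); }
static size_t strlen_sse2(const char *s)        { return strlen(s); }

/* Assumed ordering: the preference bit is checked before SSE4.2 support,
   so a CPU that has SSE4.2 but prefers pminub gets the pminub variant. */
strlen_fn select_strlen(unsigned int features)
{
    if (features & BIT_PREFER_PMINUB)
        return strlen_sse2_pminub;
    if (features & BIT_SSE4_2)
        return strlen_sse42;
    return strlen_sse2;
}
```

With this ordering, a CPU that reports SSE4_2 only stops getting the SSE4_2
variant once the preference bit is also set, which is what the patch aims for.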
---
ChangeLog | 5 +++++
sysdeps/x86_64/multiarch/init-arch.c | 3 +++
2 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 123f339..5277afb 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2012-09-26 Ondrej Bilka <neleai@seznam.cz>
+
+ * sysdeps/x86_64/multiarch/init-arch.c: Select faster string function
+ implementation for Bulldozer.
+
2012-09-26 Marek Polacek <polacek@redhat.com>
[BZ #14530]
diff --git a/sysdeps/x86_64/multiarch/init-arch.c b/sysdeps/x86_64/multiarch/init-arch.c
index fb44dcf..b872e5f 100644
--- a/sysdeps/x86_64/multiarch/init-arch.c
+++ b/sysdeps/x86_64/multiarch/init-arch.c
@@ -131,6 +131,9 @@ __init_cpu_features (void)
__cpu_features.feature[index_Prefer_SSE_for_memop]
|= bit_Prefer_SSE_for_memop;
+ __cpu_features.feature[index_Prefer_PMINUB_for_stringop]
+ |= bit_Prefer_PMINUB_for_stringop;
+
unsigned int eax;
__cpuid (0x80000000, eax, ebx, ecx, edx);
if (eax >= 0x80000001)
--
1.7.4.4