This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] x86-64: Update strlen.S to support wcslen/wcsnlen
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Markus Trippelsdorf <markus at trippelsdorf dot de>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Tue, 6 Jun 2017 05:37:32 -0700
- Subject: Re: [PATCH] x86-64: Update strlen.S to support wcslen/wcsnlen
- Authentication-results: sourceware.org; auth=none
- References: <20170520135619.GA17481@gmail.com> <CAMe9rOrik_jJkM+vd9n7LuPEUR=xM7T8MC1cMiFXgnPsYUiTEw@mail.gmail.com> <CAMe9rOroGR3vbxw_uRkNzhy25iwUocqneuj0zd2DFtAcw7Vy9w@mail.gmail.com> <20170606053710.GA283@x4>
On Mon, Jun 5, 2017 at 10:37 PM, Markus Trippelsdorf
<markus@trippelsdorf.de> wrote:
> On 2017.06.05 at 07:24 -0700, H.J. Lu wrote:
>> On Tue, May 30, 2017 at 3:14 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> > On Sat, May 20, 2017 at 6:56 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> >> The difference between strlen and wcslen is byte vs int. We can
>> >> replace pminub and pcmpeqb with pminud and pcmpeqd to turn strlen
>> >> into wcslen. Tested on Ivy Bridge with benchtests/bench-wcslen.c,
>> >> the new strlen based wcslen is as fast as the old wcslen.
>> >>
>> >> OK for master?
>> >>
>> >> H.J.
>> >> ---
>> >> * sysdeps/x86_64/strlen.S (PMINU): New.
>> >> (PCMPEQ): Likewise.
>> >> (SHIFT_RETURN): Likewise.
>> >> (FIND_ZERO): Replace pcmpeqb with PCMPEQ.
>> >> (strlen): Add SHIFT_RETURN before ret. Replace pcmpeqb and
>> >> pminub with PCMPEQ and PMINU.
>> >> * sysdeps/x86_64/wcslen.S: Define AS_WCSLEN and strlen.
>> >> Include "strlen.S".
>> >> * sysdeps/x86_64/wcsnlen.S: New file.
>> >
>> > Here is the updated patch only with SSE2 wcsnlen.S. Any
>> > comments?
>> >
>>
>> I will check it in today.
>
> It doesn't work on old machines without SSE4.1:
>
> FAIL: stdio-common/tstdiomisc
> FAIL: wcsmbs/test-wcpncpy
> FAIL: wcsmbs/test-wcsncmp
> FAIL: wcsmbs/test-wcsncpy
> FAIL: wcsmbs/test-wcsnlen
> FAIL: wcsmbs/wcsatcliff
>
> markus@x4 tmp % gdb --args /var/tmp/glibc-build/elf/ld-linux-x86-64.so.2 --library-path /var/tmp/glibc-build:/var/tmp/glibc-build/math:/var/tmp/glibc-build/elf:/var/tmp/gl[31/150]d/dlfcn:/var/tmp/glibc-build/nss:/var/tmp/glibc-build/nis:/var/tmp/glibc-build/rt:/var/tmp/glibc-build/resolv:/var/tmp/glibc-build/crypt:/var/tmp/glibc-build/mathvec:/var/tmp/glib
> c-build/support:/var/tmp/glibc-build/nptl /var/tmp/glibc-build/stdio-common/tstdiomisc
> Reading symbols from /var/tmp/glibc-build/elf/ld-linux-x86-64.so.2...done.
> (gdb) run
> Starting program: /home/markus/tmp/glibc-build/elf/ld-linux-x86-64.so.2 --library-path /var/tmp/glibc-build:/var/tmp/glibc-build/math:/var/tmp/glibc-build/elf:/var/tmp/glibc-build/dlfcn:/var/tmp/glibc-build/nss:/var/tmp/glibc-build/nis:/var/tmp/glibc-build/rt:/var/tmp/glibc-build/resolv:/var/tmp/glibc-build/crypt:/var/tmp/glibc-build/mathvec:/var/tmp/glibc-build/support:/var/tmp/glibc-build/nptl /var/tmp/glibc-build/stdio-common/tstdiomisc
> t1: count=5
> sscanf ("12345", "%ld", &x) => 1, x = 12345
> sscanf ("12345", "%llllld", &x) => 0, x = -1
> sscanf ("12345", "%LLLLLd", &x) => 0, x = -1
> sscanf ("test ", "%*s%n", &x) => 0, x = 4
> sscanf ("test ", "%2*s%n", &x) => 0, x = -1
> sscanf ("12 ", "%l2d", &x) => 0, x = -1
> sscanf ("12 ", "%2ld", &x) => 1, x = 12
> sscanf ("1 1", "%d %Z", &n, &N) => 1, n = 1, N = -1
> expected "nan NAN nan NAN nan NAN nan NAN", got "nan NAN nan NAN nan NAN nan NAN"
> expected "-nan -NAN -nan -NAN -nan -NAN -nan -NAN", got "-nan -NAN -nan -NAN -nan -NAN -nan -NAN"
> expected "nan NAN nan NAN nan NAN nan NAN", got "nan NAN nan NAN nan NAN nan NAN"
> expected "-nan -NAN -nan -NAN -nan -NAN -nan -NAN", got "-nan -NAN -nan -NAN -nan -NAN -nan -NAN" expected "inf INF inf INF inf INF inf INF", got "inf INF inf INF inf INF inf INF"
> expected "-inf -INF -inf -INF -inf -INF -inf -INF", got "-inf -INF -inf -INF -inf -INF -inf -INF"
> Program received signal SIGILL, Illegal instruction.
> wcsnlen () at ../sysdeps/x86_64/strlen.S:180
> 180 PMINU 16(%rax), %xmm0
>
Please try this. Sorry for the breakage.
--
H.J.
From b5b8fce19221891afba4907e9dd7e05b9e797f53 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 6 Jun 2017 05:31:48 -0700
Subject: [PATCH] x86-64: Move wcsnlen.S to multiarch/wcsnlen-sse4_1.S
Since wcsnlen.S uses pminud which is the part of SSE4.1, move wcsnlen.S
to multiarch/wcsnlen-sse4_1.S.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
wcsnlen-sse4_1 and wcsnlen-c.
* sysdeps/x86_64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Test __wcsnlen_sse4_1 and
__wcsnlen_sse2.
* sysdeps/x86_64/multiarch/ifunc-sse4_1.h: New file.
* sysdeps/x86_64/multiarch/wcsnlen-c.c: Likewise.
* sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S: Likewise.
* sysdeps/x86_64/multiarch/wcsnlen.c: Likewise.
* sysdeps/x86_64/wcsnlen.S: Removed.
---
sysdeps/x86_64/multiarch/Makefile | 3 ++-
sysdeps/x86_64/multiarch/ifunc-impl-list.c | 7 ++++++
sysdeps/x86_64/multiarch/ifunc-sse4_1.h | 34 ++++++++++++++++++++++++++++++
sysdeps/x86_64/multiarch/wcsnlen-c.c | 9 ++++++++
sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S | 5 +++++
sysdeps/x86_64/multiarch/wcsnlen.c | 31 +++++++++++++++++++++++++++
sysdeps/x86_64/wcsnlen.S | 7 ------
7 files changed, 88 insertions(+), 8 deletions(-)
create mode 100644 sysdeps/x86_64/multiarch/ifunc-sse4_1.h
create mode 100644 sysdeps/x86_64/multiarch/wcsnlen-c.c
create mode 100644 sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S
create mode 100644 sysdeps/x86_64/multiarch/wcsnlen.c
delete mode 100644 sysdeps/x86_64/wcsnlen.S
diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index b040288..310a3a4 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -33,7 +33,8 @@ endif
ifeq ($(subdir),wcsmbs)
sysdep_routines += wmemcmp-sse4 wmemcmp-ssse3 wmemcmp-c \
wmemcmp-avx2-movbe \
- wcscpy-ssse3 wcscpy-c
+ wcscpy-ssse3 wcscpy-c \
+ wcsnlen-sse4_1 wcsnlen-c
endif
ifeq ($(subdir),debug)
diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index b61bc9f..ee4243a 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -296,6 +296,13 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__wcscpy_ssse3)
IFUNC_IMPL_ADD (array, i, wcscpy, 1, __wcscpy_sse2))
+ /* Support sysdeps/x86_64/multiarch/wcsnlen.c. */
+ IFUNC_IMPL (i, name, wcsnlen,
+ IFUNC_IMPL_ADD (array, i, wcsnlen,
+ HAS_CPU_FEATURE (SSE4_1),
+ __wcsnlen_sse4_1)
+ IFUNC_IMPL_ADD (array, i, wcsnlen, 1, __wcsnlen_sse2))
+
/* Support sysdeps/x86_64/multiarch/wmemcmp.S. */
IFUNC_IMPL (i, name, wmemcmp,
IFUNC_IMPL_ADD (array, i, wmemcmp,
diff --git a/sysdeps/x86_64/multiarch/ifunc-sse4_1.h b/sysdeps/x86_64/multiarch/ifunc-sse4_1.h
new file mode 100644
index 0000000..2b89231
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/ifunc-sse4_1.h
@@ -0,0 +1,34 @@
+/* Common definition for ifunc selections optimized with SSE2 and SSE4.1.
+ All versions must be listed in ifunc-impl-list.c.
+ Copyright (C) 2017 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include <init-arch.h>
+
+extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE (sse4_1) attribute_hidden;
+
+static inline void *
+IFUNC_SELECTOR (void)
+{
+ const struct cpu_features* cpu_features = __get_cpu_features ();
+
+ if (CPU_FEATURES_CPU_P (cpu_features, SSE4_1))
+ return OPTIMIZE (sse4_1);
+
+ return OPTIMIZE (sse2);
+}
diff --git a/sysdeps/x86_64/multiarch/wcsnlen-c.c b/sysdeps/x86_64/multiarch/wcsnlen-c.c
new file mode 100644
index 0000000..e1ec7cf
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/wcsnlen-c.c
@@ -0,0 +1,9 @@
+#if IS_IN (libc)
+# include <wchar.h>
+
+# define WCSNLEN __wcsnlen_sse2
+
+extern __typeof (wcsnlen) __wcsnlen_sse2;
+#endif
+
+#include "wcsmbs/wcsnlen.c"
diff --git a/sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S b/sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S
new file mode 100644
index 0000000..a8cab0c
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/wcsnlen-sse4_1.S
@@ -0,0 +1,5 @@
+#define AS_WCSLEN
+#define AS_STRNLEN
+#define strlen __wcsnlen_sse4_1
+
+#include "../strlen.S"
diff --git a/sysdeps/x86_64/multiarch/wcsnlen.c b/sysdeps/x86_64/multiarch/wcsnlen.c
new file mode 100644
index 0000000..5f74d2c
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/wcsnlen.c
@@ -0,0 +1,31 @@
+/* Multiple versions of wcsnlen.
+ All versions must be listed in ifunc-impl-list.c.
+ Copyright (C) 2017 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* Define multiple versions only for the definition in libc. */
+#if IS_IN (libc)
+# define __wcsnlen __redirect_wcsnlen
+# include <wchar.h>
+# undef __wcsnlen
+
+# define SYMBOL_NAME wcsnlen
+# include "ifunc-sse4_1.h"
+
+libc_ifunc_redirected (__redirect_wcsnlen, __wcsnlen, IFUNC_SELECTOR ());
+weak_alias (__wcsnlen, wcsnlen);
+#endif
diff --git a/sysdeps/x86_64/wcsnlen.S b/sysdeps/x86_64/wcsnlen.S
deleted file mode 100644
index 968bb69..0000000
--- a/sysdeps/x86_64/wcsnlen.S
+++ /dev/null
@@ -1,7 +0,0 @@
-#define AS_WCSLEN
-#define AS_STRNLEN
-#define strlen __wcsnlen
-
-#include "strlen.S"
-
-weak_alias(__wcsnlen, wcsnlen)
--
2.9.4