This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Testing 2.16 release candidate with gnulib - quadratic behaviourdetected in SSE42 strstr?


Hello,

Yes, sse42 strstr shows definitely quadratic time in your example, so
2 times decreasing / increasing  of m gives 4 times impact on an
execution time.

This sse42 strstr algorithm was announced here
http://sources.redhat.com/ml/libc-alpha/2009-07/msg00052.html as
Knuth–Morris–Pratt algorithm, but correct  KMP has linear time
estimation at worst and can't give quadratic complexity at all.

So, for my machine sse42 strstr without alarm gives:

real    2m17.983s
user    2m16.951s
sys     0m0.107s

And I also have correct KMP algorithm (attached for information)
written in plain C and it easily passes configure test with 5 sec
alarm. :

real    0m0.015s
user    0m0.011s
sys     0m0.004s

Anyway sse42 strstr looks rather difficult and I can't say at the
moment if it's possible to make it always linear like KMP.
More time for investigation is required here.

Should I file a bug or anybody already did it?

--
Liubov Dmitireva
Software Engineer
Intel Corporation


> ---------- Forwarded message ----------
> From: "Carlos O'Donell" <carlos_odonell@mentor.com>
> Date: Jun 27, 2012 9:14 PM
> Subject: Testing 2.16 release candidate with gnulib - quadratic behaviour
> detected in SSE42 strstr?
> To: "libc-alpha" <libc-alpha@sourceware.org>, "Bruno Haible"
> <bruno@clisp.org>
>
> Community,
>
> http://sourceware.org/glibc/wiki/Testing/Gnulib
>
> Running the 2.16 release candidate against the gnulib
> testsuite has turned up three object files being build
> by gnulib that aren't listed in the known issues:
>
> (a) logf
> (b) log10f
> (c) remove
> (d) strstr
>
> I've run out of time just getting the build bits setup
> and reviewing 2.15 backports. I'm hoping that some of
> my European friends can look at these as the earth turns
> to their part of the world...
>
> Initial analysis:
>
> (a) logf
> (b) log10f
>
> Why wasn't -lm added to the link?
>
> This looks like an environment/configuration issue with gnulib.
>
> Adding `-lm' to CC fixes this, but that can't be right.
>
> configure:84449: checking whether log10f works according to ISO C 99 with
> IEC 60559
> configure:84504: gcc
> -Wl,-rpath=/scratch/carloso/build4-lucid-cs/install//lib64:/scratch/carloso/build4-lucid-cs/install//usr/lib64
> -Wl,--dynamic-linker=/scratch/carloso/build4-lucid-cs/install//lib64/ld-linux-x86-64.so.2
> -std=gnu99 -o conftest -g -O2 -nostdinc
> -I/usr/local/tools/gcc-4.3.3/lib/gcc/x86_64-unknown-linux-gnu/4.3.2/include-fixed
> -I/scratch/carloso/build4-lucid-cs/install//usr/include
> -I/usr/local/tools/gcc-4.3.3/lib/gcc/x86_64-unknown-linux-gnu/4.3.2/include
> -Wall  conftest.c   >&5
> /tmp/cceWScoH.o: In function `main':
> /tmp/testdir/conftest.c:632: undefined reference to `log10f'
> collect2: ld returned 1 exit status
> configure:84508: $? = 1
> configure: program exited with status 1
> configure: failed program was:
> ...
> configure:84537: result: no
>
> (c) remove
>
> This function is added automatically.
>
> e.g.
> ~~~ config.log ~~~
> gl_LIBOBJS=' asnprintf.o chdir-long.o dprintf.o fclose.o fcntl.o fflush.o
> fprintf.o fpurge.o fseek.o fseeko.o fseterr.o futimens.o glob.o ioctl.o
> isfinite.o isnand.o isnanf.o isnanl.o linkat.o nanosleep.o openat-proc.o
> printf.o printf-args.o printf-parse.o remove.o snprintf.o sprintf.o
> strerror_r.o strstr.o utimensat.o vasnprintf.o vdprintf.o vfprintf.o
> vprintf.o vsnprintf.o vsprintf.o'
> gl_LTLIBOBJS=' asnprintf.lo chdir-long.lo dprintf.lo fclose.lo fcntl.lo
> fflush.lo fprintf.lo fpurge.lo fseek.lo fseeko.lo fseterr.lo futimens.lo
> glob.lo ioctl.lo isfinite.lo isnand.lo isnanf.lo isnanl.lo linkat.lo
> nanosleep.lo openat-proc.lo printf.lo printf-args.lo printf-parse.lo
> remove.lo snprintf.lo sprintf.lo strerror_r.lo strstr.lo utimensat.lo
> vasnprintf.lo vdprintf.lo vfprintf.lo vprintf.lo vsnprintf.lo vsprintf.lo'
> ~~~
>
> There is no mention of this in the known issues list here:
> http://sourceware.org/glibc/wiki/Testing/Gnulib
>
> (d) strstr
>
> This is the one that worries me.
>
> configure:121073: checking whether strstr works in linear time
> configure:121161: gcc -lm
> -Wl,-rpath=/scratch/carloso/build4-lucid-cs/install//lib64:/scratch/carloso/build4-lucid-cs/install//usr/lib64
> -Wl,--dynamic-linker=/scratch/carloso/build4-lucid-cs/install//lib64/ld-linux-x86-64.so.2
> -std=gnu99 -o conftest -g -O2 -nostdinc
> -I/usr/local/tools/gcc-4.3.3/lib/gcc/x86_64-unknown-linux-gnu/4.3.2/include-fixed
> -I/scratch/carloso/build4-lucid-cs/install//usr/include
> -I/usr/local/tools/gcc-4.3.3/lib/gcc/x86_64-unknown-linux-gnu/4.3.2/include
> -Wall  conftest.c  >&5
> configure:121165: $? = 0
> configure:121171: ./conftest
> configure:121175: $? = 142
> configure: program exited with status 142
> configure: failed program was:
> | /* confdefs.h.  */
> ...
> | /* end confdefs.h.  */
> |
> | #include <signal.h> /* for signal */
> | #include <string.h> /* for strstr */
> | #include <stdlib.h> /* for malloc */
> | #include <unistd.h> /* for alarm */
> | static void quit (int sig) { exit (sig + 128); }
> |
> | int
> | main ()
> | {
> |
> |     int result = 0;
> |     size_t m = 1000000;
> |     char *haystack = (char *) malloc (2 * m + 2);
> |     char *needle = (char *) malloc (m + 2);
> |     /* Failure to compile this test due to missing alarm is okay,
> |        since all such platforms (mingw) also have quadratic strstr.  */
> |     signal (SIGALRM, quit);
> |     alarm (5);
> |     /* Check for quadratic performance.  */
> |     if (haystack && needle)
> |       {
> |         memset (haystack, 'A', 2 * m);
> |         haystack[2 * m] = 'B';
> |         haystack[2 * m + 1] = 0;
> |         memset (needle, 'A', m);
> |         needle[m] = 'B';
> |         needle[m + 1] = 0;
> |         if (!strstr (haystack, needle))
> |           result |= 1;
> |       }
> |     return result;
> |
> |   ;
> |   return 0;
> | }
> configure:121193: result: no
>
> This looks like it might be a real bug.
>
> Removing the alarm(5); and attaching to the long running process shows:
>
> GNU gdb (Sourcery CodeBench 2012.09-1 - Preview) 7.2.50.20100908-cvs
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu".
> For bug reporting instructions, please see:
> <https://support.codesourcery.com/GNUToolchain/>.
> (gdb) attach 8102
> Attaching to process 8102
> Reading symbols from /tmp/testdir/conftest...done.
> warning: Could not load shared library symbols for linux-vdso.so.1.
> Do you need "set solib-search-path" or "set sysroot"?
> Reading symbols from
> /scratch/carloso/build4-lucid-cs/install//lib64/libm.so.6...done.
> Loaded symbols for /scratch/carloso/build4-lucid-cs/install//lib64/libm.so.6
> Reading symbols from
> /scratch/carloso/build4-lucid-cs/install//lib64/libc.so.6...done.
> Loaded symbols for /scratch/carloso/build4-lucid-cs/install//lib64/libc.so.6
> Reading symbols from
> /scratch/carloso/build4-lucid-cs/install//lib64/ld-linux-x86-64.so.2...done.
> Loaded symbols for
> /scratch/carloso/build4-lucid-cs/install//lib64/ld-linux-x86-64.so.2
> 0x00007fcc57c6c272 in __m128i_strloadu (s1=0x7fcc582c625d 'A' <repeats 200
> times>..., s2=0x7fcc57a56010 'A' <repeats 200 times>...) at
> ../sysdeps/x86_64/multiarch/strstr.c:92
> 92        if (__builtin_expect ((int) ((size_t) p & 0xfff) > 0xff0, 0))
> (gdb) bt
> #0  0x00007fcc57c6c272 in __m128i_strloadu (s1=0x7fcc582c625d 'A' <repeats
> 200 times>..., s2=0x7fcc57a56010 'A' <repeats 200 times>...) at
> ../sysdeps/x86_64/multiarch/strstr.c:92
> #1  __strstr_sse42 (s1=0x7fcc582c625d 'A' <repeats 200 times>...,
> s2=0x7fcc57a56010 'A' <repeats 200 times>...) at
> ../sysdeps/x86_64/multiarch/strstr.c:317
> #2  0x0000000000400734 in main () at conftest.c:1019
> (gdb) c
> Continuing.
>
> Does this mean we've picked up quadratic behaviour in the supposedly fast
> SSE42 optimized strstr? :-(
>
> The test finishes in an awesome 5 minutes :-)
>
> carloso@build4-lucid-cs:/tmp/testdir$ time ./conftest
> real    5m22.318s
> user    4m21.060s
> sys     0m0.000s
>
> Could someone please verify this?
>
> Cheers,
> Carlos.
> --
> Carlos O'Donell
> Mentor Graphics / CodeSourcery
> carlos_odonell@mentor.com
> carlos@codesourcery.com
> +1 (613) 963 1026
>
#include <stdlib.h>
#include <string.h>

static int *compute_prefix_function(char *pattern, int psize)
{
	int k = -1;
	int i = 1;
	int *pi = malloc(sizeof(int)*psize);
	if (!pi)
		return NULL;

	pi[0] = k;
	for (i = 1; i < psize; i++) {
		while (k > -1 && pattern[k+1] != pattern[i])
			k = pi[k];
		if (pattern[i] == pattern[k+1])
			k++;
		pi[i] = k;
	}
	return pi;
}

static int kmp(char *target, int tsize, char *pattern, int psize)
{
	int i;
	int *pi = compute_prefix_function(pattern, psize);
	int k = -1;
	if (!pi)
		return -1;
	for (i = 0; i < tsize; i++) {
		while (k > -1 && pattern[k+1] != target[i])
			k = pi[k];
		if (target[i] == pattern[k+1])
			k++;
		if (k == psize - 1) {
			free(pi);
			return i-k;
		}
	}
	free(pi);
	return -1;
}

char *strstr_kmp  (char * haystack, char * needle) {

   int index = kmp(haystack, strlen(haystack), needle, strlen(needle));
   if (index == -1) return NULL;
   return haystack + index;
}

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]