This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] vectorized string functions


Ondrej,

>> +sysdep_routines += strnlen strnlen_sse2 strnlen_ssse3 strnlen_sse4_1
>> +  CFLAGS-strnlen_ssse3.c  += -mssse3
>> +  CFLAGS-strnlen_sse4_1.c  += -msse4

It seems to me that you sometimes produce too many versions.

Take strnlen as an example:
Objdump shows that strnlen_sse2 and strnlen_ssse3 are exactly the same (GCC
does not generate any SSSE3 instructions).
strnlen_sse4_1 differs from the others only in using ptest instead of the
pmovmskb + testl pair. That substitution is known to have almost no effect
on performance, yet we still pay the IFUNC wrapper overhead.
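
For reference, here is a minimal sketch of the pmovmskb + test pattern
written with SSE2 intrinsics. It is not the code from the patch: the name
sketch_strnlen is hypothetical, and the layout (scalar prologue, aligned
16-byte loads, scalar tail) is only an assumption about how such a routine
is typically structured. It relies on the GCC/Clang __SSE2__ macro and
__builtin_ctz, and falls back to a plain byte loop elsewhere.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical illustration only; not the glibc implementation. */
size_t sketch_strnlen(const char *s, size_t maxlen)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();

    /* Scalar prologue: advance until s + i is 16-byte aligned, so that
       the aligned loads below never straddle a page boundary. */
    while (i < maxlen && ((uintptr_t)(s + i) & 15) != 0) {
        if (s[i] == '\0')
            return i;
        i++;
    }

    /* Main loop: pcmpeqb marks zero bytes, pmovmskb packs the marks into
       a 16-bit mask, and the mask is then tested against zero. */
    while (i + 16 <= maxlen) {
        __m128i chunk = _mm_load_si128((const __m128i *)(s + i));
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
        if (mask != 0)
            return i + (size_t)__builtin_ctz((unsigned)mask);
        i += 16;
    }
#endif
    /* Scalar tail (and the whole routine when SSE2 is unavailable). */
    while (i < maxlen && s[i] != '\0')
        i++;
    return i;
}
```

An SSE4.1 variant would swap the mask test for a ptest (for example via
_mm_testz_si128), but the position of the first zero byte still has to be
extracted afterwards, so the rest of the loop stays the same.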

>> delete mode 100644 sysdeps/x86_64/multiarch/strnlen-sse2-no-bsf.S

We should also check for regressions on an Atom machine before removing the
Atom-specific no_bsf version.

--
Liubov Dmitrieva
Intel Corporation
