This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
On Tue, Apr 13, 2010 at 7:11 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Apr 13, 2010 at 6:59 AM, Ulrich Drepper <drepper@redhat.com> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 04/12/2010 03:40 PM, H.J. Lu wrote:
>>> This is 64bit SSE4 optimized memcmp. It improves memcmp by upto 3X
>>> on Intel Core i7.
>>
>> I don't see any SSE4.2 instructions being used.  ptest is an SSE4.1
>> instruction.
>
> You are right. It is SSE4.1.
>
>> Also, your Makefile change contains references to a whole set of other
>> files which haven't yet been submitted and therefore the patch doesn't
>> apply.
>>
>
> We are working on SSSE3 optimized memcpy. It is not ready yet.
>
> Please ignore it. I will submit a new one.
>

Here is the patch to optimize for unaligned data. Tested on Core i7 and
Core 2. It improves performance by up to another 100%.

Thanks.

--
H.J.
---
2010-04-14  H.J. Lu  <hongjiu.lu@intel.com>

	* sysdeps/x86_64/multiarch/memcmp-sse4.S: Optimized for unaligned
	data.

Attachment:
libc-memcmp-64-sse4-2.patch
Description: Text document