This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: PATCH: 64bit SSE4 optimized memcmp


On Tue, Apr 13, 2010 at 7:11 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Apr 13, 2010 at 6:59 AM, Ulrich Drepper <drepper@redhat.com> wrote:
>> On 04/12/2010 03:40 PM, H.J. Lu wrote:
>>> This is a 64-bit SSE4 optimized memcmp. It improves memcmp performance
>>> by up to 3X on Intel Core i7.
>>
>> I don't see any SSE4.2 instructions being used. ptest is an SSE4.1
>> instruction.
>
> You are right. It is SSE4.1.
>
>> Also, your Makefile change contains references to a whole set of other
>> files which haven't yet been submitted and therefore the patch doesn't
>> apply.
>>
>
> We are working on an SSSE3 optimized memcpy. It is not ready yet.
>
> Please ignore it. I will submit a new one.
>

Here is the patch to optimize for unaligned data. Tested on Core i7
and Core 2. It improves performance by up to another 100%.
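
For readers without the attachment, here is a rough sketch in C of how
ptest can drive a 16-byte-at-a-time comparison on unaligned data. It is
not taken from the patch, which is written in assembly. The names
block16_equal and sketch_memcmp are hypothetical, and the runtime check
via the GCC/Clang builtin __builtin_cpu_supports only stands in for
glibc's multiarch selection. A GCC- or Clang-compatible compiler is
assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h>   /* SSE4.1 intrinsics, including _mm_testz_si128 (ptest) */

/* Hypothetical helper: returns 1 if the 16 bytes at a and b are identical.
   The target attribute lets this compile without -msse4.1 on the command line. */
static __attribute__((target("sse4.1")))
int block16_equal(const unsigned char *a, const unsigned char *b)
{
    /* Unaligned loads (movdqu): neither pointer needs 16-byte alignment. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* The XOR is all zero exactly when the two blocks match. */
    __m128i diff = _mm_xor_si128(va, vb);
    /* ptest: returns 1 when (diff & diff) has no bits set. */
    return _mm_testz_si128(diff, diff);
}
#endif

/* Hypothetical memcmp-like function. Returns <0, 0 or >0; only the sign
   is meaningful, as with memcmp. */
int sketch_memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1;
    const unsigned char *b = s2;
    size_t i = 0;

    assert(n == 0 || (a != NULL && b != NULL));

#if defined(__x86_64__) || defined(__i386__)
    /* Skip over equal 16-byte blocks only when the CPU reports SSE4.1. */
    if (__builtin_cpu_supports("sse4.1")) {
        while (n - i >= 16 && block16_equal(a + i, b + i))
            i += 16;
    }
#endif

    /* Byte-wise finish: handles the tail and locates the first differing
       byte inside a block that did not match. */
    for (; i < n; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}
```

A call such as sketch_memcmp(buf1 + 1, buf2 + 3, len) should behave like
memcmp for any alignment. This sketch makes no claim to match the
performance of the hand-written assembly.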

Thanks.

-- 
H.J.
---
2010-04-14  H.J. Lu  <hongjiu.lu@intel.com>

	* sysdeps/x86_64/multiarch/memcmp-sse4.S: Optimized for unaligned
	data.

Attachment: libc-memcmp-64-sse4-2.patch
Description: Text document

