This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: PATCH: Add 32bit SSE2 strlen
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Ulrich Drepper <drepper at redhat dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Tue, 4 Aug 2009 17:08:08 -0700
- Subject: Re: PATCH: Add 32bit SSE2 strlen
- References: <20090804204810.GA5273@lucon.org> <4A78C501.7000801@redhat.com>
On Tue, Aug 4, 2009 at 4:32 PM, Ulrich Drepper<drepper@redhat.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> H.J. Lu wrote:
>> I added slow_vector to avoid SSE
>> vector instructions on Atom instead of disabling feature bits so that
>> SSE can be used in selected functions.
>
> Why?  Give me a real reason, especially for the 64-bit code.  This
> solution adds code for no reason.  In which situation do you want to use
> SSSE3 on Atom?
>
In the 32-bit SSE2 strlen, the selection code is:

1:      leal    __strlen_ia32@GOTOFF(%ebx), %eax
        /* Bit 26 of CPUID leaf 1 EDX is SSE2; keep the ia32 version if clear.  */
        testl   $(1<<26), CPUID_OFFSET+COMMON_CPUID_INDEX_1*CPUID_SIZE+CPUID_EDX_OFFSET+__cpu_features@GOTOFF(%ebx)
        jz      2f
        /* Keep the ia32 version if slow_vector is set.  */
        cmpl    $1, SLOW_VECTOR_OFFSET+__cpu_features@GOTOFF(%ebx)
        je      2f
        leal    __strlen_sse2@GOTOFF(%ebx), %eax
2:      popl    %ebx

I don't want to disable SSE2 on Atom entirely, so I added slow_vector. I can
limit it to 32-bit and leave 64-bit alone.
--
H.J.
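
For readers following the thread, the selector quoted in the message can be sketched in C roughly as follows. This is only an illustration of the decision order (SSE2 feature bit first, then the slow_vector flag). The struct layout, the `_sketch` names, and the two placeholder strlen bodies are assumptions made for this example and are not glibc's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for glibc's internal __cpu_features; the real
   layout is not shown in the message.  */
struct cpu_features_sketch
{
  uint32_t cpuid1_edx;   /* EDX as returned by CPUID leaf 1.  */
  int slow_vector;       /* 1 when vector code should be avoided (e.g. Atom).  */
};

/* CPUID leaf 1, EDX bit 26 reports SSE2 support.  */
#define SSE2_BIT (UINT32_C (1) << 26)

typedef size_t (*strlen_fn) (const char *);

/* Placeholder for __strlen_ia32: a plain byte-by-byte loop.  */
static size_t
strlen_ia32_sketch (const char *s)
{
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  return n;
}

/* Placeholder for __strlen_sse2: defers to the C library here, since the
   real SSE2 implementation is not part of the message.  */
static size_t
strlen_sse2_sketch (const char *s)
{
  return strlen (s);
}

/* Mirrors the quoted assembly: start with the ia32 version, and switch
   to the SSE2 version only if SSE2 is present and slow_vector is not 1.  */
static strlen_fn
select_strlen (const struct cpu_features_sketch *cf)
{
  strlen_fn fn = strlen_ia32_sketch;

  if ((cf->cpuid1_edx & SSE2_BIT) == 0)   /* testl ... ; jz 2f  */
    return fn;
  if (cf->slow_vector == 1)               /* cmpl $1, ... ; je 2f  */
    return fn;
  return strlen_sse2_sketch;
}
```

With this shape, a CPU that reports SSE2 but has slow_vector set still gets the ia32 strlen, while the SSE2 feature bit itself stays visible to any other function that wants to use it.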