This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [COMMITED] faster memcpy on x64.
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Liubov Dmitrieva <liubov dot dmitrieva at gmail dot com>
- Cc: Andreas Jaeger <aj at suse dot com>, GNU C Library <libc-alpha at sourceware dot org>, "H.J. Lu" <hjl dot tools at gmail dot com>
- Date: Thu, 29 Aug 2013 17:01:42 +0200
- Subject: Re: [COMMITED] faster memcpy on x64.
- Authentication-results: sourceware.org; auth=none
- References: <20130427221620 dot GA16537 at domone dot kolej dot mff dot cuni dot cz> <518BB251 dot 7040602 at suse dot com> <CAHjhQ93bxAexzbymP6GN-08wLiu9mdxf2MoCXgqA1v-ONYJdFw at mail dot gmail dot com> <CAHjhQ90BGDy1XVWUQui8Tx7PzO0Y6pUAEBXHxTqXH8NAbBGvHw at mail dot gmail dot com> <51927FB1 dot 1070904 at suse dot com> <20130520081458 dot GB814 at domone dot kolej dot mff dot cuni dot cz> <CAHjhQ916H83byoyeNnSzMvd7nHqeUp=TqyMuQE0j8hjcAx7_tg at mail dot gmail dot com> <CAHjhQ91625YkBJmtrURhaMpwf63E4UPG00LrtTLnJ=nf=uTi7A at mail dot gmail dot com>
On Thu, Aug 29, 2013 at 06:54:21PM +0400, Liubov Dmitrieva wrote:
> It also looks very confusing that we don't have the same unaligned
> version for mempcpy and still use the ssse3 version.
> It is very easy to support mempcpy in memcpy-sse2-unaligned.S file.
>
That's added in
http://www.sourceware.org/ml/libc-alpha/2013-08/msg00280.html
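
For readers unfamiliar with mempcpy: it differs from memcpy only in its return value, which points just past the last byte written, which is why one copy routine can serve both entry points. A minimal C sketch of that relationship follows. The names mempcpy_sketch and join_two are hypothetical, chosen for illustration, and this is not the code in memcpy-sse2-unaligned.S.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a glibc symbol: same copy as memcpy,
   but returns the address just past the last byte written. */
static void *mempcpy_sketch(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    return (char *)dst + n;
}

/* Usage: append two buffers without recomputing the write offset.
   Returns the total number of bytes written to out. */
static size_t join_two(char *out, const char *a, size_t na,
                       const char *b, size_t nb)
{
    char *p = mempcpy_sketch(out, a, na);
    p = mempcpy_sketch(p, b, nb);
    assert((size_t)(p - out) == na + nb);
    return (size_t)(p - out);
}
```

In an assembly implementation the same effect would typically come from a separate entry point that sets up the different return value before sharing the copy body; that detail is an assumption here, not something shown in this thread.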
> --
> Liubov
>
> On Thu, Aug 29, 2013 at 6:45 PM, Liubov Dmitrieva
> <liubov.dmitrieva@gmail.com> wrote:
> > It looks like there is some confusion in the merged patch. I think it is
> > supposed to use a different flag (at least that looks more logical); you
> > only need to turn it on for Bulldozer or whatever other AMD machines the
> > version is also good for.
> >
> > diff --git a/sysdeps/x86_64/multiarch/memcpy.S b/sysdeps/x86_64/multiarch/memcpy.S
> > index a1e5031..f6a44d2 100644
> > --- a/sysdeps/x86_64/multiarch/memcpy.S
> > +++ b/sysdeps/x86_64/multiarch/memcpy.S
> > @@ -33,8 +33,8 @@ ENTRY(__new_memcpy)
> > jne 1f
> > call __init_cpu_features
> > 1: leaq __memcpy_sse2(%rip), %rax
> > - testl $bit_Slow_BSF, __cpu_features+FEATURE_OFFSET+index_Slow_BSF(%rip)
> > - jnz 2f
> > + testl $bit_Fast_Unaligned_Load, __cpu_features+FEATURE_OFFSET+index_Fast_Unaligned_Load(%rip)
> > + jz 2f
> > leaq __memcpy_sse2_unaligned(%rip), %rax
> > ret
> > 2: testl $bit_SSSE3, __cpu_features+CPUID_OFFSET+index_SSSE3(%rip)
> >
> >
> > And you forgot to remove the version that is now never selected as memcpy
> > from the ifunc-impl-list:
> >
> >
> > diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> > index 28d3579..d6a7f4f 100644
> > --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> > +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> > @@ -224,8 +224,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> >
> > /* Support sysdeps/x86_64/multiarch/memcpy.S. */
> > IFUNC_IMPL (i, name, memcpy,
> > - IFUNC_IMPL_ADD (array, i, memcpy, HAS_SSSE3,
> > - __memcpy_ssse3_back)
> > IFUNC_IMPL_ADD (array, i, memcpy, HAS_SSSE3, __memcpy_ssse3)
> > IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_sse2_unaligned)
> > IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_sse2))
> >
> >
> > --
> > Liubov
> >
> > On Mon, May 20, 2013 at 12:14 PM, Ondřej Bílka <neleai@seznam.cz> wrote:
> >> Committed.
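
The selection order proposed in the quoted memcpy.S patch can be sketched in C roughly as below. This is an illustration only: struct cpu_flags, enum memcpy_impl and select_memcpy are hypothetical names standing in for the __cpu_features bits and the assembly selector. The diff does not show what follows the SSSE3 test at label 2, so the sketch assumes it picks __memcpy_ssse3 when SSSE3 is present and otherwise keeps the __memcpy_sse2 default loaded at label 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bits glibc keeps in __cpu_features. */
struct cpu_flags {
    bool fast_unaligned_load;  /* models bit_Fast_Unaligned_Load */
    bool has_ssse3;            /* models bit_SSSE3 */
};

/* Hypothetical tags for the implementations named in the thread. */
enum memcpy_impl {
    MEMCPY_SSE2,            /* __memcpy_sse2 */
    MEMCPY_SSE2_UNALIGNED,  /* __memcpy_sse2_unaligned */
    MEMCPY_SSSE3            /* __memcpy_ssse3 */
};

static enum memcpy_impl select_memcpy(const struct cpu_flags *f)
{
    assert(f != NULL);

    /* Proposed change: key the unaligned version on
       Fast_Unaligned_Load instead of on Slow_BSF being clear. */
    if (f->fast_unaligned_load)
        return MEMCPY_SSE2_UNALIGNED;

    /* Assumed continuation after label 2 (not shown in the diff). */
    if (f->has_ssse3)
        return MEMCPY_SSSE3;

    return MEMCPY_SSE2;  /* default loaded at label 1 */
}
```

With this ordering, the unaligned version is chosen only on CPUs that advertise fast unaligned loads, and everything else falls through to the older checks.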
--
Insert coin for new game