This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.
Re: [PATCHv2] ARM: NEON optimized implementation of memcpy.
On Tue, Jul 14, 2009 at 08:17:22PM +0300, Siarhei Siamashka wrote:
> > We also have a NEON memcpy at CodeSourcery (and performance improvements
> > to non-NEON memcpy), as well as versions of some other string functions,
> > adapted to glibc, that ARM recently contributed to newlib, but those are
> > also waiting on copyright assignments from ARM. I haven't compared the
> > performance of the two implementations.
>
> Do you have this code available to the general public somewhere already? I can
> benchmark your implementations of these functions and provide some feedback.
Sure. If you grab our latest Lite Edition tools from the web site,
you'll get this code in either the source or the binary package.
http://www.codesourcery.com/sgpp/lite/arm
> It looks like __aeabi_memcpy* may need a separate implementation anyway. Any
> extra hops are bad for the performance. Though saving and restoring NEON
> registers should not add too much overhead.
Yes, it ought to get a separate implementation; we haven't done this
yet because GCC doesn't generate calls to them.
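To illustrate the extra hop (a minimal sketch, not the glibc or CodeSourcery code): the AEABI helpers take the same arguments as memcpy but return void, so the simplest version just forwards to memcpy. The sketch_ names below are hypothetical stand-ins, chosen so the example does not clash with the real runtime symbols.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for __aeabi_memcpy: same arguments as memcpy,
   but no return value. */
void sketch_aeabi_memcpy(void *dest, const void *src, size_t n)
{
    /* The extra hop: forward to the general memcpy and drop its result. */
    (void) memcpy(dest, src, n);
}

/* Hypothetical stand-in for __aeabi_memcpy4: the caller guarantees that
   both pointers are 4-byte aligned. A dedicated implementation could skip
   its alignment checks; this forwarding wrapper cannot take advantage of
   that guarantee. */
void sketch_aeabi_memcpy4(void *dest, const void *src, size_t n)
{
    sketch_aeabi_memcpy(dest, src, n);
}
```

A separate implementation would replace the forwarding call with its own copy loop, removing the hop.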
The NEON restriction on __aeabi_memcpy* is a bit weird. These functions are
supposed to be optimized for large transfers, which is where NEON is most
likely to be useful.
--
Daniel Jacobowitz
CodeSourcery