This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.
Re: [PATCH, AARCH64] Optimized memcpy
- From: Marcus Shawcroft <marcus dot shawcroft at gmail dot com>
- To: Wilco Dijkstra <wdijkstr at arm dot com>
- Cc: Newlib Mailing List <newlib at sourceware dot org>
- Date: Thu, 9 Jul 2015 16:03:10 +0100
- Subject: Re: [PATCH, AARCH64] Optimized memcpy
- References: <000001d0b98f$7f66f4a0$7e34dde0$ at com>
On 8 July 2015 at 16:05, Wilco Dijkstra <wdijkstr@arm.com> wrote:
> This is an optimized memcpy for AArch64. Copies are split into three main cases: small copies of up
> to 16 bytes; medium copies of 17..96 bytes, which are fully unrolled; and large copies of more than
> 96 bytes, which align the destination and use an unrolled loop processing 64 bytes per iteration.
> To share code with memmove, small and medium copies read all data before writing, allowing any kind
> of overlap. On a random copy test, memcpy is 40.8% faster on A57 and 28.4% faster on A53.
>
> ChangeLog:
> 2015-07-08 Wilco Dijkstra <wdijkstr@arm.com>
>
> * newlib/libc/machine/aarch64/memcpy.S (memcpy):
> Rewrite of optimized memcpy.
>
> OK for commit?
OK /Marcus
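The strategy described in the patch can be sketched in C for readers unfamiliar with the assembly. This is an illustrative rendering only, not the actual AArch64 implementation: the temporary buffers stand in for the register blocks the real code uses, and `copy_sketch` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch of the three-case copy strategy: small (<= 16 bytes),
 * medium (17..96 bytes, fully unrolled in the real code), and
 * large (> 96 bytes, destination-aligned 64-byte loop).
 * Small and medium copies read all data before writing any of it,
 * so they tolerate arbitrary overlap, as memmove requires. */
static void copy_sketch(unsigned char *dst, const unsigned char *src, size_t n)
{
    if (n <= 16) {
        /* Small: load everything into temporaries, then store. */
        unsigned char tmp[16];
        for (size_t i = 0; i < n; i++) tmp[i] = src[i];
        for (size_t i = 0; i < n; i++) dst[i] = tmp[i];
    } else if (n <= 96) {
        /* Medium: the assembly uses overlapping 16-byte register
         * loads/stores; a buffer stands in for those registers here. */
        unsigned char tmp[96];
        for (size_t i = 0; i < n; i++) tmp[i] = src[i];
        for (size_t i = 0; i < n; i++) dst[i] = tmp[i];
    } else {
        /* Large: copy a head chunk to 16-align dst, then move
         * 64 bytes per iteration; overlap is not handled here. */
        size_t head = 16 - ((uintptr_t)dst & 15);
        memcpy(dst, src, head);
        dst += head; src += head; n -= head;
        while (n >= 64) {
            memcpy(dst, src, 64);
            dst += 64; src += 64; n -= 64;
        }
        memcpy(dst, src, n);  /* remaining tail, < 64 bytes */
    }
}
```

The read-before-write ordering in the first two cases is what lets memcpy and memmove share one body for sizes up to 96 bytes; only the large-copy loop is memcpy-only.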