This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH][AArch64] Optimized memcpy/memmove
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Wilco Dijkstra <wdijkstr at arm dot com>
- Cc: 'GNU C Library' <libc-alpha at sourceware dot org>
- Date: Sun, 27 Sep 2015 10:43:19 +0200
- Subject: Re: [PATCH][AArch64] Optimized memcpy/memmove
- Authentication-results: sourceware.org; auth=none
- References: <002901d0f794$66138480$323a8d80$ at com>
On Fri, Sep 25, 2015 at 02:16:33PM +0100, Wilco Dijkstra wrote:
> Further optimize memcpy/memmove for AArch64. Copies are split into 3 main cases: small copies of up
> to 16 bytes, medium copies of 17..96 bytes which are fully unrolled. Large copies of more than 96
> bytes align the destination and use an unrolled loop processing 64 bytes per iteration. In order to
> share code with memmove, small and medium copies read all data before writing, allowing any kind of
> overlap. All memmoves except for the large backwards case fall into memcpy for optimal performance.
> On a random copy test memcpy/memmove are 40% faster on A57 and 28% on A53.
>
Looks OK at a high level. I haven't inspected this patch in detail, but
you should test it with dryrun to see the real impact on performance.
I would simply alias memcpy to memmove here, as the performance impact
is minimal when you do the check only for sizes larger than 96 bytes.
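
To illustrate the aliasing idea, here is a minimal C sketch. It is not the
code from the patch: sketch_memmove is a hypothetical name, and the 96-byte
temporary buffer only stands in for the registers that the assembly would
use to hold the loaded data. Small and medium copies read everything before
writing, so they need no overlap check; only the large path tests the
direction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: one routine usable as both memcpy and memmove.  */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (n <= 96) {
        /* Small and medium sizes: load all data first, then store it.
           Any kind of overlap is safe, so no check is needed here.
           (tmp stands in for registers; this is an assumption.)  */
        unsigned char tmp[96];
        memcpy(tmp, s, n);
        memcpy(d, tmp, n);
        return dst;
    }

    /* Large sizes: the only path that pays for an overlap check.
       The unsigned difference is below n exactly when dst lies
       inside [src, src + n), which requires a backwards copy.  */
    if ((uintptr_t)d - (uintptr_t)s < n) {
        while (n--)
            d[n] = s[n];
    } else {
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    }
    return dst;
}
```

With this layout a plain memcpy call pays one extra compare and branch, and
only on the large path.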
> OK for commit?
>
> ChangeLog:
> 2015-09-25 Wilco Dijkstra <wdijkstr@arm.com>
>
> * sysdeps/aarch64/memcpy.S (memcpy):
> Rewrite of optimized memcpy and memmove.
> * sysdeps/aarch64/memmove.S (memmove): Remove
> memmove code (merged into memcpy.S).
> ---
> sysdeps/aarch64/memcpy.S | 350 +++++++++++++++++++++++++++-------------------
> sysdeps/aarch64/memmove.S | 311 +---------------------------------------
> 2 files changed, 210 insertions(+), 451 deletions(-)
>
> diff --git a/sysdeps/aarch64/memcpy.S b/sysdeps/aarch64/memcpy.S
> index b3d550e..51e7268 100644
> --- a/sysdeps/aarch64/memcpy.S
> +++ b/sysdeps/aarch64/memcpy.S
> @@ -9,168 +9,236 @@
>
> The GNU C Library is distributed in the hope that it will be useful,
> but WITHOUT ANY WARRANTY; without even the implied warranty of
> - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> Lesser General Public License for more details.
>
Substitution gone awry here.