
Re: [PATCH][AArch64] Optimized memcpy/memmove


ping

________________________________________
From: Wilco Dijkstra
Sent: 15 December 2015 16:40
To: 'GNU C Library'
Cc: nd
Subject: Re: [PATCH][AArch64] Optimized memcpy/memmove

-----Original Message-----
From: Wilco Dijkstra [mailto:wdijkstr@arm.com]
Sent: 25 September 2015 14:17
To: 'GNU C Library'
Subject: [PATCH][AArch64] Optimized memcpy/memmove

Further optimize memcpy/memmove for AArch64. Copies are split into three main cases: small copies of up to 16 bytes; medium copies of 17..96 bytes, which are fully unrolled; and large copies of more than 96 bytes, which align the destination and use an unrolled loop processing 64 bytes per iteration. To share code with memmove, small and medium copies read all data before writing any, allowing the source and destination to overlap in any way (see the sketch below). All memmoves except the large backwards case fall through into memcpy for optimal performance. On a random copy test, memcpy/memmove are 40% faster on A57 and 28% faster on A53.
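For illustration, here is a minimal C sketch of the overlap-safe small-copy idea for the 9..16 byte case (the function name and the C rendering are mine; the actual patch implements this in AArch64 assembly):

#include <stdint.h>
#include <string.h>

/* Sketch only: copy n bytes (9 <= n <= 16) by loading the first and
   last 8 bytes before performing any store.  The two loads cover the
   whole range (overlapping in the middle), so the stores are correct
   even when dst and src overlap -- the same trick that lets the patch
   share the small/medium copy code between memcpy and memmove.  */
static void copy_9_16 (char *dst, const char *src, size_t n)
{
  uint64_t lo, hi;
  memcpy (&lo, src, 8);          /* first 8 bytes */
  memcpy (&hi, src + n - 8, 8);  /* last 8 bytes  */
  memcpy (dst, &lo, 8);
  memcpy (dst + n - 8, &hi, 8);
}

Using memcpy for the fixed 8-byte loads and stores is the standard portable way to express unaligned accesses in C; compilers typically lower each call to a single load or store instruction.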

OK for commit?

ChangeLog:
2015-09-25  Wilco Dijkstra  <wdijkstr@arm.com>

        * sysdeps/aarch64/memcpy.S (memcpy):
        Rewrite of optimized memcpy and memmove.
        * sysdeps/aarch64/memmove.S (memmove): Remove
        memmove code (merged into memcpy.S).

Attachment: 0001-Optimized-memcpy.txt
Description: 0001-Optimized-memcpy.txt

