This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH][AArch64] Optimized memcpy/memmove


On Fri, Sep 25, 2015 at 02:16:33PM +0100, Wilco Dijkstra wrote:
> Further optimize memcpy/memmove for AArch64. Copies are split into 3 main cases: small copies of up
> to 16 bytes, medium copies of 17..96 bytes which are fully unrolled. Large copies of more than 96
> bytes align the destination and use an unrolled loop processing 64 bytes per iteration. In order to
> share code with memmove, small and medium copies read all data before writing, allowing any kind of
> overlap. All memmoves except for the large backwards case fall into memcpy for optimal performance.
> On a random copy test memcpy/memmove are 40% faster on A57 and 28% on A53.
>

Looks OK at a high level. I haven't inspected this patch in detail, but
you should test it with dryrun to see the real impact on performance.

I would simply alias memcpy to memmove here, since the performance
impact is minimal when the overlap check is done only for sizes larger
than 96 bytes.
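
To illustrate the idea, here is a rough C sketch, not the code from the
patch. The name sketch_memmove, the temporary buffer and the byte loops
are assumptions made for illustration only; the real routine is AArch64
assembly that loads into registers and uses an unrolled 64-byte loop.
Copies of up to 96 bytes read everything before writing, so overlap
never matters, and only larger copies pay for a direction check.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the glibc entry point.  */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (n <= 96) {
        /* Small and medium copies: read all data before writing any,
           so any kind of overlap between src and dst is harmless.  */
        unsigned char tmp[96];
        memcpy(tmp, s, n);
        memcpy(d, tmp, n);
        return dst;
    }

    /* Large copies: the only place where an overlap check is needed.
       The unsigned difference wraps to a huge value when dst is below
       src, so a single compare selects the backwards case.  */
    if ((uintptr_t)d - (uintptr_t)s < n) {
        /* dst starts inside [src, src + n): copy from the end.  */
        while (n--)
            d[n] = s[n];
    } else {
        /* No overlap, or dst below src: a forward copy is safe.  */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    }
    return dst;
}
```

With this shape a memcpy caller would pay for one extra compare, and
only on copies larger than 96 bytes, which is why a plain alias seems
cheap.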

> OK for commit?
> 
> ChangeLog:
> 2015-09-25  Wilco Dijkstra  <wdijkstr@arm.com>
> 
> 	* sysdeps/aarch64/memcpy.S (memcpy):
> 	Rewrite of optimized memcpy and memmove.
> 	* sysdeps/aarch64/memmove.S (memmove): Remove
> 	memmove code (merged into memcpy.S).

> ---
>  sysdeps/aarch64/memcpy.S  | 350 +++++++++++++++++++++++++++-------------------
>  sysdeps/aarch64/memmove.S | 311 +---------------------------------------
>  2 files changed, 210 insertions(+), 451 deletions(-)
> 
> diff --git a/sysdeps/aarch64/memcpy.S b/sysdeps/aarch64/memcpy.S
> index b3d550e..51e7268 100644
> --- a/sysdeps/aarch64/memcpy.S
> +++ b/sysdeps/aarch64/memcpy.S
> @@ -9,168 +9,236 @@
>  
>     The GNU C Library is distributed in the hope that it will be useful,
>     but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
>     Lesser General Public License for more details.
>
A substitution has gone awry here: the two spaces after "PURPOSE." were
replaced with a tab and a space.

