This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.



Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.


On Mon, 12 Aug 2013, Will Newton wrote:

> A small change to the entry to the aligned copy loop improves
> performance slightly on A9 and A15 cores for certain copies.

Could you clarify what you mean by "certain copies"?

In particular, have you verified that for all three choices in this code 
(NEON, VFP or neither), the code for unaligned copies is at least as fast 
in this case (common 32-bit alignment, but not common 64-bit alignment) as 
the code that would previously have been used in those cases?
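
For concreteness, the case in question (source and destination sharing 
32-bit alignment but not 64-bit alignment) can be sketched in C.  This is 
only an illustration, not code from the patch: the helper name 
mutual_alignment is hypothetical, and it simply reports the largest 
power-of-two size, up to 8 bytes, at which both pointers have the same 
offset.

```c
#include <stdint.h>

/* Hypothetical helper, not part of memcpy_impl.S: return the largest
   power-of-two size (capped at 8 bytes) for which dst and src have the
   same offset, i.e. their "mutual alignment".  Two addresses share an
   offset modulo 2^k exactly when their low k bits are equal, so the
   XOR of the addresses has those bits clear.  */
static unsigned
mutual_alignment (const void *dst, const void *src)
{
  uintptr_t diff = (uintptr_t) dst ^ (uintptr_t) src;

  if ((diff & 7) == 0)
    return 8;   /* Common 64-bit alignment.  */
  if ((diff & 3) == 0)
    return 4;   /* Common 32-bit alignment only: the case asked about.  */
  if ((diff & 1) == 0)
    return 2;
  return 1;     /* No common alignment beyond bytes.  */
}
```

A result of 4 here corresponds to copies that, as I understand the patch, 
would no longer enter the aligned copy loop and so would be handled by the 
unaligned code instead.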

There are various comments regarding alignment, whether stating "LDRD/STRD 
support unaligned word accesses" or referring to the mutual alignment that 
applies for particular code.  Does this patch make any of them out of 
date?  (If code can now only be reached with common 64-bit alignment, but 
in fact requires only 32-bit alignment, the comment should probably state 
both those things explicitly.)

-- 
Joseph S. Myers
joseph@codesourcery.com

