This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] aarch64: Improve strncmp for mutually misaligned inputs
On 13/03/18 13:12, Szabolcs Nagy wrote:
On 13/03/18 09:03, Siddhesh Poyarekar wrote:
Ping!
On Tuesday 06 March 2018 07:17 PM, Siddhesh Poyarekar wrote:
The mutually misaligned inputs on aarch64 are compared with a simple
byte-by-byte loop, which is not very efficient. Improve the comparison,
as strcmp already does, by loading a double-word at a time. The peak
performance improvement (i.e. 4k maxlen comparisons) due to this on
the strncmp microbenchmark is as follows:
falkor: 3.5x (up to 72% time reduction)
cortex-a73: 3.5x (up to 71% time reduction)
cortex-a53: 3.5x (up to 71% time reduction)
All mutually misaligned inputs from 16 bytes maxlen onwards show
upwards of 15% improvement and there is no measurable effect on the
performance of aligned/mutually aligned inputs.
* sysdeps/aarch64/strncmp.S (count): New macro.
(strncmp): Store misaligned length in SRC1 in COUNT.
(mutual_align): Adjust.
(misaligned8): Load dword at a time when it is safe.
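For readers without the assembly at hand, the double-word idea described
in the patch can be sketched in portable C. This is only an illustration,
not the code in sysdeps/aarch64/strncmp.S: the names sketch_strncmp and
has_zero_byte are hypothetical, memcpy stands in for an unaligned 8-byte
load, and the sketch assumes both buffers are readable for the full n
bytes so that every 8-byte load stays in bounds. The real routine cannot
assume that and has to establish for itself when a wide load is safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the 8 bytes in x is zero (classic bit trick). */
static int has_zero_byte(uint64_t x)
{
    return ((x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL) != 0;
}

/* Hypothetical sketch, not glibc code.
   ASSUMPTION: s1 and s2 are both readable for n bytes. */
static int sketch_strncmp(const char *s1, const char *s2, size_t n)
{
    assert(s1 != NULL && s2 != NULL);

    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    /* Fast path: compare 8 bytes at a time while whole dwords remain. */
    while (n >= 8) {
        uint64_t a, b;
        memcpy(&a, p1, 8);  /* portable stand-in for an unaligned load */
        memcpy(&b, p2, 8);
        if (a != b || has_zero_byte(a))
            break;          /* a difference or a NUL lies in this dword */
        p1 += 8;
        p2 += 8;
        n -= 8;
    }

    /* Slow path: find the exact byte; also handles the short tail. */
    for (; n > 0; p1++, p2++, n--) {
        if (*p1 != *p2)
            return *p1 < *p2 ? -1 : 1;
        if (*p1 == '\0')
            return 0;
    }
    return 0;
}
```

Calling sketch_strncmp(a, b + 1, sizeof a) with two equal strings, where
b is one byte longer than a, exercises the mutually misaligned case: the
two pointers can never be 8-byte aligned at the same time, yet the fast
path still consumes a dword per iteration.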
OK to commit.
(it would be nice to have the equivalent change in newlib too..)
this broke the build for me
../sysdeps/aarch64/strncmp.S: Assembler messages:
../sysdeps/aarch64/strncmp.S:211: Error: unexpected characters following instruction at operand 2 -- `mov x13,x2,lsr#3'
../sysdeps/aarch64/strncmp.S:217: Error: unexpected characters following instruction at operand 2 -- `mov x13,x2,lsr#3'
old binutils 2.26 and before did not support mov with shifted
register (only orr reg,xzr,reg,shift).
but i think a shift instruction (lsr) should be better anyway
(on most implementations).
can you please fix this?
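As a small illustration of why the spellings mentioned above are
interchangeable, the following C sketch models what the rejected
instruction computes. The helper name shift_right_3 is hypothetical, and
reading the result as "x2 divided by 8" is an interpretation of the
operands shown in the error message, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper modelling the value the instruction produces.
   Three AArch64 spellings of the same operation:
     mov x13, x2, lsr #3       : rejected by binutils 2.26 and older
     orr x13, xzr, x2, lsr #3  : the shifted-register form they accept
     lsr x13, x2, #3           : plain shift, as suggested in the reply */
static uint64_t shift_right_3(uint64_t x2)
{
    uint64_t via_orr = 0 | (x2 >> 3);  /* xzr reads as zero */
    uint64_t via_lsr = x2 >> 3;
    assert(via_orr == via_lsr);        /* both forms give the same value */
    return via_lsr;
}
```

Since OR with zero leaves the shifted value unchanged, replacing the mov
with a plain lsr does not change the result and assembles with older
binutils as well.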