This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] aarch64: Optimized memcmp for medium to large sizes
- From: Siddhesh Poyarekar <siddhesh at gotplt dot org>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>, libc-alpha at sourceware dot org
- Date: Tue, 6 Mar 2018 19:24:58 +0530
- Subject: Re: [PATCH] aarch64: Optimized memcmp for medium to large sizes
- Authentication-results: sourceware.org; auth=none
- References: <20180202045056.3121-1-siddhesh@sourceware.org> <afbe572b-a759-e699-58cd-6bc10c92f950@linaro.org>
On Tuesday 06 March 2018 06:45 PM, Adhemerval Zanella wrote:
> On 02/02/2018 02:50, Siddhesh Poyarekar wrote:
>> This improved memcmp provides a fast path for compares up to 16 bytes
>> and then compares 16 bytes at a time, thus optimizing loads from both
>> sources. The glibc memcmp microbenchmark retains performance (with an
>> error of ~1ns) for smaller compare sizes and reduces execution time
>> by up to 31% for compares up to 4K on the APM Mustang. On Qualcomm
>> Falkor this improves to almost 48%, i.e. almost a 2x improvement
>> for sizes of 2K and above.
>>
>> * sysdeps/aarch64/memcmp.S: Widen comparison to 16 bytes at a
>> time.
>
> LGTM with some comment clarifications below.
Thanks, fixed up and pushed.
Siddhesh
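
For readers of the archive, the approach described in the quoted commit message (a fast path for compares up to 16 bytes, then 16 bytes per iteration) can be sketched in portable C. This is only an illustration under stated assumptions, not the code in sysdeps/aarch64/memcmp.S: the names memcmp_wide_sketch and cmp_bytes are hypothetical, and the real routine is hand-written assembly that locates the differing byte without a byte-wise rescan.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: byte-wise compare with memcmp-style sign. */
static int cmp_bytes(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

/* Illustrative sketch only; not the glibc implementation. */
int memcmp_wide_sketch(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = s1;
    const unsigned char *b = s2;

    /* Fast path: small compares skip the wide loop entirely. */
    if (n <= 16)
        return cmp_bytes(a, b, n);

    /* Main loop: two 8-byte loads from each source per iteration.
       memcpy keeps the loads valid for unaligned pointers. */
    while (n >= 16) {
        uint64_t a0, a1, b0, b1;
        memcpy(&a0, a, 8);
        memcpy(&a1, a + 8, 8);
        memcpy(&b0, b, 8);
        memcpy(&b1, b + 8, 8);
        if (a0 != b0 || a1 != b1)
            return cmp_bytes(a, b, 16); /* rescan block for the first mismatch */
        a += 16;
        b += 16;
        n -= 16;
    }

    /* Tail: fewer than 16 bytes remain. */
    return cmp_bytes(a, b, n);
}
```

The sketch returns -1, 0, or 1, which satisfies the memcmp contract of a negative, zero, or positive result; callers should only test the sign.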