This is the mail archive of the
libc-ports@sources.redhat.com
mailing list for the libc-ports project.
Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
- From: Will Newton <will dot newton at linaro dot org>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: libc-ports at sourceware dot org, Patch Tracking <patches at linaro dot org>
- Date: Mon, 15 Apr 2013 11:59:27 +0100
- Subject: Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
- References: <516BCEE5 dot 9070809 at linaro dot org> <CANu=DmhNPNDCy8mMw6q41+kA_WDMPRXWqq2kuzNOgfCB3wfQ6g at mail dot gmail dot com> <20130415102327 dot GA7032 at domone dot kolej dot mff dot cuni dot cz>
On 15 April 2013 11:23, Ondřej Bílka <neleai@seznam.cz> wrote:
> On Mon, Apr 15, 2013 at 11:01:37AM +0100, Will Newton wrote:
>> Attached are a set of benchmarks of the new code versus the existing
>> memcpy implementation on a Cortex-A15 platform.
>>
>
> As I wrote in the previous thread:
>
> On Thu, Apr 04, 2013 at 08:37:01AM +0200, Ondřej Bílka wrote:
>> Also try benchmarking with real-world data (20MB). I put it at
>> http://kam.mff.cuni.cz/~ondra/dryrun_memcpy.tar.bz2
>>
>> To add NEON, copy the test_generic.c file and add compilation of the
>> NEON implementation to the benchmark script.
>>
>> It currently measures only total time.
>> I would need something like a timestamp counter for more detailed
>> results.
>
> How well does it fare on my benchmark?
It wasn't clear to me how to integrate my code and run the tests. I
built a version of replay.c with each memcpy implementation, and the
new one ran in 20% less time, but I don't know if I did that
correctly.
--
Will Newton
Toolchain Working Group, Linaro