This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.

Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.

From: Will Newton <will dot newton at linaro dot org>
To: Ondřej Bílka <neleai at seznam dot cz>
Cc: Måns Rullgård <mans at mansr dot com>, libc-ports at sourceware dot org, Patch Tracking <patches at linaro dot org>
Date: Thu, 18 Apr 2013 10:47:26 +0100
Subject: Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
References: <516BCEE5 dot 9070809 at linaro dot org> <yw1x8v4k6rcc dot fsf at unicorn dot mansr dot com> <CANu=DmjJUZ319+7_M8cyxMga_rYxbGb_QSs87Q29JBdkKX_97g at mail dot gmail dot com> <20130418093900 dot GA3653 at domone dot kolej dot mff dot cuni dot cz>

On 18 April 2013 10:39, Ondřej Bílka <neleai@seznam.cz> wrote:
> On Mon, Apr 15, 2013 at 11:38:49AM +0100, Will Newton wrote:
>> On 15 April 2013 11:06, Måns Rullgård <mans@mansr.com> wrote:
>>
>> Hi Måns,
>>
>> >> Add a high performance memcpy routine optimized for Cortex-A15 with
>> >> variants for use in the presence of NEON and VFP hardware, selected
>> >> at runtime using indirect function support.
>> >
>> > How does this perform on Cortex-A9?
>>
>> The code is also faster on A9 although the gains are not quite as
>> pronounced. A set of numbers is attached (they linewrap pretty
>> horribly inline).
>>
>>
> I forgot to ask where to get the benchmark source. Without it there is no
> way to tell whether it was done correctly.
> You must randomly vary sizes in the range n..2n and also vary alignments.

The benchmark is taken from the cortex-strings package:

https://launchpad.net/cortex-strings

I wrote a wrapper around the benchmark to vary alignment in {1, 2, 4,
8} and a variety of block lengths between 8 and 200.
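
For readers without the wrapper to hand, the following is a minimal sketch of
what such a harness might look like. It is not the actual wrapper or the
cortex-strings benchmark; the names time_copies, random_len, copy_fn, MAX_LEN
and BASE_ALIGN are hypothetical, and it assumes a C11 compiler. As a
simplification, source and destination share the same offset from a
64-byte-aligned base, and timing uses the coarse standard clock() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical limits: room for lengths up to 2 * 200 plus the offset. */
enum { MAX_LEN = 400, BASE_ALIGN = 64 };

/* Call through a volatile pointer so the compiler cannot inline or
   discard the copy being measured. */
static void *(*volatile copy_fn)(void *, const void *, size_t) = memcpy;

/* Pick a length in [n, 2n], as suggested in the quoted review comment. */
static size_t random_len(size_t n)
{
    return n + (size_t)rand() % (n + 1);
}

/* Time `iterations` copies of `len` bytes, with both buffers offset by
   `align` bytes from a 64-byte-aligned base. Returns CPU seconds. */
static double time_copies(size_t align, size_t len, long iterations)
{
    static _Alignas(BASE_ALIGN) unsigned char src[MAX_LEN + BASE_ALIGN];
    static _Alignas(BASE_ALIGN) unsigned char dst[MAX_LEN + BASE_ALIGN];

    assert(align < BASE_ALIGN && len <= MAX_LEN && iterations > 0);
    memset(src, 0xA5, sizeof src);

    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        copy_fn(dst + align, src + align, len);
    clock_t end = clock();

    /* Sanity check: the last copy must have produced identical bytes. */
    assert(memcmp(dst + align, src + align, len) == 0);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A driver would call time_copies() for each alignment in {1, 2, 4, 8} and each
block length of interest between 8 and 200, optionally substituting
random_len(n) for a fixed length to get the n..2n variation.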

--
Will Newton
Toolchain Working Group, Linaro

Follow-Ups:
- Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
  - From: Ondřej Bílka

References:
- [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
  - From: Will Newton
- Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
  - From: Måns Rullgård
- Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
  - From: Will Newton
- Re: [PATCH] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
  - From: Ondřej Bílka
