This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.
Re: [PATCH, v5] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
- From: Will Newton <will dot newton at linaro dot org>
- To: Roland McGrath <roland at hack dot frob dot com>
- Cc: libc-ports at sourceware dot org, Patch Tracking <patches at linaro dot org>
- Date: Wed, 1 May 2013 16:26:04 +0100
- Subject: Re: [PATCH, v5] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
- References: <517FF73E dot 5020509 at linaro dot org> <20130430171818 dot 697972C08A at topped-with-meat dot com>
On 30 April 2013 18:18, Roland McGrath <roland@hack.frob.com> wrote:
Hi Roland,
Thanks for the review!
>> +++ b/ports/sysdeps/arm/armv7/multiarch/aeabi_memcpy.c
>> @@ -0,0 +1,33 @@
>> +/* Copyright (C) 2005-2013 Free Software Foundation, Inc.
>
> The first line of each new file should be a descriptive comment.
>
>> +void *__memcpy_arm (void *dest, const void *src, size_t n);
>> +
>> +/* Copy memory like memcpy, but no return value required. Can't alias
>> + to memcpy because it's not defined in the same translation
>> + unit. */
>
> Why not just define the aliases in memcpy.S instead?
> (Then this can be an empty file just to override arm/aeabi_memcpy.c.)
>
> You should also add some comments about why it's important that the
> __aeabi_* functions use __memcpy_arm rather than memcpy.
Done.
>> +++ b/ports/sysdeps/arm/armv7/multiarch/ifunc-impl-list.c
>> @@ -0,0 +1,44 @@
>> +/* Enumerate available IFUNC implementations of a function. arm version.
>
> ARM in caps.
Done.
>> + IFUNC_IMPL_ADD (array, i, memcpy, hwcap & HWCAP_ARM_VFPv3,
>> + __memcpy_vfp)
>
> HWCAP_ARM_VFP.
Yep, not sure how that slipped back in.
>> +ENTRY(memcpy)
>> + .type memcpy, %gnu_indirect_function
>> + ldr r1, .Lmemcpy_arm
>> + tst r0, #HWCAP_ARM_VFP
>> + ldrne r1, .Lmemcpy_vfp
>
> If __SOFTFP__ is predefined by the compiler, then the compiler is presuming
> VFP support anyway. So you can make this:
>
> #ifdef __SOFTFP__
> ldr r1, .Lmemcpy_arm
> tst r0, #HWCAP_ARM_VFP
> ldrne r1, .Lmemcpy_vfp
> #else
> ldr r1, .Lmemcpy_vfp
> #endif
>
> (and also conditionalize .Lmemcpy_arm, below).
I'm not sure I follow the logic here. Could you elaborate?
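For reference, a rough C sketch of the selection logic in the quoted
#ifdef, as I read it. The names sketch_memcpy_arm, sketch_memcpy_vfp and
select_memcpy are hypothetical stand-ins (the real routines are assembly),
and HWCAP_ARM_VFP is defined locally as an assumption so that the sketch
builds outside glibc:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: local definition of the VFP bit in the hwcap word, so the
   sketch does not depend on kernel or glibc headers.  */
#define HWCAP_ARM_VFP (1UL << 6)

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Hypothetical stand-ins for __memcpy_arm and __memcpy_vfp; both simply
   forward to the C library here.  */
void *
sketch_memcpy_arm (void *dest, const void *src, size_t n)
{
  return memcpy (dest, src, n);
}

void *
sketch_memcpy_vfp (void *dest, const void *src, size_t n)
{
  return memcpy (dest, src, n);
}

/* Pick an implementation the way the quoted resolver fragment does.  */
memcpy_fn
select_memcpy (unsigned long hwcap)
{
#ifdef __SOFTFP__
  /* Soft-float build: VFP is not guaranteed, so test hwcap at run time.  */
  return (hwcap & HWCAP_ARM_VFP) ? sketch_memcpy_vfp : sketch_memcpy_arm;
#else
  /* The compiler already emits VFP code, so no run-time test is needed.  */
  (void) hwcap;
  return sketch_memcpy_vfp;
#endif
}
```

In a build where __SOFTFP__ is not defined the run-time test disappears and
only the VFP routine is referenced, which seems to be the reason for also
conditionalizing .Lmemcpy_arm.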
>> +# If we are configuring for armv7 we need binutils 2.21 to ensure that
>> +# NEON alignments are assembled correctly.
>> +if test $machine = arm/armv7; then
>> + AC_CHECK_PROG_VER(AS, $AS, --version,
>> + [GNU assembler.* \([0-9]*\.[0-9.]*\)],
>> + [2.1[0-9][0-9]*|2.[2-9][1-9]*|[3-9].*|[1-9][0-9]*], AS=: critic_missing="$critic_missing as")
>> +fi
>
> Just put this in sysdeps/arm/armv7/configure.in and don't test $machine.
> Whenever possible, it is far better to test for an actual relevant detail
> empirically rather than testing version numbers. If you can show an
> example of an instruction sequence that is misassembled by binutils 2.20
> then we can help you construct a more precise configure check.
I've moved the configure fragment as you suggest. I'll look into
creating a test for the assembler breakage.
--
Will Newton
Toolchain Working Group, Linaro