This is the mail archive of the
libc-ports@sources.redhat.com
mailing list for the libc-ports project.
Re: [PATCH, v3] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
- From: Will Newton <will dot newton at linaro dot org>
- To: "Joseph S. Myers" <joseph at codesourcery dot com>
- Cc: libc-ports at sourceware dot org, Patch Tracking <patches at linaro dot org>
- Date: Tue, 30 Apr 2013 10:46:46 +0100
- Subject: Re: [PATCH, v3] ARM: Add Cortex-A15 optimized NEON and VFP memcpy routines, with IFUNC.
- References: <51752F94 dot 8080105 at linaro dot org> <Pine dot LNX dot 4 dot 64 dot 1304261652090 dot 27966 at digraph dot polyomino dot org dot uk>
On 26 April 2013 18:05, Joseph S. Myers <joseph@codesourcery.com> wrote:
Hi Joseph,
Thanks for the further patch review!
>> diff --git a/ports/sysdeps/arm/armv7/multiarch/ifunc-impl-list.c b/ports/sysdeps/arm/armv7/multiarch/ifunc-impl-list.c
>
>> +#include <assert.h>
>> +#include <string.h>
>> +#include <wchar.h>
>> +#include <ldsodefs.h>
>> +#include <sysdep.h>
>> +#include <ifunc-impl-list.h>
>
> What's the need for <assert.h> and <wchar.h>? I don't see anything
> obviously using them in this file. (If it's an indirect use, it's
> probably better to make the header in question include the other header it
> needs, so each file is including exactly those headers from which it uses
> functionality directly.)
They aren't used, I've removed them.
>> + IFUNC_IMPL_ADD (array, i, memcpy, hwcap & HWCAP_ARM_VFPv3,
>> + __memcpy_vfp)
>
> Why VFPv3 here? In the .S file you use HWCAP_ARM_VFP.
Yes, VFP is the correct constant, fixed.
>> diff --git a/ports/sysdeps/arm/armv7/multiarch/memcpy.S b/ports/sysdeps/arm/armv7/multiarch/memcpy.S
>
>> +#include <arm-features.h>
>
> I don't see any actual use of anything from arm-features.h here.
> (Arguably you *should* be using it - if ARM_HAVE_VFP, you don't need to
> consider the possibility of VFP being missing, although you still need to
> build __memcpy_arm for __aeabi_memcpy use. And if the compiler has NEON
> enabled for the build, __memcpy_vfp isn't needed at all, although in that
> case it would be better to avoid all the IFUNC handling and just have
> __memcpy_arm for __aeabi_memcpy and use in ld.so. But in any case, either
> use arm-features.h or remove the include.)
I've removed the include.
>> +/* Old versions of GAS incorrectly implement the NEON align semantics. */
>
> State explicit version numbers, "versions before x.y". That way we can
> tell when this code is obsolete because the versions in question aren't
> supported by glibc.
>
> But, since I don't see a configure test that would set this
> BROKEN_ASM_NEON_ALIGN macro, if it's relevant for binutils 2.20 (the
> oldest version supported by glibc at present) then there should be such a
> test, and if it's not relevant for 2.20 then the conditionals shouldn't be
> here at all. (Or, you could make a case for a more recent binutils
> version as the minimum on ARM, but there should probably be a configure
> test for that if a more recent version is needed.)
AFAIK binutils 2.20 and earlier are broken. I've added a configure
test to make sure arm builds with 2.21.
>> +#define PC_OFFSET 8 /* PC pipeline compensation. */
>
> We have a macro PC_OFS in sysdep.h, I suggest using it.
Done.
>> +.Lcpy_not_short:
>> + /* At least 64 bytes to copy, but don't know the alignment yet. */
>> + str tmp2, [sp, #-FRAME_SIZE]!
>
> All stack adjustments and saves/restores of call-preserved registers
> should have associated CFI annotations for debug info (you may also need
> cfi_remember_state / cfi_restore_start in appropriate places where
> branches are involved, so that the CFI is correct at all places in the
> function; this is just the first example of a place needing CFI that I
> saw).
I've added CFI annotations which are hopefully correct. I would
appreciate if you could have a look at the next version of the patch
though as I'm not that conversant with CFI.
Thanks!
--
Will Newton
Toolchain Working Group, Linaro