This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] ARM: Add support for AT_HWCAP2 in _dl_procinfo


On 26/06/14 14:07, Will Newton wrote:
> On 26 June 2014 11:01, Richard Earnshaw <rearnsha@arm.com> wrote:
>> On 26/06/14 10:36, Will Newton wrote:
>>> On 26 June 2014 10:14, Richard Earnshaw <rearnsha@arm.com> wrote:
>>>> On 25/06/14 14:12, Will Newton wrote:
>>>>> Add support for the new HWCAP2 values for ARMv8 added in the
>>>>> 3.15 kernel. Tested using QEMU which supports these extensions.
>>>>>
>>>>> ChangeLog:
>>>>>
>>>>> 2014-06-25  Will Newton  <will.newton@linaro.org>
>>>>>
>>>>>       * sysdeps/unix/sysv/linux/arm/dl-procinfo.c
>>>>>       (_dl_arm_cap_flags): Add HWCAP2 values.
>>>>>       * sysdeps/unix/sysv/linux/arm/dl-procinfo.h
>>>>>       (_DL_HWCAP_COUNT): Increase to 37.
>>>>>       (_DL_HWCAP_LAST): New define.
>>>>>       (_DL_HWCAP2_LAST): New define.
>>>>>       (_dl_procinfo): Add support for printing
>>>>>       AT_HWCAP2 entries.
>>>>>       (_dl_string_hwcap): Use _dl_hwcap_string.
>>>>
>>>> I don't have a specific comment about this patch.
>>>>
>>>> I do have a general comment that I think the HWCAPs exported by the
>>>> kernel for 32-bit ARM are a joke.  The principal problem is that there
>>>> is precisely zero way to determine the base architecture.  You cannot
>>>> even tell whether you are running on ARMv6 or ARMv7, let alone whether
>>>> you have key features such as Thumb2.
>>>
>>> I agree, and the situation on AArch64 looks no better. It's also pretty
>>> much impossible to determine the micro-architecture from userland,
>>> which may not be much of an issue for ARM but is likely to matter
>>> much more for AArch64.
>>>
>>
>> There has to be a better way of addressing that issue than reading the
>> microarchitecture name and then switching on that.  The list is
>> potentially unbounded: what do you do when you encounter a new
>> micro-architecture?
> 
> In the context of an ifunc resolver, do what you do now, which is to
> use the hwcap bits.

And that's my major gripe.  They don't work, since they don't tell you
what you really need to know.  There's no HWCAP bit to say Thumb2;
similarly there's no HWCAP bit to say that you've got the integer SIMD
instructions or the v5e DSP instructions.  It's assumed you can work
this out from the architecture name; but there's insufficient
specification for that to make it work going forwards to new
architecture variants.

The net result is that you have to start making inferences based on
other HWCAPS (eg if I've got Neon, then I've got thumb2); but the
inverse is not always implied (for example, if I have thumb2 I don't
necessarily have Neon), so there will still be cases that are missed.
It's also dangerous to do this since some of the inferences may end up
being incorrect in the long run.
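
To make the one-way nature of that inference concrete, here is a minimal
sketch (not part of the patch). The bit value matches the kernel's NEON
bit for 32-bit ARM, but it is repeated under a hypothetical name so the
sketch compiles on any host; the enum and function names are likewise
invented for illustration. The function can say "present" when Neon is
reported, but it can never say "absent": a core with Thumb2 and no Neon
simply falls into "unknown".

```c
/* Same value as the NEON hwcap bit on 32-bit ARM Linux (bit 12);
   the SKETCH_ prefix marks it as a local, illustrative name.  */
#define SKETCH_HWCAP_NEON (1UL << 12)

/* Deliberately no "absent" state: the hwcap word cannot prove it.  */
enum feature_state
{
  FEATURE_UNKNOWN = 0,
  FEATURE_PRESENT = 1
};

/* Infer Thumb2 from the AT_HWCAP word.  Neon implies Thumb2, but the
   lack of Neon tells us nothing either way.  */
enum feature_state
thumb2_from_hwcap (unsigned long hwcap)
{
  if (hwcap & SKETCH_HWCAP_NEON)
    return FEATURE_PRESENT;
  return FEATURE_UNKNOWN;
}
```

In ordinary user-space code the argument would normally come from
getauxval (AT_HWCAP) in <sys/auxv.h>; every caller still needs a
fallback path for the "unknown" answer.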

I don't have an answer to any of this as of today, other than to go read
the full hardware features bits out of CP15; but that's a privileged
operation not available to user-space code.



> The use case that I can imagine is where we have a
> microarchitecture that suffers from a particularly poor behaviour with
> e.g. the default memcpy then we can switch it to use a custom version.
> At the moment the only way I can see to deal with that is read /proc
> and stash that information inside ld.so somewhere at startup but that
> is rather ugly...
> 

For some of these, putting the relevant code in a VDSO might be a better
approach; at least for critical system routines.

>>>> I've heard it suggested that you can parse the architecture string (eg
>>>> armv7l), but 1) the format of this string is not precisely defined in a
>>>> way that allows you to predict what future cores will generate and 2)
>>>> parsing strings in ifunc code when function calls can't be made is
>>>> likely to be hairy at best.
>>>
>>> I think the platform is probably the best way to pass that info. The
>>> kernel currently sets it to:
>>>
>>>         snprintf(elf_platform, ELF_PLATFORM_SIZE, "%s%c",
>>>                  list->elf_name, ENDIANNESS);
>>>
>>> Where elf_name is one of:
>>>
>>> v4
>>> v5
>>> v5t
>>> v6
>>> v7
>>> v7m
>>>
>>> That doesn't look too intractable, and we can work with the kernel
>>> guys to make sure nothing too surprising is added there.
>>
>> Until you realize that these do not have a total ordering; that is,
>> while you can write
>>
>> v4 < v5 < v5t < v6 < v7
>>
>> You cannot insert v7m in that list at any point, since it is both more
>> and less than v6.  In fact, it's both more and less than the baseline
>> v7, since it also has a divide instruction.
> 
> I don't see why we would care about ordering these values - isn't it
> just a symbolic value? What use case do you have in mind?

Again, the sort of question like, can I use Thumb2 instructions?
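
As a rough sketch of why that question needs a per-name answer rather
than a comparison, the following looks each platform string up in a
table instead of ordering the names. It assumes the "%s%c" format quoted
above with an endianness suffix of 'l' or 'b'; the type, table and
function names are hypothetical. Treating "v6" as unknown is my
assumption, since ARMv6T2 parts have Thumb2 and plain ARMv6 parts do
not. It also calls strlen and memcmp, so it is ordinary user-space code
and not something that could go into an ifunc resolver as-is.

```c
#include <stddef.h>
#include <string.h>

enum answer
{
  ANSWER_UNKNOWN = -1,
  ANSWER_NO = 0,
  ANSWER_YES = 1
};

struct platform_entry
{
  const char *elf_name;   /* Name without the endianness suffix.  */
  enum answer thumb2;     /* Per-name answer; no ordering implied.  */
};

/* One row per elf_name listed above.  The "v6" entry is an assumption.  */
static const struct platform_entry platform_table[] =
{
  { "v4",  ANSWER_NO },
  { "v5",  ANSWER_NO },
  { "v5t", ANSWER_NO },
  { "v6",  ANSWER_UNKNOWN },
  { "v7",  ANSWER_YES },
  { "v7m", ANSWER_YES },
};

/* Answer "can I use Thumb2?" for a platform string such as "v7l".
   Anything not in the table, including future names, is unknown.  */
enum answer
platform_has_thumb2 (const char *platform)
{
  size_t len = platform != NULL ? strlen (platform) : 0;
  if (len < 2)
    return ANSWER_UNKNOWN;

  /* The last character must be the endianness marker.  */
  char endian = platform[len - 1];
  if (endian != 'l' && endian != 'b')
    return ANSWER_UNKNOWN;
  len--;

  for (size_t i = 0; i < sizeof platform_table / sizeof platform_table[0]; i++)
    {
      const char *name = platform_table[i].elf_name;
      if (strlen (name) == len && memcmp (platform, name, len) == 0)
        return platform_table[i].thumb2;
    }
  return ANSWER_UNKNOWN;
}
```

The table has to grow with every new name the kernel reports, which is
the unbounded-list problem again in a different guise.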

> 
>>> The string
>>> parsing of architecture revision is pretty trivial in those cases.
>>> Perhaps this is something we can discuss at the GNU Tools Cauldron
>>> next month.
>>>
>>> On AArch64 the platform string is hardcoded to "aarch64" or "aarch64_be". :-/
>>>
>>
>> Yeah, but then, I don't think reading this string is a useful way of
>> solving this problem.
> 
> Do you have an alternative proposal? ;-)
> 

Not really, at this time.  Sorry.

R.

