This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCHv6] powerpc: Add hwcap/hwcap2/platform data to TCB
- From: Peter Bergner <bergner at vnet dot ibm dot com>
- To: "Carlos O'Donell" <carlos at redhat dot com>, Carlos Eduardo Seo <cseo at linux dot vnet dot ibm dot com>, GNU C Library <libc-alpha at sourceware dot org>
- Cc: "Steven J. Munroe" <sjmunroe at us dot ibm dot com>, Tulio Machado <tuliom at linux dot vnet dot ibm dot com>
- Date: Tue, 27 Oct 2015 21:49:11 -0500
- Subject: Re: [PATCHv6] powerpc: Add hwcap/hwcap2/platform data to TCB
- References: <487359FC-25A4-449F-8A43-76340C42C5BC at linux dot vnet dot ibm dot com> <56302282 dot 2080402 at redhat dot com>
On Tue, 2015-10-27 at 21:18 -0400, Carlos O'Donell wrote:
> (a) Use of uint64_t vs. casting.
>
> How much slower is uint64_t vs. casting?
>
> The call to __tcb_parse_hwcap_and_convert_at_platform is
> in the fast path for process startup. So any code we add
> counts, and this is the kind of thing where we add a little
> bit at a time until the startup is slower than we wanted.
>
> So if Peter says uint64_t is slow, we should listen and
> make that faster.
There shouldn't be a difference in 64-bit mode. The hypothetical
slowness is in 32-bit mode, where the 64-bit operations are all
done in register pairs. That said, when I actually look at the
code being generated using uint64_t versus casting, the code looks
very similar, so I retract my objections to using uint64_t.
Looking deeper, what is saving us from the slowness is that we're just
doing logical operations that are all limited to the least significant
32 bits of the variables, which the compiler recognizes and takes
advantage of. Had we done adds, subtracts, etc., then we'd see
larger/slower code.
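As a rough illustration of the point above (a sketch only, not the code from the patch), the snippet below contrasts a logical test on a uint64_t with the same test done through a cast, and with a 64-bit add. The names EXAMPLE_FEATURE_BIT, has_feature_u64, has_feature_cast, add_u64 and check_equivalence are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bit that lives in the low 32 bits of a
   64-bit capability mask.  Not a real glibc or kernel constant.  */
#define EXAMPLE_FEATURE_BIT UINT64_C (0x00000080)

/* Logical test directly on the 64-bit value.  Only the low word
   takes part, so on a 32-bit target the compiler can ignore the
   high half of the register pair.  */
int
has_feature_u64 (uint64_t hwcap)
{
  return (hwcap & EXAMPLE_FEATURE_BIT) != 0;
}

/* The same test with explicit casts down to 32 bits.  */
int
has_feature_cast (uint64_t hwcap)
{
  return ((uint32_t) hwcap & (uint32_t) EXAMPLE_FEATURE_BIT) != 0;
}

/* By contrast, a 64-bit add has to carry from the low word into
   the high word, so a 32-bit target needs both halves.  */
uint64_t
add_u64 (uint64_t a, uint64_t b)
{
  return a + b;
}

/* Both forms of the logical test agree for any input.  */
void
check_equivalence (void)
{
  assert (has_feature_u64 (UINT64_C (0x80))
          == has_feature_cast (UINT64_C (0x80)));
  assert (has_feature_u64 (UINT64_C (0xFFFFFFFF00000000))
          == has_feature_cast (UINT64_C (0xFFFFFFFF00000000)));
}
```

Compiling something like this for a 32-bit target and comparing the assembly of the two has_feature functions is one way to see whether the compiler treats them alike; the exact output will depend on the compiler version and optimization flags.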
Curiously enough, when I look at the 64-bit generated code for
both methods, the code is much larger than the equivalent 32-bit
code. It seems basic block reordering is going wild and duplicating
the setting of __tcb_hwcap and returns. I'll have a look at why
we're doing that. I'll note there is a little bit of the same
problem in 32-bit, but not to the extent seen in 64-bit.
> (b) Unique HWCAP and HWCAP2 offsets in TCB.
>
> I have no opinion. You and Peter need to resolve this
> and ensure gcc operates as expected.
Given what I'm seeing when doing logical operations on uint64_t in
32-bit mode, I'm not worried about this any more, so go with
what you have and I'll follow along.
> In summary:
>
> If you fix (a), and the ChangeLog, then it's good for me.
I have no more objections on (a) given what I mentioned above.
Peter