This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCHv6] powerpc: Add hwcap/hwcap2/platform data to TCB

From: Peter Bergner <bergner at vnet dot ibm dot com>
To: "Carlos O'Donell" <carlos at redhat dot com>, Carlos Eduardo Seo <cseo at linux dot vnet dot ibm dot com>, GNU C Library <libc-alpha at sourceware dot org>
Cc: "Steven J. Munroe" <sjmunroe at us dot ibm dot com>, Tulio Machado <tuliom at linux dot vnet dot ibm dot com>
Date: Tue, 27 Oct 2015 21:49:11 -0500
Subject: Re: [PATCHv6] powerpc: Add hwcap/hwcap2/platform data to TCB
Authentication-results: sourceware.org; auth=none
References: <487359FC-25A4-449F-8A43-76340C42C5BC at linux dot vnet dot ibm dot com> <56302282 dot 2080402 at redhat dot com>

On Tue, 2015-10-27 at 21:18 -0400, Carlos O'Donell wrote:
> (a) Use of uint64_t vs. casting.
> 
> How much slower is uint64_t vs. casting?
> 
> The call to __tcb_parse_hwcap_and_convert_at_platform is
> in the fast path for process startup. So any code we add
> counts, and this is the kind of thing where we add a little
> bit at a time until the startup is slower than we wanted.
> 
> So if Peter says uint64_t is slow, we should listen and
> make that faster.

There shouldn't be a difference in 64-bit mode.  The hypothetical
slowness is in 32-bit mode, where the 64-bit operations are all
done in register pairs.  That said, when I actually look at the
code being generated using uint64_t versus casting, the code looks
very similar, so I retract my objections to using uint64_t.

Looking deeper, what is saving us the slowness is that we're just
doing logical operations that are all limited to the least significant
32 bits of the variables, which the compiler recognizes and takes
advantage of.  Had we done adds, subtracts, etc. then we'd see
larger/slower code.
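
To illustrate the point above, here is a minimal sketch of the two styles
being compared.  It is not the code from the patch: the feature bit values
and the function names (has_features_u64, has_features_cast, add_u64,
check_equivalence) are hypothetical and chosen only for illustration.  The
assumption is that every mask bit lives in the low 32 bits, so a 32-bit
compiler can test the low word alone in either variant, whereas the add has
to handle a carry into the high word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits, all within the low 32 bits. */
#define FEATURE_A UINT64_C(0x00000001)
#define FEATURE_B UINT64_C(0x00000100)
#define FEATURE_MASK (FEATURE_A | FEATURE_B)

/* Variant 1: test the bits directly on the 64-bit value. */
int has_features_u64(uint64_t hwcap)
{
  return (hwcap & FEATURE_MASK) == FEATURE_MASK;
}

/* Variant 2: truncate to 32 bits first, then test. */
int has_features_cast(uint64_t hwcap)
{
  uint32_t lo = (uint32_t) hwcap;
  uint32_t mask = (uint32_t) FEATURE_MASK;
  return (lo & mask) == mask;
}

/* For contrast: an add cannot ignore the upper word, because a carry
   out of the low 32 bits changes the high 32 bits.  */
uint64_t add_u64(uint64_t a, uint64_t b)
{
  return a + b;
}

/* Both variants agree whenever the mask fits in 32 bits. */
void check_equivalence(uint64_t hwcap)
{
  assert(has_features_u64(hwcap) == has_features_cast(hwcap));
}
```

Compiling both variants for a 32-bit target and comparing the assembly is
one way to check whether the compiler really does treat them alike on a
given toolchain.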

Curiously enough, when I look at the 64-bit generated code for
both methods, the code is much larger than the equivalent 32-bit
code.  It seems basic block reordering is going wild and duplicating
the setting of __tcb_hwcap and returns.  I'll have a look at why
we're doing that.  I'll note there is a little bit of the same
problem in 32-bit, but not to the extent seen in 64-bit.

> (b) Unique HWCAP and HWCAP2 offsets in TCB.
> 
> I have no opinion. You and Peter need to resolve this
> and ensure gcc operates as expected.

Given what I'm seeing doing logical operations on uint64_t in
32-bit mode, I'm not worried about this any more, so go with
what you have and I'll follow along.

> In summary:
> 
> If you fix (a), and the ChangeLog, then it's good for me.

I have no more objections on (a) given what I mentioned above.

Peter

Follow-Ups:
- Re: [PATCHv6] powerpc: Add hwcap/hwcap2/platform data to TCB
  - From: Carlos Eduardo Seo

References:
- [PATCHv6] powerpc: Add hwcap/hwcap2/platform data to TCB
  - From: Carlos Eduardo Seo
- Re: [PATCHv6] powerpc: Add hwcap/hwcap2/platform data to TCB
  - From: Carlos O'Donell
