This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] powerpc: New feature - HWCAP/HWCAP2 bits in the TCB
- From: Rich Felker <dalias at libc dot org>
- To: libc-alpha at sourceware dot org
- Date: Tue, 9 Jun 2015 12:45:47 -0400
- References: <55760314 dot 6070601 at linux dot vnet dot ibm dot com> <5576FC80 dot 1090806 at arm dot com> <1433862393 dot 21101 dot 9 dot camel at sjmunroe-ThinkPad-W500> <20150609154223 dot GA20028 at domone> <1433865684 dot 21101 dot 20 dot camel at sjmunroe-ThinkPad-W500>
On Tue, Jun 09, 2015 at 11:01:24AM -0500, Steven Munroe wrote:
> On Tue, 2015-06-09 at 17:42 +0200, Ondřej Bílka wrote:
> > On Tue, Jun 09, 2015 at 10:06:33AM -0500, Steven Munroe wrote:
> > > On Tue, 2015-06-09 at 15:47 +0100, Szabolcs Nagy wrote:
> > > >
> > > > On 08/06/15 22:03, Carlos Eduardo Seo wrote:
> > > > > The proposed patch adds a new feature for powerpc. In order to get
> > > > > faster access to the HWCAP/HWCAP2 bits, we now store them in the TCB.
> > > > > This enables users to write versioned code based on the HWCAP bits
> > > > > without going through the overhead of reading them from the auxiliary
> > > > > vector.
> > >
> > > > i assume this is for multi-versioning.
> > >
> > > The intent is for the compiler to implement the equivalent of
> > > __builtin_cpu_supports("feature"). X86 has the cpuid instruction, POWER
> > > is RISC so we use the HWCAP. The trick is to access the HWCAP[2]
> > > efficiently, as getauxval and scanning the auxv are too slow for inline
> > > optimizations.
> > >
> > > > i dont see how the compiler can generate code to access the
> > > > hwcap bits currently (without making assumptions about libc
> > > > interfaces).
> > > >
> > > These offsets will become a durable part of the PowerPC 64-bit ELF V2 ABI.
> > >
> > > The TCB offsets are already fixed and can not change from release to
> > > release.
> > >
> > I don't have a problem with this, but why do you add TLS? How can different
> > threads have different ones when the kernel could move them between cores?
> >
> > So instead we just add the following two variables to the libc API. These would
> > be initialized by the linker, as we will probably use them internally.
> >
> > extern int __hwcap, __hwcap2;
> >
> The Power ABIs address the TCB off a dedicated GPR (R2 or R13). This
> guarantees a one-instruction load from the TCB.
>
> A static variable would require an indirect load via the TOC/GOT

I do not see this as a justification. There are a lot more pressing
things with respect to performance that could be micro-optimized by
adding TCB ABI for them, but it's not done because it's the wrong
solution.

> (which can be megabytes for a large program/library). I really really
> want to avoid that.

The size of the GOT is utterly irrelevant to the performance of reading
an element from it, so I don't see why you brought this up.

Rich
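
For readers following the thread, here is a minimal sketch of the
global-variable alternative being discussed: read AT_HWCAP and AT_HWCAP2
once from the auxiliary vector and keep them in ordinary variables. It
assumes Linux with a glibc that provides getauxval() and a GCC or Clang
compiler for the constructor attribute. The names cached_hwcap,
cached_hwcap2, init_hwcap_cache and hwcap_has are hypothetical and are
not glibc interfaces; the thread proposes __hwcap/__hwcap2 as int, while
getauxval() returns unsigned long, which is what the sketch uses.

```c
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP, AT_HWCAP2 */

/* Hypothetical cache variables, not part of any libc API. */
static unsigned long cached_hwcap;
static unsigned long cached_hwcap2;

/* Runs once before main() (GCC/Clang extension), so later feature
   tests never scan the auxiliary vector again. */
__attribute__((constructor))
static void init_hwcap_cache(void)
{
    cached_hwcap  = getauxval(AT_HWCAP);
    cached_hwcap2 = getauxval(AT_HWCAP2);
}

/* Returns nonzero if every bit in mask is set in AT_HWCAP.
   The mask values are architecture-specific. */
static inline int hwcap_has(unsigned long mask)
{
    return (cached_hwcap & mask) == mask;
}

/* Same test against AT_HWCAP2. */
static inline int hwcap2_has(unsigned long mask)
{
    return (cached_hwcap2 & mask) == mask;
}
```

On powerpc a caller would pass a mask such as PPC_FEATURE_HAS_ALTIVEC.
How many instructions the load of cached_hwcap costs depends on how the
variable is addressed (for example via the TOC/GOT in position-independent
code), which is the point under debate above; the sketch does not settle it.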