This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] powerpc: New feature - HWCAP/HWCAP2 bits in the TCB
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Cc: libc-alpha at sourceware dot org
- Date: Thu, 9 Jul 2015 23:51:30 +0200
- Subject: Re: [PATCH] powerpc: New feature - HWCAP/HWCAP2 bits in the TCB
- References: <55760314 dot 6070601 at linux dot vnet dot ibm dot com> <559617FF dot 8010100 at redhat dot com> <20150703085542 dot GE32307 at domone> <55968AF8 dot 8060104 at redhat dot com> <20150703171121 dot GA23898 at domone> <1436283324 dot 12188 dot 25 dot camel at oc7878010663> <20150709190252 dot GD18030 at domone> <559ECC05 dot 8040901 at linaro dot org>
On Thu, Jul 09, 2015 at 04:31:17PM -0300, Adhemerval Zanella wrote:
>
>
> On 09-07-2015 16:02, Ondřej Bílka wrote:
> > On Tue, Jul 07, 2015 at 10:35:24AM -0500, Steven Munroe wrote:
> >> Not so simple on PowerISA as we don't have PC-relative addressing.
> >>
> >> 1) The global entry requires 2 instruction to establish the TOC/GOT
> >> 2) Medium model requires two instructions (fused) to load a pointer from
> >> the GOT.
> >> 3) Finally we can load the cached hwcap.
> >>
> >> None of this is required for the TP+offset.
> >>
> > And why didn't you write that when it was first suggested? When you don't answer,
> > it looks like you don't want to answer because that suggestion is better.
> >
> > The problem here isn't the lack of relative addressing but that you don't start
> > with the GOT in a register.
> >
> > You certainly could do a similar hack to the one you do with the tcb and place
> > the hwcap bits just after it, so you could do just one load.
> >
> > That you require so many instructions on powerpc is a gcc bug rather than a
> > rule. You don't need that many instructions when you place frequent
> > symbols in the -32768..32767 range. For example, here you could save one
> > addition.
> >
> > int x, y;
> > int foo()
> > {
> >   return x + y;
> > }
> >
> > original
> >
> > 00000000000007d0 <foo>:
> > 7d0: 02 00 4c 3c addis r2,r12,2
> > 7d4: 30 78 42 38 addi r2,r2,30768
> > 7d8: 00 00 00 60 nop
> > 7dc: 30 80 42 e9 ld r10,-32720(r2)
> > 7e0: 00 00 00 60 nop
> > 7e4: 38 80 22 e9 ld r9,-32712(r2)
> > 7e8: 00 00 6a 80 lwz r3,0(r10)
> > 7ec: 00 00 29 81 lwz r9,0(r9)
> > 7f0: 14 4a 63 7c add r3,r3,r9
> > 7f4: b4 07 63 7c extsw r3,r3
> > 7f8: 20 00 80 4e blr
> >
> > new
> >
> > addis r2,r12,2
> > ld r10,-1952(r2)
> > ld r9,-1944(r2)
> > lwz r3,0(r10)
> > lwz r9,0(r9)
> > add r3,r3,r9
> > extsw r3,r3
> > blr
>
> No, you can't. You need to take into consideration that the powerpc64le ELFv2 ABI
> has two entry points for every function, global and local, with the former being
> used when you need to materialize the TOC, while with the latter you can reuse the
> same TOC. And the compiler has no information regarding this; it has to be decided
> by the linker.
>
Of course I can; reusing the TOC is not mandatory. That would just decrease
performance a bit for local calls.
You need the majority of calls to come from a different dso for the global
entry point to be used.
Otherwise, if you use the local entry point, there is no reason to use the tcb, as
a hidden variable does the same job (and you could use the local entry point in
the plt of the same dso). An example that I previously mentioned is
compiled by
gcc hw.c h.o -O3 -fPIC -mcmodel=medium -shared
extern int __hwcap __attribute__ ((visibility ("hidden")));
int foo(int x, int y)
{
  if (__hwcap)
    return x;
  else
    return y;
}
into
0000000000000750 <foo>:
750: 02 00 4c 3c addis r2,r12,2
754: b0 78 42 38 addi r2,r2,30896
758: 00 00 00 60 nop
75c: 54 80 22 81 lwz r9,-32684(r2)
760: 00 00 89 2f cmpwi cr7,r9,0
764: 20 00 9e 4c bnelr cr7
768: 78 23 83 7c mr r3,r4
76c: 20 00 80 4e blr
which, with the local entry point, uses only one load, the same as the tcb proposal.
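
To check the arithmetic behind dropping the addi in the listings above, here is
a minimal C sketch. It assumes only that addis adds its immediate shifted left
by 16 and that the addi immediate and the load displacement are sign-extended
16-bit values. The helper names (effective_offset, fold_addi) are hypothetical
and are not part of glibc or gcc. It shows only that the folded displacement
reaches the same address and still fits the instruction encoding; it says
nothing about whether r2 remains a usable TOC pointer afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset from r12 reached by: addis r2,r12,hi ; addi r2,r2,lo ; load disp(r2).
   addis adds hi << 16; lo and disp are sign-extended 16-bit immediates. */
int64_t effective_offset(int32_t hi, int32_t lo, int32_t disp)
{
    return (int64_t)hi * 65536 + lo + disp;
}

/* Hypothetical helper: try to fold the addi into the load displacement.
   Returns 1 and stores the new displacement if it fits a signed 16-bit
   field (and is a multiple of 4 for DS-form loads such as ld), else 0. */
int fold_addi(int32_t lo, int32_t disp, int ds_form, int32_t *folded)
{
    assert(folded != NULL);

    int32_t d = lo + disp;
    if (d < -32768 || d > 32767)
        return 0;               /* does not fit the 16-bit displacement */
    if (ds_form && d % 4 != 0)
        return 0;               /* ld needs a word-aligned displacement */
    *folded = d;
    return 1;
}
```

For the first listing, fold_addi(30768, -32720, 1, &d) gives -1952 and
fold_addi(30768, -32712, 1, &d) gives -1944, matching the "new" sequence. For
the __hwcap example, fold_addi(30896, -32684, 0, &d) gives -1788, so the same
folding would apply to that lwz.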