This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] powerpc: New feature - HWCAP/HWCAP2 bits in the TCB


On Thu, Jul 09, 2015 at 04:31:17PM -0300, Adhemerval Zanella wrote:
> 
> 
> On 09-07-2015 16:02, Ondřej Bílka wrote:
> > On Tue, Jul 07, 2015 at 10:35:24AM -0500, Steven Munroe wrote:
> >> Not so simple on PowerISA as we don't have PC-relative addressing.
> >>
> >> 1) The global entry requires 2 instruction to establish the TOC/GOT
> >> 2) Medium model requires two instructions (fused) to load a pointer from
> >> the GOT.
> >> 3) Finally we can load the cached hwcap.
> >>
> >> None of this is required for the TP+offset.
> >>
> > And why didn't you write that when it was first suggested? When you don't answer,
> > it looks like you don't want to answer because that suggestion is better.
> > 
> > Here the problem isn't the lack of relative addressing but that you don't start
> > with the GOT in a register.
> > 
> > You certainly could do a similar hack as you do with the tcb and place the hwcap
> > bits just after that, so you could do just one load.
> > 
> > That you require so many instructions on powerpc is a gcc bug rather than the
> > rule. You don't need that many instructions when you place frequent
> > symbols in the -32768..32767 range. For example, here you could save one
> > addition.
> > 
> > int x, y;
> > int foo()
> > {
> >   return x + y;
> > }
> > 
> > original
> > 
> > 00000000000007d0 <foo>:
> >  7d0:	02 00 4c 3c 	addis   r2,r12,2
> >  7d4:	30 78 42 38 	addi    r2,r2,30768
> >  7d8:	00 00 00 60 	nop
> >  7dc:	30 80 42 e9 	ld      r10,-32720(r2)
> >  7e0:	00 00 00 60 	nop
> >  7e4:	38 80 22 e9 	ld      r9,-32712(r2)
> >  7e8:	00 00 6a 80 	lwz     r3,0(r10)
> >  7ec:	00 00 29 81 	lwz     r9,0(r9)
> >  7f0:	14 4a 63 7c 	add     r3,r3,r9
> >  7f4:	b4 07 63 7c 	extsw   r3,r3
> >  7f8:	20 00 80 4e 	blr
> > 
> > new
> > 
> >  	addis   r2,r12,2
> > 	ld      r10,-1952(r2)
> > 	ld      r9,-1944(r2)
> > 	lwz     r3,0(r10)
> > 	lwz     r9,0(r9)
> > 	add     r3,r3,r9
> > 	extsw   r3,r3
> > 	blr
> 
> No, you can't; you need to take into consideration that the powerpc64le ELFv2 ABI has two
> entry points for every function, global and local, with the former being used when
> you need to materialize the TOC, while with the latter you can use the same TOC. And the
> compiler has no information regarding this; it has to be decided by the linker.
>
Of course I can; reusing the TOC is not mandatory. That would just decrease
performance a bit for local calls.

You need the majority of calls to be from a different dso to use the global entry point.
Otherwise, if you use the local entry point, there is no reason to use the tcb, as a
hidden variable does the same job (and you could use the local entry point in
the plt of the same dso). An example that I previously mentioned is
compiled by

 gcc hw.c h.o -O3 -fPIC  -mcmodel=medium -shared

extern int __hwcap __attribute__ ((visibility ("hidden"))) ;
int foo(int x, int y)
{
  if (__hwcap)
    return x;
  else
    return y;
}

into

0000000000000750 <foo>:
 750:	02 00 4c 3c 	addis   r2,r12,2
 754:	b0 78 42 38 	addi    r2,r2,30896
 758:	00 00 00 60 	nop
 75c:	54 80 22 81 	lwz     r9,-32684(r2)
 760:	00 00 89 2f 	cmpwi   cr7,r9,0
 764:	20 00 9e 4c 	bnelr   cr7
 768:	78 23 83 7c 	mr      r3,r4
 76c:	20 00 80 4e 	blr

which with the local entry point uses only one load, the same as the tcb proposal.
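
For completeness, here is a rough sketch of how such a hidden variable could
be filled in. This is only an illustration under assumptions, not what the
patch or glibc does: the names cached_hwcap, init_cached_hwcap and
has_hwcap_bit are made up for this example, and the value is read with
getauxval (AT_HWCAP) from a constructor. The mask passed to has_hwcap_bit
would be one of the platform's PPC_FEATURE_* bits.

```c
#include <sys/auxv.h>   /* getauxval, AT_HWCAP (glibc 2.16 or later) */

/* Hypothetical cached copy of AT_HWCAP.  Hidden visibility keeps the
   symbol local to this dso, so it can be addressed relative to the TOC
   without going through a GOT pointer.  */
unsigned long cached_hwcap __attribute__ ((visibility ("hidden")));

/* Hypothetical initializer: runs once when the dso is loaded.  */
__attribute__ ((constructor))
static void
init_cached_hwcap (void)
{
  cached_hwcap = getauxval (AT_HWCAP);
}

/* Return 1 if any bit of MASK is set in the cached hwcap, else 0.  */
int
has_hwcap_bit (unsigned long mask)
{
  return (cached_hwcap & mask) != 0;
}

/* Same shape as the foo above, reading the cached value instead of an
   externally defined __hwcap.  */
int
foo (int x, int y)
{
  return cached_hwcap ? x : y;
}
```

Whether the test really compiles down to a single load still depends on the
code model and on which entry point the caller reaches, as discussed above.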

