This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: powerpc __tls_get_addr call optimization


On 03/20/2015 03:55 AM, Alan Modra wrote:
> On Thu, Mar 19, 2015 at 11:33:16PM -0400, Carlos O'Donell wrote:
>> On 03/18/2015 10:56 PM, Alan Modra wrote:
>>> On Wed, Mar 18, 2015 at 01:07:32PM -0400, Carlos O'Donell wrote:
>>>> On 03/18/2015 02:11 AM, Alan Modra wrote:
>>>>> Now that Alex's fixes for static TLS have gone in, I figure it's worth
>>>>> revisiting an old patch of mine.
>>>>> https://sourceware.org/ml/libc-alpha/2009-03/msg00053.html
>>>>
>>>> I'm not against this patch, but it certainly seems like you would be
>>>> better served by just implementing tls descriptors?
>>>
>>> I think this is one better than tls descriptors, because powerpc
>>> avoids the indirect function call used by tls descriptors.
>>
>> You mean to say it is "faster" than tls descriptors, but at the same
> 
> To be honest, there isn't much difference in the optimized case where
> static TLS is available.  It boils down to an indirect call to a
> function that loads one value vs. a direct call to a stub that loads
> two values and compares one against zero.  I think what I've
> implemented is slightly better for PowerPC, but whether that would
> carry over to other architectures is debatable.

I agree that what you have implemented is faster for power.
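For anyone following along, the fast path described above (load two values, compare one against zero) can be sketched in C roughly as follows. This is only an illustration, not the actual stub: the real one is a few instructions emitted by the linker. The names tls_index_sketch, tls_get_addr_opt_sketch and slow_tls_get_addr are made up for this example, and it assumes a zero in the module word is what marks an entry as already resolved to static TLS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two-word GOT entry handed to __tls_get_addr.  The field names follow
   the usual tls_index convention; the type itself is a stand-in.  */
typedef struct {
    uintptr_t ti_module;  /* module ID; 0 is assumed to mean "static TLS" */
    uintptr_t ti_offset;  /* offset in the module's block, or from the
                             thread pointer on the fast path */
} tls_index_sketch;

/* Hypothetical slow path standing in for the real __tls_get_addr.  */
static int slow_path_calls;
static char fake_dynamic_block[64];

static void *slow_tls_get_addr(const tls_index_sketch *ti)
{
    slow_path_calls++;
    assert(ti->ti_offset < sizeof fake_dynamic_block);
    return fake_dynamic_block + ti->ti_offset;
}

/* What the stub does: load both words, test the first against zero.  */
static void *tls_get_addr_opt_sketch(const tls_index_sketch *ti,
                                     char *thread_pointer)
{
    if (ti->ti_module == 0)
        return thread_pointer + ti->ti_offset;  /* no call needed */
    return slow_tls_get_addr(ti);               /* ordinary lookup */
}
```

With an entry of {0, 16} the function returns thread_pointer + 16 without touching the slow path; any non-zero module ID goes through slow_tls_get_addr as before.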

>> time "harder" to maintain because it's a custom implementation that
>> anyone debugging glibc has to learn about. That's not a bad thing,
>> I just want us all to acknowledge the tradeoff.
> 
> Well, yes, but the PowerPC implementation is all in dl-machine.h, and
> looks very similar to x86_64 in use of CHECK_STATIC_TLS,
> TRY_STATIC_TLS and modification of the tls_index entry.  PowerPC
> doesn't have the complication and potential failure of allocating
> extended descriptors.  We also don't need to pass extra flags to gcc
> to enable the optimization.

I also agree that your implementation mirrors TLS DESC in its use
of CHECK_STATIC_TLS/TRY_STATIC_TLS, and I like that aspect of the
change.

>> The present goal for glibc and the toolchain in general has been
>> to move to TLS descriptors, and thus provide a way for the dozen or
>> so packages in the distribution to stop doing this:
>>
>> mesa (src/mapi/u_current.h):
>>
>> extern __thread struct mapi_table *u_current_table
>>     __attribute__((tls_model("initial-exec")));
>>
>> They would instead use TLS descriptors, and the above markings would
>> be removed and the access would be as fast as possible without needing
>> to specify the IE model.
>>
>> These packages are sometimes linked with applications, and sometimes
>> arbitrarily dlopened.
>>
>> Would this present optimization you propose for power support this
>> use case?
> 
> Sure.  This is exactly the use case the powerpc optimization tackles,
> shared libraries using general dynamic or local dynamic TLS access.
> Like TLS descriptors, it can also handle general dynamic or local
> dynamic TLS access in an executable, but these will normally be
> optimized to IE or LE by GNU ld.

Perfect, just making sure we were on the same page. After reading
the binutils patch, I figured this operates mostly like TLS DESC, but
slightly optimized for power.

>> Would it use static TLS for the above access if it could and fall
>> back gracefully if it can't?
> 
> Yes.

Good. I expected that it would simply degenerate to a call to
__tls_get_addr if it can't get static TLS space.
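To make the fallback concrete, here is a rough C sketch of the relocation-time decision as I understand it. It is not the dl-machine.h code: static_tls_pool, module_sketch, try_static_tls_sketch and relocate_tls_index_sketch are all hypothetical names, and the bookkeeping is deliberately simplified. The assumption is that when static TLS space can be reserved the entry is rewritten with a zero module word and a thread-pointer-relative offset, and otherwise it is left in the ordinary form so every access goes through __tls_get_addr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t ti_module;  /* 0 is assumed to mean "resolved to static TLS" */
    uintptr_t ti_offset;
} tls_index_sketch;

/* Hypothetical record of the static TLS space still available.  */
typedef struct {
    size_t used;
    size_t capacity;
} static_tls_pool;

/* Hypothetical per-module TLS bookkeeping.  */
typedef struct {
    uintptr_t modid;
    size_t blocksize;
    bool has_static_offset;
    uintptr_t static_offset;
} module_sketch;

/* Stand-in for TRY_STATIC_TLS: reserve space if any is left.  */
static bool try_static_tls_sketch(static_tls_pool *pool, module_sketch *m)
{
    assert(pool->used <= pool->capacity);
    if (m->has_static_offset)
        return true;
    if (pool->capacity - pool->used < m->blocksize)
        return false;
    m->static_offset = pool->used;
    pool->used += m->blocksize;
    m->has_static_offset = true;
    return true;
}

/* Fill in one GOT entry for a symbol at sym_offset in module m.  */
static void relocate_tls_index_sketch(tls_index_sketch *ti,
                                      static_tls_pool *pool,
                                      module_sketch *m,
                                      uintptr_t sym_offset)
{
    if (try_static_tls_sketch(pool, m)) {
        ti->ti_module = 0;                        /* fast path marker */
        ti->ti_offset = m->static_offset + sym_offset;
    } else {
        ti->ti_module = m->modid;                 /* graceful fallback */
        ti->ti_offset = sym_offset;
    }
}
```

A module that fits in the pool gets the zero marker; one that does not keeps its module ID and offset, so nothing fails, it is just slower.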

>> What I want to make sure is that Power isn't left behind when we
>> eventually transition everyone else to TLS Descriptors and remove
>> the above markings from source programs.
> 
> Other architectures left behind by the PowerPC implementation might
> like to transition from TLS descriptors.  Just kidding.  :)

Given your answers above I'm happy to see this go into glibc.

The patch itself looks fine to me. The real magic is in binutils,
with yet another super-secret stub that has no debug information
and must be recognized from memory by the person doing the debugging :}

Cheers,
Carlos.

