This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] powerpc: New feature - HWCAP/HWCAP2 bits in the TCB


On Fri, Jul 03, 2015 at 09:15:36AM -0400, Carlos O'Donell wrote:
> On 07/03/2015 04:55 AM, Ondřej Bílka wrote:
> >> At the end of the day it's up to IBM to make the best use of the
> >> tp+offset data stored in the TCB, but every byte you save is another
> >> byte you can use later for something else.
> >>
> > Carlos, a problem with this patch is that they ignored community
> > feedback. Early in this thread Florian came up with a better idea: use
> > GOT+offset, which could be accessed as
> > &hwcap_hack and avoids per-thread runtime overhead.
> 
> Steven and Carlos have not ignored the community feedback, they just
> have a different set of priorities and requirements. There is little
> to discuss if your priorities and requirements are different.
> 
> The use of tp+offset data is indeed a scarce resource that should be
> used only when absolutely necessary or when the use case dictates.
> 
> It is my opinion as a developer, that Carlos' patch is flawed because
> it uses a finite resource, namely tp+offset data, for what I perceive
> to be a flawed design pattern that as a free software developer I don't
> want to encourage. These are not entirely technical arguments though,
> they are subjective and based on my desire to educate and mentor developers
> who write such code. I don't present these arguments as sustained
> opposition to the patch because they are not technical and Carlos
> has a need to accelerate this use case today.
> 
> I have only a few substantive technical issues with the patch. Given
> that the ABI allocates a large block of tp+offset data, I think it is
> OK for IBM to use the data in this way. For example I think it is much
> much more serious that such a built application will likely just crash
> when run with an older glibc. This is a distribution maintenance issue
> that I can't ignore and I'd like to see it solved by a dependency on a
> versioned dummy symbol.
> 
> Lastly, the symbol address hack is an incomplete solution because Florian
> has not provided an implementation. Depending on the implementation it
> may require a new relocation, and that is potentially more costly to the
> program startup than the present process for filling in HWCAP/HWCAP2.

That's a valid concern. My idea was to check whether the hwcap_hack
relocation exists. I didn't realize that the cost scales with the number
of libraries.

One of the reasons I didn't like this proposal is that it harms the Linux
ecosystem: it increases the startup cost of nearly everything a bit, while
it's unlikely that cross-platform projects will use it.

But this could be done without much help from us. We need to keep these
writable to support this hack. I don't know the exact assembly for powerpc,
but it should be similar to how you do it on x86-64:

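int x;

int *foo (void)
{
#ifdef SHARED
  /* Take the address of x's GOT slot and store a byte into the slot.  */
  asm volatile ("lea x@GOTPCREL(%%rip), %%rax; movb $32, (%%rax)"
                : : : "rax", "memory");
#else
  /* No GOT here: store the byte directly into x.  */
  asm volatile ("lea x(%%rip), %%rax; movb $32, (%%rax)"
                : : : "rax", "memory");
#endif
  return &x;
}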


> Without a concrete implementation I can't comment on one or the other.
> It is in my opinion overly harsh to force IBM to go implement this new
> feature. They have space in the TCB per the ABI and may use it for their
> needs. I think the community should investigate symbol address munging
> as a method for storing data in addresses and make a generic API from it,
> likewise I think the community should investigate standardizing tp+offset
> data access behind a set of accessor macros and normalizing the usage
> across the 5 or 6 architectures that use it.
>
I would like that, as with access to it I could improve the performance
of several inlines.

 
> > Also, I now have an additional comment on the API: if you want faster
> > checks, wouldn't it be faster to save each bit of hwcap into a byte field
> > so you could avoid using a mask at each check?
> 
> That is an *excellent* suggestion, and exactly the type of technical
> feedback that we should be giving IBM, and Carlos can confirm if they've
> tried such "unpacking" of the bits into byte fields. Such unpacking is
> common in other machine implementations.
>
Also, with unpacking, doing this in userspace becomes more attractive, so
we don't have to copy 64 bytes for each thread.
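
To make the unpacking idea concrete, here is a minimal sketch in C. It
assumes a single 64-bit hwcap word expanded into a 64-byte table, one byte
per bit. All names (hwcap_bytes, unpack_hwcap, FEATURE_EXAMPLE_BIT, and so
on) are hypothetical and only for illustration; this is not what the patch
does, and the bit number is not a real HWCAP value.

```c
#include <assert.h>
#include <stdint.h>

#define HWCAP_BITS 64

/* Hypothetical feature bit, for illustration only.  */
#define FEATURE_EXAMPLE_BIT 5

/* One byte per capability bit: 64 bytes in total, shared by all threads.  */
static unsigned char hwcap_bytes[HWCAP_BITS];

/* Expand each bit of a packed hwcap word into its own byte (0 or 1).  */
static void unpack_hwcap (uint64_t hwcap)
{
  for (int i = 0; i < HWCAP_BITS; i++)
    hwcap_bytes[i] = (unsigned char) ((hwcap >> i) & 1);
}

/* Packed check: the caller loads the word, then we apply a mask.  */
static int has_feature_packed (uint64_t hwcap, int bit)
{
  assert (bit >= 0 && bit < HWCAP_BITS);
  return (hwcap & (UINT64_C (1) << bit)) != 0;
}

/* Unpacked check: a single byte load, no mask needed.  */
static int has_feature_unpacked (int bit)
{
  assert (bit >= 0 && bit < HWCAP_BITS);
  return hwcap_bytes[bit];
}
```

A program would call unpack_hwcap once at startup and then test
has_feature_unpacked (FEATURE_EXAMPLE_BIT) on the hot path. Whether the
byte load actually beats load-plus-mask on a given machine would have to be
measured.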
 
> Cheers,
> Carlos.

