This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: Preventing preemption of 'protected' symbols in GNU ld 2.26 [aka should we revert the fix for 65248]


> As one of the strong advocates for the fix that was made to make
> protected visibility work correctly with data symbols, I'd like to
> explain why it was the right decision and why it matters. This whole
> process is really frustrating to me -- having invested a lot of effort
> into getting something important fixed, only to have people come
> trying to break it again -- but I'm going to try to be calm and not to
> snap at anybody.

Ironically, you've just described my feelings almost exactly, only I
come here finding that someone already broke it, and I'm trying to get
it fixed again.

With all due respect, I think you're misinterpreting what the
visibility feature was intended for, and you're projecting your own
needs onto everyone else that uses this feature. I can state with
first-hand knowledge that the intent was so that compilers could
optimize access to the protected data symbols based on an assumption
that they are "relatively nearby". Here's a quote from the gABI
proposal (the latest revision that I could find, dated April 16,
1999), which was submitted by Jim Dehnert, then at SGI:

"Optimization Note:

"The visibility semantics of these attributes allow various
optimizations. While care must be taken to maintain
position-independence and proper GOT usage for references to and
definitions of symbols which might be preempted by or referenced from
other components, these restrictions all allow references from the
same component to make stricter assumptions about the definitions.
References to protected symbols (and hence to hidden or internal
symbols) may be optimized by using absolute or PC-relative addresses
in executable files or by assuming addresses to be relatively nearby.
Internal functions (as defined in the MIPS ABI) do not normally
require gp establishment code even in psABIs requiring callee
establishment/restore of gp, because they will always be entered from
the same component with the correct gp already in place from the
caller."

Unfortunately, this optimization note didn't make it into the gABI,
probably because the editor felt it was unnecessary, and because it
contained some MIPS-specific details. Nevertheless, it clearly shows
the intent.
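
To make the optimization concrete, here is a minimal sketch of the kind of code the note is talking about. The names counter and bump_counter are hypothetical, and whether the direct access is actually emitted depends on the compiler, the target, and the flags in use, so treat this as an illustration rather than a statement about any particular toolchain.

```c
/* Sketch only: `counter` and `bump_counter` are hypothetical names,
   not taken from any real library. */

/* Exported from the component, but not preemptible: references made
   from inside the same component always bind to this definition. */
__attribute__((visibility("protected"))) int counter = 0;

/* Under the interpretation in the optimization note, a compiler
   building this as position-independent code may address `counter`
   PC-relatively here.  If protected data has to stay compatible with
   copy relocations, the access goes through the GOT instead. */
int bump_counter(void)
{
    counter += 1;
    return counter;
}
```

Compiling this into a shared library (for example with -O2 -fPIC -shared) and comparing the code generated for bump_counter is one way to see which interpretation a given toolchain follows.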

> From a programming standpoint, the semantics of protected visibility
> need to be free of arch-specific implementation details. Otherwise
> programmers can't use it without hard-coding arch-specific details,
> which for practical purposes, means good software can't use it at all.

It's unfortunate that copy relocations intrude on your stated goal,
but I'd prefer to fix that problem without breaking (and I do mean
"break") the original intent behind protected visibility.

> My original motivation for wanting protected visibility to "just work"
> was to be able to use:
>
>         #pragma GCC visibility push(protected)
>
> around the inclusion of a library's public headers when including them
> from the implementation files. This is far from being the only usage
> case, and I'll expand on more important usage cases below, but it is
> an important one because it allows you to eliminate all GOT/PLT cost
> of intra-library function calls without any fine-grained maintenance
> of which declarations to apply visibility to (and without any
> GNUC-specific clutter in the header files themselves).
>
> I understand that some people want protected visibility to avoid the
> GOT for data symbols too for the sake of performance, but for my usage
> case, the fact that the semantics were wrong for data symbols meant
> that my configure check for "does protected visibility work" would
> return "no", and the whole optimization would get turned off.

Yes, some people do want protected visibility to avoid the GOT for
data symbols, and it makes a significant difference in many cases. For
those people, the changes I'm objecting to cause a performance
regression.

> Anyway, let's move past optimization, because it's a distraction.

I disagree. Optimization isn't a distraction -- it's the whole
motivation for the feature.

> After all, with the old (broken) behavior of protected data, one
> _could_ work around the above problem and still get the performance
> benefits for functions without breaking data by explicitly declaring
> all data with default visibility. In fact, this is how I solve the
> problem in musl libc, where there are only a small number of data
> symbols that should be externally accessible, and maintaining a list
> of them is manageable:
>
> http://git.musl-libc.org/cgit/musl/tree/src/internal/vis.h?id=d1b29c2a54588401494c1a3ac7103c1e91c61fa1
>
> This is done for the sake of compatibility with a wide range of
> toolchains including ones with the old/broken behavior for protected
> data.

s/broken/correct/ :-)

Given that, now you can have efficient access to the symbols that
aren't externally accessible, and the symbols that are accessible are
marked correctly with default visibility, right? So what's the
problem? Why would you want to give up the efficient direct access to
the symbols that can remain protected?

It looks to me like you have a solution, and it's compatible with the
intent behind the feature.
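
As I understand that arrangement, it looks roughly like the sketch below. The identifiers lib_status and lib_set_status are hypothetical stand-ins, not the contents of the musl header linked above, and I'm assuming a GCC-compatible compiler that honors the visibility pragma.

```c
/* Sketch only: `lib_status` and `lib_set_status` are hypothetical. */

/* Everything declared from here on is protected unless it carries an
   explicit visibility attribute of its own. */
#pragma GCC visibility push(protected)

/* Data that main programs may reference directly (and so may be the
   target of a copy relocation) is explicitly kept at default
   visibility. */
__attribute__((visibility("default"))) int lib_status = 0;

/* Functions pick up protected visibility from the pragma, so calls
   between them inside the library need not go through the PLT. */
int lib_set_status(int value)
{
    lib_status = value;
    return lib_status;
}

#pragma GCC visibility pop
```

The only maintenance cost is the list of explicitly default-visibility data symbols; everything else gets the cheaper binding without per-declaration annotations.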

> The actual documented purpose of protected visibility is to prevent
> other definitions of a symbol from taking precedence over the one in
> the library itself.

Um, I'd say that's the documented *meaning* -- the *purpose* is to
enable compiler optimizations.

> For example, suppose you have the following
> situation: mainapp depends on libA which defines symbol foo with
> normal visibility, and libA depends on libB, which also defines foo,
> but intentionally with protected visibility so that libB always uses
> its own definition, not the one from libA. There is no reasonable way
> to obtain the desired semantics here without the current/correct
> behavior for protected data. Any other approaches I'm aware of would
> either allow libB to bind to the wrong definition of foo, or would
> prevent another main app which links libB (but not libA) from being
> able to use the symbol foo from libB.

If libA and libB both export symbol foo (protected or not), you're
playing games with the linker. This is not good programming practice.
It was never our intent to enable developers to play linker games.

At any rate, I fail to see how you get unexpected semantics either
way. With the old (correct to me) behavior, the compiler and linker
can bind the references from within libB to its own foo, and the
dynamic loader never has to get involved. With the new behavior, the
compiler and linker must leave the GOT-indirect accesses to foo, and
rely on the dynamic loader to properly resolve the reference (which
introduces quite a bit of additional complexity in the process).

In addition, libB probably ought to be making foo hidden, not protected.

> On the other hand, there are plenty of other ways to get the
> old/broken behavior if desired. The easiest is to simply use hidden
> visibility when you don't want the symbol to be accessible outside the
> library. If you _do_ want it to be visible to and usable by other
> shared libraries, just not main-programs, this can be achieved using a
> hidden alias for a default-visibility symbol:
>
>         int foo;
>         extern int fast_foo __attribute__((__alias__("foo"),
>                                            __visibility__("hidden")));

"Just not main-programs." If you're willing to ignore main programs,
the copy relocations don't matter either, and what was the issue
again? A hidden alias makes the problem even worse, because now the
linker can't even diagnose the error when it makes a copy relocation
-- you'll simply get one set of code using the copy, and another set
of code using the original (equally broken with old and new behavior).
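
A minimal sketch of that failure mode, built on the alias from the quoted example. The accessor functions are hypothetical additions of mine, and the alias attribute assumes an ELF target with a GCC-compatible compiler.

```c
/* Sketch only: the three accessor functions are hypothetical; `foo`
   and `fast_foo` follow the quoted example. */

int foo = 1;

/* Hidden alias: references to it always bind inside this library. */
extern int fast_foo __attribute__((__alias__("foo"),
                                   __visibility__("hidden")));

/* Library code using the fast path touches the library's own object. */
void write_via_alias(int value) { fast_foo = value; }
int read_via_alias(void) { return fast_foo; }

/* Code using the default-visibility name goes through the GOT in a
   shared library.  If the main program has a copy relocation for
   `foo`, this reads the program's copy, not the object written by
   write_via_alias(). */
int read_via_default(void) { return foo; }
```

Linked into a single component, the two names refer to the same object and everything agrees. The divergence only appears when this is a shared library and the executable carries a copy relocation for foo, and since the alias is hidden there is nothing for the linker to warn about.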

> I expressed that here explicitly for the sake of clarity but of course
> in practice people use things like the glibc macro-maze to do this
> kind of binding to hidden aliases.

But why go through that, when protected visibility does (used to do)
exactly the right thing?

-cary

