This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug dynamic-link/19884] Discrepancy between documented and actual search path for shared libraries


https://sourceware.org/bugzilla/show_bug.cgi?id=19884

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carlos at redhat dot com

--- Comment #9 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Nathaniel J. Smith from comment #0)
> I was surprised recently to discover that if you have the following
> situation:
> 
> Files:
> 
>   a.out
>   A/A.so
>   A/libncurses.so.5
>   B/B.so
> 
> A/A.so has RUNPATH=$ORIGIN
> A/A.so is linked against (DT_NEEDED) libncurses.so.5
> 
> B/B.so has no RUNPATH set
> B/B.so is linked against (DT_NEEDED) libncurses.so.5
> 
> Execution flow:
> 
> run program a.out, which
> 1) dlopen A/A.so with RTLD_LOCAL
> 2) dlopen B/B.so with RTLD_LOCAL
> 
> Then: B/B.so will get loaded linked against A/libncurses.so.5, rather than
> the system libncurses.
> 
> OTOH, if a.out is modified to swap the order, so that it instead does:
> 1) dlopen B/B.so with RTLD_LOCAL
> 2) dlopen A/A.so with RTLD_LOCAL
> 
> then both A.so and B.so will end up linked against the system version of
> libncurses.so.5, and A's RUNPATH will be ignored.

This is as expected.

Only one copy of a library with a given SONAME may be loaded into the
in-process memory image in a given namespace at a time.

The first loaded copy of SONAME libncurses.so.5 will be used for all other
DT_NEEDED resolutions.

Even though the first loaded libncurses.so.5 won't be used for relocation and
symbol references (RTLD_LOCAL), the DT_NEEDED from the library itself will mean
that it has libncurses.so.5 added to its own search scope.
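
As a minimal sketch of the "one copy per SONAME per namespace" rule, the
following C helper requests the same SONAME twice and compares the handles.
The function name soname_resolves_to_single_object is hypothetical, and the
assumption that equal handles mean "same loaded object" holds for glibc in
practice but is not something this thread states. It only shows the
explicit dlopen case; the DT_NEEDED case from the report needs the A.so and
B.so files, which are not reproduced here.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if two dlopen() requests for the same
 * SONAME yield the same handle, 0 if they differ, -1 if loading fails. */
int soname_resolves_to_single_object(const char *soname)
{
    /* First request: the loader searches for the library and maps it. */
    void *first = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (first == NULL) {
        fprintf(stderr, "first dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* Second request: the SONAME is already present in this namespace,
     * so the loader is expected to reuse the existing object. */
    void *second = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (second == NULL) {
        fprintf(stderr, "second dlopen failed: %s\n", dlerror());
        dlclose(first);
        return -1;
    }

    int same = (first == second);

    /* Each successful dlopen() took a reference; drop both. */
    dlclose(second);
    dlclose(first);
    return same;
}
```

Calling soname_resolves_to_single_object("libm.so.6") on a glibc system
should report that both requests resolved to one object. On older glibc
versions the program may need to be linked with -ldl.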

> I find this behavior very surprising -- I expected that A.so's RUNPATH would
> be respected, so that A.so always linked to A/libncurses.so, while B.so's
> lack of RUNPATH would also be respected, so that B.so always linked to the
> system libncurses.so.5. 

The golden rule for an ELF link namespace: The first loaded library wins.

This is precisely the reason why you can LD_PRELOAD a new malloc, otherwise
your suggested "fix" would break using tcmalloc, jemalloc and other alternate
allocators.

> The reason I expected this is that (1) AFAICT, all
> available documentation says that what I expected to happen is what should
> have happened (e.g. ld.so(8) clearly documents the library search order, and
> doesn't say anything about this; Drepper's dsohowto.pdf AFAICT also seems to
> say that what I expected to happen is what should have happened), and (2)
> the actual behavior is very weird and undesirable (IMO).

If you want B.so to use a distinct libncurses.so.5 that means you want _two_
copies of the same potentially conflicting library in the same in-memory
process image, and that's dangerous.  It's dangerous because it means you can't
share ncurses data between A and B, and if you do, they will operate on
different instances of the ncurses library.

The only way to do what you want with more isolation is to use dlmopen, which
was designed for this purpose. However, today, dlmopen is not yet fully
supported in glibc, and it will take a while before it is. With dlmopen you
create a new link namespace, and loading B.so with dlmopen will search all over
again for libncurses.so.5 without using the one already present and pulled in
by A.so.
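
A rough sketch of the dlmopen approach, assuming a glibc system where
dlmopen works well enough for the library in question (as noted above, it
is not yet fully supported). The helper name loads_in_new_namespace is
hypothetical. It loads a SONAME once in the base namespace and once in a
fresh namespace created with LM_ID_NEWLM, then uses dlinfo with
RTLD_DI_LMID to confirm the two handles live in different namespaces.

```c
#define _GNU_SOURCE   /* dlmopen, dlinfo and Lmid_t are GNU extensions */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the SONAME was loaded as a separate
 * object in a new link namespace, 0 if not, -1 if a load or query fails. */
int loads_in_new_namespace(const char *soname)
{
    /* Ordinary load into the base namespace (LM_ID_BASE). */
    void *base = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (base == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* Load into a brand-new namespace; the search for the library and
     * its dependencies starts over instead of reusing loaded objects. */
    void *isolated = dlmopen(LM_ID_NEWLM, soname, RTLD_NOW | RTLD_LOCAL);
    if (isolated == NULL) {
        fprintf(stderr, "dlmopen failed: %s\n", dlerror());
        dlclose(base);
        return -1;
    }

    /* Ask the loader which namespace each handle belongs to. */
    Lmid_t base_id, isolated_id;
    int result = -1;
    if (dlinfo(base, RTLD_DI_LMID, &base_id) == 0 &&
        dlinfo(isolated, RTLD_DI_LMID, &isolated_id) == 0) {
        result = (base != isolated) && (base_id != isolated_id);
    }

    dlclose(isolated);
    dlclose(base);
    return result;
}
```

In the scenario from the report, B/B.so would be the argument to dlmopen,
so that its DT_NEEDED libncurses.so.5 is searched for again in the new
namespace. Note that the new namespace also gets its own copy of libc, so
data cannot be shared freely between the two sides.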

> In every other
> way, A.so and B.so are isolated from each other by being loaded with
> RTLD_LOCAL

That is not isolation.

And be careful that RTLD_LOCAL may be promoted to RTLD_GLOBAL if another dlopen
references A.so with RTLD_GLOBAL.
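
The promotion can be sketched as follows, assuming the simplest case where
the same library is reopened directly with RTLD_GLOBAL (a later dlopen of
some other object that pulls it in globally is not covered here). The
helper name symbol_global_after_promotion is hypothetical; it reports
whether a symbol from the library becomes visible through RTLD_DEFAULT
after the second dlopen.

```c
#define _GNU_SOURCE   /* RTLD_DEFAULT is guarded by _GNU_SOURCE in glibc */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if `symbol` is found in the global scope
 * after the library is reopened with RTLD_GLOBAL, 0 if not, -1 on failure. */
int symbol_global_after_promotion(const char *soname, const char *symbol)
{
    /* Step 1: load with RTLD_LOCAL; symbols stay out of the global scope. */
    void *local = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (local == NULL) {
        fprintf(stderr, "dlopen (local) failed: %s\n", dlerror());
        return -1;
    }
    /* This may already be 1 if the program itself links the library. */
    printf("visible after RTLD_LOCAL:  %d\n",
           dlsym(RTLD_DEFAULT, symbol) != NULL);

    /* Step 2: open the same library again, this time with RTLD_GLOBAL.
     * The already-loaded object is reused and its scope is promoted. */
    void *global = dlopen(soname, RTLD_NOW | RTLD_GLOBAL);
    if (global == NULL) {
        fprintf(stderr, "dlopen (global) failed: %s\n", dlerror());
        dlclose(local);
        return -1;
    }
    int visible = dlsym(RTLD_DEFAULT, symbol) != NULL;
    printf("visible after RTLD_GLOBAL: %d\n", visible);

    dlclose(global);
    dlclose(local);
    return visible;
}
```

For example, symbol_global_after_promotion("libm.so.6", "cos") loads libm
locally first and then promotes it, after which the symbol is expected to
resolve through the global scope.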

The only way to get isolation is via dlmopen.

> -- they get independent ELF scopes, and in particular
> LD_DEBUG=scopes seems to indicate that we actually end up with two different
> instances of libncurses.so.5 -- they're both loaded from the same file, but
> because they're loaded into different ELF scopes they might act differently.

This is not true. The scopes are just used for symbol resolution and relocation
information lookup. There is only one instance of the library loaded.

> (E.g., if A.so interposes some symbol in libncurses, then when A.so calls
> into its copy of libncurses then libncurses might end up calling back into
> A.so; but when B.so calls into its copy of libncurses then will never call
> back into A.so.)

Again this is not true.

You are in the single global link namespace.

If A.so interposes symbols during the relocation processing of libncurses.so.5,
then calls from B.so into libncurses.so.5 may eventually call functions in
A.so that were interposed.

The only solution you have is dlmopen (when we get it finished).

> OTOH, looking at elf/dl-load.c:_dl_map_object it seems like the current
> behavior might be intentional. At the very least this is a documentation bug
> -- Windows has somewhat similar behavior (if some DLL with a given basename
> has been loaded, then attempting to load another DLL with the same basename
> will short-circuit all the normal library searching and simply return the
> previously loaded DLL, even if it's no longer on the search path), but on
> Windows at least this is well documented.

I agree we need better documentation. Patches welcome for the glibc manual or
Linux kernel man pages.

> (The actual situation that led to the discovery of this issue is that we're
> trying to package up Python extensions into self-contained bundles that can
> run on many different linux systems, which involves vendoring libraries like
> libgfortran. And to our surprise we found that two independently distributed
> Python extensions that each had their own vendored copy of libgfortran can
> interfere with each other, or with locally-compiled Python extensions that
> expect to use the system libgfortran. See:
> https://mail.python.org/pipermail/wheel-builders/2016-March/000069.html)

Correct. You need dlmopen.

