This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: RFC: Treat RTLD_GLOBAL as unique to namespace when used with dlmopen


Hi all,

I'll add some further details to Carlos's points, plus some 
observations from testing on Solaris.

On 07/16/2015 06:43 AM, Carlos O'Donell wrote:
> Michael Kerrisk and I are working on a man page for dlmopen.
> 
> I have a question, and a proposal for the community.
> 
> We do not allow dlmopen to use RTLD_GLOBAL. Was this really
> intended or simply a QoI issue?

Well, the API comes from Solaris, but the glibc implementation 
does not follow Solaris behavior.

> Without RTLD_GLOBAL support in dlmopen it means that
> the newly loaded DSO in the given namespace is always RTLD_LOCAL.
> This seems wrong since it means no DSO loaded via dlmopen can
> be used to provide symbols to subsequently dlmopen'd DSOs in the
> same namespace?

Exactly. 

> Therefore dlmopen at present serves only as a limited way to
> load one library in an isolated namespace along with all of
> the dependent (DT_NEEDED) libraries. It would seem to me that
> RTLD_LOCAL already provides this functionality with the exception
> that such a DSO may get promoted to RTLD_GLOBAL if future dlopen
> calls load a DSO RTLD_GLOBAL that has an implicit dependency
> on the RTLD_LOCAL DSO (DT_NEEDED). In this case the DSO loaded
> RTLD_LOCAL is promoted to RTLD_GLOBAL to resolve the dependencies.
> This breaks the RTLD_LOCAL isolation, and is one of the benefits
> of loading a DSO with dlmopen since at least *that* copy will
> never be promoted to RTLD_GLOBAL.

Correct. And this is not how things work on Solaris.

> The clever developer says "No problem, I will dlmopen a stub
> that dlopen's my library with RTLD_GLOBAL" under the impression
> that global search list is unique per namespace. One expects
> this allows the dlmopen'd stub to load several conjoined plugin
> DSOs into the new namespace, having them resolve their symbols
> against each other in an isolated way. This fails immediately
> with a SIGSEGV (see Bug 18684[1]).

This is precisely the use case the Solaris dlmopen() does support:
isolation of load namespaces, while allowing DSOs inside a namespace
to share symbols via RTLD_GLOBAL.
> 
> This trick fails for the same reason that calling dlmopen
> with RTLD_GLOBAL would fail if you removed the check in dlfcn/dlmopen.c
> (dlmopen_doit). When you go to add the DSO to the global
> search list you find there is no search list setup. In the case of
> the application we have rtld setup the global search list.
> 
> Which raises the question: what should the global search list
> be for a new namespace? I propose that the global search
> list for a new namespace should be a copy of the symbol search
> list (scope) of the first DSO loaded into the namespace with
> RTLD_GLOBAL, and subsequent RTLD_GLOBAL loads into the namespace
> add to that list.

The above is what Solaris appears to provide.
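
To make the proposed semantics concrete, here is a toy model in plain C.
It is only a sketch, not what glibc or Solaris actually implement: the
names ns_model, model_load() and model_visible() and the fixed limits are
invented for illustration, and it tracks object names rather than real
link maps or the full scope of the first DSO.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NS   4   /* hypothetical limits for this model */
#define MAX_OBJS 8

/* Toy model only: one global search list per namespace. */
struct ns_model {
    const char *global[MAX_OBJS];  /* objects visible to global lookups */
    size_t nglobal;
};

static struct ns_model namespaces[MAX_NS];

/* Record an object as loaded into namespace `ns`. Only loads made with
   RTLD_GLOBAL-like semantics join that namespace's global list.
   Returns 0 on success, -1 on a bad namespace or a full list. */
int model_load(int ns, const char *name, int is_global)
{
    if (ns < 0 || ns >= MAX_NS)
        return -1;
    if (!is_global)
        return 0;   /* local objects never join the global list */
    struct ns_model *m = &namespaces[ns];
    if (m->nglobal == MAX_OBJS)
        return -1;
    m->global[m->nglobal++] = name;
    return 0;
}

/* Global lookup from inside namespace `ns`: consults only that
   namespace's list, never the list of another namespace. */
int model_visible(int ns, const char *name)
{
    if (ns < 0 || ns >= MAX_NS)
        return 0;
    for (size_t i = 0; i < namespaces[ns].nglobal; i++)
        if (strcmp(namespaces[ns].global[i], name) == 0)
            return 1;
    return 0;
}
```

In this model, an object loaded globally into namespace 1 is visible to
lookups made from namespace 1 and invisible from namespace 0, which is the
isolation property the proposal is after.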

> The Solaris documentation is silent on exactly what should happen
> in this case. 

Yes, but notably the Solaris documentation does not explicitly
prohibit the use of RTLD_GLOBAL with dlmopen(). The Solaris
documentation says:

     The dlmopen() function is identical to dlopen(), except that
     an identifying link-map ID (lmid) is provided. This link-map
     ID informs the dynamic linking facilities upon  which  link-
     map  list  to  load  the  object.

> Since an alternate interpretation could be: All objects,
> regardless of namespace (link map list) loaded with RTLD_GLOBAL are
> available for symbol resolution for any objects. In which case
> dlmopen with RTLD_GLOBAL makes no sense, other than perhaps symmetry
> with dlopen, because the namespace isolation is lost. This still doesn't
> solve the most compelling use case of an isolated set of dlmopen/dlopen
> plugins with their own global search list.

And, in my testing, the above is *not* what Solaris does.

> The proposed interpretation of RTLD_GLOBAL for dlmopen would allow:
> 
> * Use dlmopen with RTLD_GLOBAL, making the symbols of the first
>   object loaded into the namespace immediately available to
>   subsequent DSOs loaded in constructors or other dlopen implicitly
>   into the namespace.
> 
> * Use dlopen RTLD_GLOBAL to make symbols available for resolution
>   only within the namespace the caller was in.
> 
> * Allows complete isolation of a group of dependent DSOs, either
>   via DT_NEEDED dependencies or via dlopen or subsequent dlmopen.
>   This isolation allows plugin virtualization via dlmopen.

The above is what Solaris seems to provide.

> Attached is a patch that fixes this for master. I still need to write
> something like a dozen tests to show that this works as expected in
> all the cases, but so far every test I've written works and doesn't
> regress anything.

I've not yet had a chance to test this patch. Carlos, you may wish
to try my code examples, and check how things look compared to Solaris.

I note one other deviation from Solaris. The dlopen() man page
currently says:

       If filename is NULL, then the returned handle is  for
       the  main  program.

And this is what glibc currently does *regardless* of the namespace
from which the dlopen(NULL, flags) call is made. But, in the context
of dlmopen(LM_ID_NEWLM) namespaces, I'd expect this call to return 
something like "the root of this namespace". And that is what
Solaris appears to do.
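
A minimal sketch of one way to check this, assuming glibc's dlinfo() with
the RTLD_DI_LMID request. The helper names handle_lmid() and
report_null_handle() are made up for illustration and are not necessarily
what the attached test program uses.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* Store the namespace (link-map list) ID of `handle` in *lmid.
   Returns 0 on success, -1 if dlinfo() fails. */
int handle_lmid(void *handle, Lmid_t *lmid)
{
    return dlinfo(handle, RTLD_DI_LMID, lmid) == 0 ? 0 : -1;
}

/* Report which namespace dlopen(NULL) resolves to for the caller.
   `who` is just a label for the log line. */
int report_null_handle(const char *who)
{
    void *h = dlopen(NULL, RTLD_NOW);
    Lmid_t lmid;

    if (h == NULL || handle_lmid(h, &lmid) != 0) {
        const char *err = dlerror();
        fprintf(stderr, "%s: %s\n", who, err ? err : "unknown error");
        return -1;
    }
    printf("%s: lmid from dlopen(NULL) is %ld (handle = %p)\n",
           who, (long) lmid, h);
    return 0;
}
```

Calling report_null_handle() once from the main program and once from a
function inside a library loaded with dlmopen(LM_ID_NEWLM, ...) should
show whether the two calls report the same lmid. Older glibc releases may
need -ldl when linking.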

> Obviously not for 2.22, but 2.23 material, along with Michael's
> new dlmopen/dlinfo man pages we should be ready to help developers
> use such a feature more extensively. At present I find almost no
> code using dlmopen in userspace because it has languished as an
> unsupported undocumented feature (Bug 15971, Bug 15271, and Bug 15134
> all need fixing).

I would say "... because it currently serves no useful purpose".
dlmopen() seems to have been added to Solaris to support
precisely the use cases that Carlos describes, and the glibc
implementation doesn't support those cases at all.

The attached tarball contains a short build script that creates a few
shared libraries from (mostly) simple (and commented) source files.

The overall structure is as follows:

    main():

        1. Loads libabc.so with either dlmopen() or dlopen() and 
           with either RTLD_GLOBAL or RTLD_LOCAL, depending on the 
           command-line arguments. If no arguments are provided, the 
           default is dlmopen(..., RTLD_GLOBAL);

        2. Invokes abc_start() in libabc.so

    abc_start():
        1. Loads some other shared libraries using different
           combinations of dlmopen() and RTLD_GLOBAL vs RTLD_LOCAL.

        2. Invokes a function qrs_start() in the libqrs.so
           library.

    qrs_start():
        Looks up (dlsym()) various symbols in the other shared
        libraries and reports on success or failure of the lookups.

    main():
        Control eventually returns to main(), and it then looks up
        some of the same symbols as qrs_start() and reports on
        success or failure of the lookups.    
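
For readers without the tarball, the loading and lookup steps could be
sketched roughly as follows. This is not the attached code: load_lib() and
lookup() are hypothetical helpers, and doing the lookup through a
dlopen(NULL) handle is an assumption about how such a test might be
written.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* Load `path` either into a fresh namespace (dlmopen + LM_ID_NEWLM) or
   into the caller's namespace (dlopen), with RTLD_GLOBAL or RTLD_LOCAL.
   Returns the handle, or NULL after logging the dlerror() text. */
void *load_lib(const char *path, int new_namespace, int global)
{
    int mode = RTLD_NOW | (global ? RTLD_GLOBAL : RTLD_LOCAL);
    void *h = new_namespace ? dlmopen(LM_ID_NEWLM, path, mode)
                            : dlopen(path, mode);

    if (h == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "load of %s failed: %s\n",
                path, err ? err : "unknown error");
    }
    return h;
}

/* Look up `sym` through the handle that dlopen(NULL) gives the caller
   and log the result. Returns 1 if the symbol was found, else 0. */
int lookup(const char *who, const char *sym)
{
    void *h = dlopen(NULL, RTLD_NOW);
    int found = h != NULL && dlsym(h, sym) != NULL;

    printf("%s: lookup of \"%s\" %s\n",
           who, sym, found ? "succeeded" : "failed");
    if (h != NULL)
        dlclose(h);
    return found;
}
```

A caller would use something like load_lib("./libabc.so", 1, 1) for the
default dlmopen(..., RTLD_GLOBAL) case. Note that on current glibc the
dlmopen() call with RTLD_GLOBAL is rejected, which is the behavior under
discussion in this thread.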

The program produces log messages that should make the results 
reasonably easy to interpret. Annotated output from a sample
run follows.

---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---
$ uname -a
SunOS login 5.10 Generic_150400-17 sun4v sparc SUNW,SPARC-Enterprise-T5220
$ sh build.sh && ./main
main(): lmid from dlopen(NULL) is 0 (handle = 0xff3634d8)
main(): dlmopen LM_ID_NEWLM ./libabc.so   RTLD_GLOBAL
main(): lmid from dlopen("libabc.so") is -13222656 (handle = 0xff371560)
main(): invoking abc_start()
    Called abc_start()
# Note in next line that dlopen(NULL) gave us back a handle for something
# other than initial NS. Linux differs on this point.
    abc_start(): lmid from dlopen(NULL) is -13222656 (handle = 0xff173690)
    abc_start(): dlmopen LM_ID_BASE  ./libdef.so   RTLD_GLOBAL
    abc_start(): dlopen              ./libjkl.so   RTLD_GLOBAL
    abc_start(): dlopen              ./libmno.so   RTLD_LOCAL
    abc_start(): dlopen              ./libqrs.so   RTLD_LOCAL
    abc_start(): invoking qrs_start()
        Called qrs_start()
        qrs_start(): lmid from dlopen(NULL) is -13222656 (handle = 0xff173690)
        qrs_start(): lookup of "abc" succeeded   # In this NS, with 
        qrs_start(): lookup of "def" failed      # Was loaded into initial NS
        qrs_start(): lookup of "jkl" succeeded
        qrs_start(): lookup of "mno" failed      # Was loaded with RTLD_LOCAL
        qrs_start(): lookup of "main" failed     # Is in initial NS
# Now do some lookups from initial NS
main(): lookup of "abc" failed                   # In another NS
main(): lookup of "def" succeeded                # Was loaded into initial NS
main(): lookup of "jkl" failed                   # In another NS
main(): lookup of "mno" failed                   # In another NS (+ RTLD_LOCAL)
---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Attachment: dlmopen_expt.tar.gz
Description: application/gzip

