This is the mail archive of the libc-hacker@cygnus.com mailing list for the glibc project.



more verbose reply to ld.so map problem


Several people are still not convinced that the currently implemented
solution for the ld.so map is correct, so I will try to explain it once
more.  Please keep in mind that I'm talking not only about the current
behaviour but also about what will come sometime soon.

The status quo is:

  ld.so exports no symbols besides various __* symbols (to implement
  _dl_open et al. in the libc itself) and malloc & friends.  The latter
  is necessary because _dl_catch_error etc. pass up malloc'ed strings
  which must be freed.  Ideally only this code uses malloc, and only
  when passing strings up to the libc, not when passing them around
  inside ld.so.


The differences to the old behaviour are:

- one can no longer replace functions used in the dynamic linker,
  neither via LD_PRELOAD nor via definitions in other shared objects.

- stub definitions in ld.so are not replaced by the real definitions in
  libc.so once they are available

- ld.so is faster (significantly for large binaries with lots of relocations)

- the behaviour now matches the behaviour of static binaries more
  closely.  ld.so is only a means to get things running.  It should be
  completely invisible.

Summarized:

  one loses the capability to exploit the internals of ld.so with
  some cool hacks, in exchange for a faster and more reliable loading
  process.


I would think even this is convincing: a rarely (if at all) used
feature which is not very portable requires that every program on the
system runs slower.  I think this is not the right way.  Especially
since I have already seen code using LD_PRELOAD which has to work
around the problem that ld.so might use the replaced functions.


But this is not all.  I think Mark wrote that `_open' is exported in
Solaris' ld.so.  Well, that might be, but this is not equivalent to
allowing these functions to be overridden.  Just to make sure I'm not
only babbling, I've compiled the following C code into a shared object
and preloaded it using LD_PRELOAD.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
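#include <unistd.h>	/* declares _exit */

/* Every replacement below either terminates the process or returns a
   wrong result, so any use by ld.so itself is immediately visible.  */
int
_open()
{
  _exit (1);
}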
int
open()
{
  _exit (1);
}
int
__open()
{
  _exit (1);
}
int
strcmp()
{
  return 1;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using truss you can see that the object is loaded and that there are
later open() calls.  You can also see that the preloaded definitions
have the desired effect for programs (not for ld.so).  But you will
also see that ld.so itself does *not* use any of these functions.
Here is a test program which shows the problem very clearly.  It fails
with the old implementation but succeeds on Solaris (I am, btw,
referring to Solaris 7).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <stdio.h>
#include <dlfcn.h>
int
main()
{
  printf ("hello world\n");
  dlopen("./z.so", RTLD_NOW);
  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Simply run (the first code is in y.c, the second in w.c):

	gcc -o y.so -shared -fPIC y.c -g -O
	cp y.so z.so
	gcc -o w w.c -g -O -ldl
	strace env LD_PRELOAD=./y.so ./w

[A sidenote: the Solaris implementation of ld.so is quite complicated
and more featureful than glibc's version.  At least for now.  E.g.,
they have something called "groups" in the ld.so.  The groups consist
of different lookup scopes and the LD_PRELOAD handling is tightly
coupled with it.  I'm about to implement this as well.]


Now some people argued that only the functions used in the runtime
lookup procedures have to be treated this way.  While this is true,
and therefore strcmp (I think the only library function used) must in
any case be hidden, the assumptions this makes about ld.so are too
strong.  Some weeks ago I mentioned a few extensions which Solaris 7
has and which I'll implement once I have the time.  One such feature
is delayed loading of shared objects.  This means that not all
dependencies of a binary are loaded immediately.  Instead, loading of
specially marked shared objects is delayed.  Once a symbol from such a
library is needed, the library is loaded and the program resumes as
usual.  This is also implemented in Irix 6 and, I think, in HP-UX 11.
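
[Illustration: the effect of delayed loading can be approximated by
hand with dlopen/dlsym.  This is only a sketch of the idea, not how
ld.so on Solaris or anywhere else implements it.  The names lazy_cos,
real_cos and load_count are made up for this example, and "libm.so.6"
assumes a glibc-based system.  On older glibc versions link with -ldl.]

```c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout; "libm.so.6" assumes a glibc system.  */
static void *lazy_handle;		/* NULL until the object is needed */
static double (*real_cos) (double);	/* resolved on first use */
static int load_count;			/* how often the object was loaded */

/* Stand-in for a function from a delay-loaded object: the object is
   opened only when the first symbol from it is needed.  */
double
lazy_cos (double x)
{
  if (real_cos == NULL)
    {
      lazy_handle = dlopen ("libm.so.6", RTLD_NOW);
      if (lazy_handle == NULL)
	{
	  fprintf (stderr, "delayed load failed: %s\n", dlerror ());
	  abort ();
	}
      real_cos = (double (*) (double)) dlsym (lazy_handle, "cos");
      if (real_cos == NULL)
	{
	  fprintf (stderr, "lookup failed: %s\n", dlerror ());
	  abort ();
	}
      ++load_count;
    }
  /* From here on the program resumes as usual.  */
  return real_cos (x);
}
```

The first call to lazy_cos pays for loading and lookup; every later
call goes straight to the resolved function.  A real implementation
does this inside ld.so, which is exactly why much more of the ld.so
code can run in the middle of normal program execution.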

The consequence for our problem is that the runtime relocation does
not use only a single function (namely strcmp).  Instead, all of the
code of ld.so can possibly be called (except for the startup code, of
course, but at that point no other shared objects are in use anyway).
Therefore, to avoid recursion and also deadlocks, no function in ld.so
may use a version from another library; no preloading is allowed.  But
this means that we don't have to use the PLT at all.
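
[Illustration: the recursion problem can be shown with a small
simulation.  Nothing here is ld.so code and all names (resolve_symbol,
cmp_hook, interposed_cmp, install_interposer) are invented for the
example.  The function pointer cmp_hook stands in for a PLT slot
through which the lookup code would call strcmp, and the interposed
function stands in for a replacement that needs a symbol lookup of its
own before it can do its work.]

```c
#include <string.h>

/* Simulation only: all names are hypothetical.  */
typedef int (*cmp_fn) (const char *, const char *);

static int in_resolver;			/* set while a lookup is running */
static int recursion_detected;		/* set when a lookup re-enters itself */
static cmp_fn cmp_hook = strcmp;	/* stands in for a PLT slot */

static const char *symtab[] = { "open", "strcmp", "malloc" };

/* Return the index of NAME in symtab, -1 if it is absent, and -2 if
   the lookup was entered while another lookup was still running.  */
int
resolve_symbol (const char *name)
{
  int result = -1;

  if (in_resolver)
    {
      recursion_detected = 1;
      return -2;
    }
  in_resolver = 1;
  for (size_t i = 0; i < sizeof symtab / sizeof symtab[0]; ++i)
    if (cmp_hook (symtab[i], name) == 0)
      {
	result = (int) i;
	break;
      }
  in_resolver = 0;
  return result;
}

/* A replacement comparison function living in another object: it
   needs a symbol resolved first, so it calls back into the lookup.  */
int
interposed_cmp (const char *a, const char *b)
{
  (void) resolve_symbol ("malloc");
  return strcmp (a, b);
}

/* Simulate preloading: redirect the "PLT slot" to the replacement.  */
void
install_interposer (void)
{
  cmp_hook = interposed_cmp;
}
```

With the default hook a lookup never re-enters itself.  After
install_interposer every comparison inside resolve_symbol calls back
into resolve_symbol.  The guard flag only reports this; without such a
guard the result would be unbounded recursion, or a deadlock if the
lookup held a lock.  Binding the call directly inside ld.so removes
the indirection and with it the problem.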


A last argument is that one would no longer be able to use, e.g., a
special strcmp copy which performs extra tests.  This argument is
bogus right from the start.  The reason why I haven't seen the
reported problem at all is that many function calls are inlined
anyhow.  It is really very wrong to assume and expect anything about
the libc implementation.

Well, this is at least true for the runtime version of libc.so.  I've
explained a few times already that I would like to have a special
debug version of the library.  This one would catch all kinds of
errors and would be an ideal debugging means.  But things like this do
not belong in the runtime library.


If someone still thinks that we have to change the behaviour, please
provide a real-world example of the possible use and a solution for
the problems mentioned above.

-- 
---------------.      drepper at gnu.org  ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com   `------------------------

