This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.



Re: Unexec gurus?



Just for kicks, I thought I'd offer an explanation of *why*
initialized variables in dynamically linked libraries get overwritten
when you dump and re-exec.

Suppose you compile foo.c to a foo.o file, and foo.c references some
global int variable, var.  The compiler has no idea whether the definition of
var is going to come from some other .o file, or some shared
library.  So it emits completely straightforward code, which assumes
that the address of var is some constant, to be resolved at static
link time.  The .o file will contain appropriate relocs, telling the
static linker how to plug in var's address once it knows it.

Now, suppose it turns out that var lives in a shared library, which
provides an initialized value for it.  We won't know the shared
library's address until run-time, when the dynamic linker will load it
wherever it pleases.  But we can't go through the executable and patch
up all the references to var at run-time, because that would make the
executable unshareable, to some extent, which works against the whole
purpose of shared libraries --- to save memory.  And we can't make the
executable reference var through some indirection table, since we must
compile foo.o before we realize var is coming from a shared library.
So we're kind of stuck.

To get around this, the static linker does something kind of odd.  It
consults the shared library to find var's size (which it can find at
static link time), reserves that much space in the executable's data
segment for it, and resolves all the references to var to reference
that space.  Then it places a "copy reloc" for var in the executable's
list of dynamic relocs.

At run-time, the dynamic linker sees the copy reloc, and copies the
contents of var from the shared library (wherever it is) into
the reserved space.  Now all the references to var in the executable
are pointing to the right thing, and it's ready to go.  The copy of
var in the shared library is never used again.

For the sake of references to var in the shared lib itself, the
dynamic linker stores the newly initialized copy's address in the
global offset table or someplace like that; I forget.  The shared
library itself was compiled with the -fPIC flag, so any references to
var in the shared library will go through this table.

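So, in general, if you unexec an executable linked dynamically against
some shared library, any variable initialized in that shared library
will get re-initialized each time you run the dumped executable: the
dumped file still carries its copy relocs, and the dynamic linker
applies them again at every startup, clobbering whatever value you
dumped.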

It's exactly this sort of crap that made me so hesitant to support
unexec.  The idea of unexec is seductively simple, and it'll seem to
work, but it runs afoul of the modern run-time environment in so many
different ways, you're just asking for trouble.  And it's neverending:
today, it's this; tomorrow, you'll find something else.  It would be
much better if we could find some way to get instantaneous startup
that doesn't require us to violate the run-time's abstractions so
utterly.  That's my opinion.

Setting aside unexec completely, it's interesting to think about how
this whole copy reloc thing might fail.  Think about what happens when
you link an executable against a shared library, and then run it
against a newer shared library where var has become bigger...  Oops.
