This is the mail archive of the
mailing list for the glibc project.
Speeding up the dynamic linker with 100s of DSOs?
- From: Andrew Chatham <chatham at google dot com>
- To: libc-alpha at sources dot redhat dot com
- Date: Wed, 4 Jan 2006 13:31:31 -0800
- Subject: Speeding up the dynamic linker with 100s of DSOs?
We have some binaries that get linked with an absurd number of dynamic
shared objects (up to 700 or so), and it can take a very long time for
these programs to execute. It can take 30 seconds just to get into
main(), because it takes so long to resolve the global constructors.
Each symbol resolution goes through the list of DSOs, testing each
hash table, until it resolves the symbol.
We could improve this in our code by making more symbols static or by
collapsing the DSOs, but those are difficult to push through, so we've
been looking at the dynamic linker. It sounds like other people have
had similar problems, too.
One of my colleagues, Michael Chastain, tried increasing the number of
buckets in the ELF hash table in the DSOs, and that improved things
somewhat. Then I tried computing bloom filters for the DSOs as they
were loaded by the dynamic linker, which seemed to be speed things up
a bit more.
At this point, it takes 7 seconds to reach main() instead of 30, which
still isn't so great, but it's a start. At this point, my patch is
messy and leaks memory and may not work with dlopen, and we haven't
gotten our copyright agreement sorted out yet, so I can't share the
code right now. But I wanted to vet some ideas before continuing with
Even with the bloom filters, resolving a symbol is still O(n) in the
number of DSOs; I just improved the constant factor. I had some ideas
for speeding up resolution against linked-in libraries (we'd just
handle dlopened DSOs the slow way), but I didn't figure out when the
dynamic linker is done loading those. If I had that, I could either
build some larger filters to allow for a binary search or just build
one big hash table over all linked-in DSOs and search against that.
Do any of these approaches sound reasonable? Separate bloom filters
per DSO (safe, almost done)? Hierarchical filters to allow binary
search? One big hash table?