This is the mail archive of the mailing list for the glibc project.


Re: Piecemeal library loading causes slow startup of big apps

On Tue, 2005-09-13 at 18:26 +0200, Lorenzo Colitti wrote:
> Hi,
> as my google SoC project I have been working on improving GNOME startup 
> time, and I see that dynamic linking is one of the culprits.
> GNOME startup is mainly I/O bound, i.e. most of the time is spent 
> waiting for disk seeks. Proof-of-concept work I have done has reduced 
> the disk seeks caused by GNOME itself, but now I have reached the point 
> that most of the disk seeks are caused by loading dynamic libraries.
> This is because libraries are not loaded immediately in one big 
> sequential read, but in bits and pieces. (I think this is because the 
> dynamic linker mmap()s the library and only page faults the bits it 
> needs into RAM.) For example, gtk+ (~9MB) is loaded piecemeal in about 
> 30 separate out-of-order reads:

How much of GTK+ is actually loaded overall in those 30 reads total?

(I think that 9MB is including debug info ... GTK+ is more like
3MB of code.)

> > (gdm-binary/3150): /usr/local/gnome/lib/ 0-7
> > (gdm-binary/3150): /usr/local/gnome/lib/ 687-718
> > (gdm-binary/3150): /usr/local/gnome/lib/ 653-684
> > (gdm-binary/3150): /usr/local/gnome/lib/ 34-65
> > (gdm-binary/3150): /usr/local/gnome/lib/ 8-33
> > [...]
> > (battstat-applet/4143): /usr/local/gnome/lib/ 447-475
> These are real disk reads traced by hooking into the ext3 block read 
> function using a kernel patch. The format is:
> (process/pid): filename start_4k_block-end_4k_block
> This way of loading libraries visibly hurts performance. If I cat the 
> most frequently-used libraries to /dev/null early in the startup 
> process, I can shave about 10% (~2s) off startup time: reading the 
> libraries puts them in the buffer cache, and when the linker mmaps them 
> it doesn't end up causing seeks.  This is obviously a hack, but I think 
> the process could be made a lot smarter than this.
> For example, would LD_BIND_NOW help me (I suspect not)? Is there a 
> compile-time hint that can tell the linker to load the whole library 
> using read() instead of mmap()? If not, could it be implemented?
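
One way to answer the "how much is actually loaded" question is to total the block ranges in the trace. The following is a minimal sketch in Python, assuming the quoted format "(process/pid): filename start_4k_block-end_4k_block" with inclusive block ranges (the sample suggests this, since 0-7 is followed by 8-33). The function name `summarize` and the notion of a "non-contiguous read" are illustrative assumptions, not part of the kernel patch, and a non-contiguous file offset is only a rough proxy for a disk seek.

```python
import re
from collections import defaultdict

BLOCK_SIZE = 4096  # the trace reports 4k blocks

# Matches lines such as "(gdm-binary/3150): /usr/local/gnome/lib/ 0-7".
TRACE_LINE = re.compile(
    r"^\((?P<proc>[^/]+)/(?P<pid>\d+)\):\s+(?P<path>\S+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)$"
)

def summarize(lines):
    """Per file: number of reads, blocks read, and non-contiguous reads."""
    stats = defaultdict(
        lambda: {"reads": 0, "blocks": 0, "noncontiguous": 0, "last_end": None}
    )
    for line in lines:
        m = TRACE_LINE.match(line.strip())
        if m is None:
            continue  # skip "[...]" and anything else that is not a trace line
        start, end = int(m["start"]), int(m["end"])
        s = stats[m["path"]]
        s["reads"] += 1
        s["blocks"] += end - start + 1  # assumption: the end block is inclusive
        # A read that does not start right after the previous one ended
        # is counted as non-contiguous.
        if s["last_end"] is not None and start != s["last_end"] + 1:
            s["noncontiguous"] += 1
        s["last_end"] = end
    return stats

# The five gdm-binary lines quoted above (paths exactly as shown there).
SAMPLE = [
    "(gdm-binary/3150): /usr/local/gnome/lib/ 0-7",
    "(gdm-binary/3150): /usr/local/gnome/lib/ 687-718",
    "(gdm-binary/3150): /usr/local/gnome/lib/ 653-684",
    "(gdm-binary/3150): /usr/local/gnome/lib/ 34-65",
    "(gdm-binary/3150): /usr/local/gnome/lib/ 8-33",
]

if __name__ == "__main__":
    for path, s in summarize(SAMPLE).items():
        kib = s["blocks"] * BLOCK_SIZE // 1024
        print(path, s["reads"], "reads,", kib, "KiB,",
              s["noncontiguous"], "non-contiguous")
```

For the five sample lines alone this gives 130 blocks (520 KiB) in 5 reads, 4 of which do not continue where the previous read ended. Running it over the full trace would show how much of the library is really touched.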

I think there are multiple places where the problem could be attacked:

 - Code reordering could improve the locality of access within the GTK+
   binary. The more we group used stuff together, the more kernel
   read-ahead does good.

 - 'cat library > /dev/null' type hacks at the beginning of bootup
   for core libraries.

 - Filesystem level - reorder the blocks *on disk* so that they are
   read linearly, even if we are seeking all over in the binary.

 - Being able to mark libraries in some way to tell the kernel
   that this is a "core" library and the first N kilobytes
   should all be read in sequentially. (Or the entire library,
   in the absence of the ability to reorder junk to the end)

   This could be based on external knowledge (GTK+ is core for the
   GNOME desktop for example), or could be based on history.
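
The 'cat library > /dev/null' hack and the "first N kilobytes" idea can be sketched in user space as below. This is only an illustration, assuming a POSIX system and Python 3; the function name `warm_cache` and its `limit` parameter are hypothetical, and the posix_fadvise() call is merely a hint that the kernel is free to ignore.

```python
import os

def warm_cache(path, limit=None, chunk_size=1 << 20):
    """Read `path` sequentially and discard the data, like `cat path > /dev/null`.

    If `limit` is given, only the first `limit` bytes are read (the
    "first N kilobytes" variant). Returns the number of bytes read.
    """
    total = 0
    with open(path, "rb", buffering=0) as f:
        if hasattr(os, "posix_fadvise"):
            try:
                # Hint that access will be sequential; a length of 0 means
                # "to the end of the file".
                os.posix_fadvise(f.fileno(), 0, limit or 0,
                                 os.POSIX_FADV_SEQUENTIAL)
            except OSError:
                pass  # the hint is optional; fall back to plain reads
        while limit is None or total < limit:
            want = chunk_size if limit is None else min(chunk_size, limit - total)
            data = f.read(want)
            if not data:
                break  # end of file
            total += len(data)
    return total
```

Something like `warm_cache(path_to_library, limit=512 * 1024)` early in login would pull the start of a library into the page cache in one sequential pass; which libraries to warm, and how much of each, is exactly the "external knowledge or history" question above.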

I think it's pretty clear that code reordering should be *part* of the
solution, but it's a little bigger and harder than the others and
is not a complete solution in itself.

Imagine two applications that are started in parallel at the beginning
of desktop login that cause GTK+ to be paged in in different orders.
No single reordering of GTK+ can deterministically cause linear
seek patterns.
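
A toy model makes the point concrete. The sketch below is a deliberate simplification: pages "A", "B", "C" and the two access orders are made up, a "seek" is counted whenever the next page touched is not the one stored immediately after the previous one, and interleaving between the two processes is ignored. It brute-forces every layout and reports the one with the fewest total seeks.

```python
from itertools import permutations

def seeks(layout, accesses):
    """Count accesses that do not land directly after the previous one."""
    position = {page: i for i, page in enumerate(layout)}
    return sum(
        1
        for prev, nxt in zip(accesses, accesses[1:])
        if position[nxt] != position[prev] + 1
    )

# Hypothetical access orders of two applications over the same three pages.
APP_1 = ["A", "B", "C"]
APP_2 = ["B", "A", "C"]

def best_layout(pages, access_lists):
    """Return (total_seeks, layout) for the layout with the fewest seeks."""
    return min(
        (sum(seeks(layout, acc) for acc in access_lists), layout)
        for layout in permutations(pages)
    )

if __name__ == "__main__":
    total, layout = best_layout(["A", "B", "C"], [APP_1, APP_2])
    print("best layout:", "".join(layout), "total seeks:", total)
```

In this model a layout that is seek-free for one application costs the other two seeks, and no layout gets the total below 2.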

So I think the other solutions can still be interesting even if code
reordering is (finally) implemented.


