This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.
Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.
I just checked in a rather large patch which removes a lot of relocations.

This is before the patch:

  libc.so: 1609 relocations, 1317 relative, 502 PLT entries

This is afterwards:

  libc.so: 1609 relocations, 1484 relative, 413 PLT entries

Executing a null program with ld.so statistics enabled shows the following.

Before:

  18957: runtime linker statistics:
  18957:   total startup time in dynamic loader: 1215464 clock cycles
  18957:             time needed for relocation: 546656 clock cycles (44.9%)
  18957:                  number of relocations: 200
  18957:       number of relocations from cache: 114
  18957:            time needed to load objects: 379052 clock cycles (31.1%)
  18957:
  18957: runtime linker statistics:
  18957:            final number of relocations: 206
  18957: final number of relocations from cache: 114

Now:

  18989: runtime linker statistics:
  18989:   total startup time in dynamic loader: 1178256 clock cycles
  18989:             time needed for relocation: 467668 clock cycles (39.6%)
  18989:                  number of relocations: 140
  18989:       number of relocations from cache: 7
  18989:            time needed to load objects: 379492 clock cycles (32.2%)
  18989:
  18989: runtime linker statistics:
  18989:            final number of relocations: 146
  18989: final number of relocations from cache: 7

This means:

- 89 functions defined in libc were also called by names which are
  exported, resulting in PLT entries. Avoiding this not only gets rid
  of the JUMP_SLOT relocations (transforming them to relative
  relocations), it also allows the compiler to generate better code.

- 177 non-JUMP_SLOT relocations were converted to relative relocations
  (partly overlapping with the PLT optimization).

The performance improvements in ld.so are measurable. Timing the null program shows before (these are the cycles reported by ld.so):

  minimum: total=1369336, relocs=536704, load=325680
  average: total=1402050, relocs=544385, load=335134

Now:

  minimum: total=1259892, relocs=440732, load=314832
  average: total=1292682, relocs=451006, load=326384

I.e., ld.so spends about 100000 cycles less on relocations.
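The PLT avoidance described in the first bullet above can be illustrated with a small sketch. This is not the checked-in patch; it only shows the general idea of giving an exported function a second, hidden name for use inside the library. The names `my_strlen`, `my_strlen_internal`, and `my_count_total` are hypothetical, and the attribute syntax assumes GCC (or a compatible compiler) on an ELF target.

```c
#include <stddef.h>

/* Exported function (hypothetical name). Callers outside the library,
   and anything that interposes the symbol, bind to this name. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}

/* Hidden alias for the same code (hypothetical name). A hidden symbol
   cannot be interposed from outside, so calls to it from within the
   same shared object can be bound directly, without a PLT entry or a
   JUMP_SLOT relocation. Assumes GCC-style attributes on ELF. */
extern __typeof__(my_strlen) my_strlen_internal
    __attribute__((alias("my_strlen"), visibility("hidden")));

/* Library-internal caller: uses the hidden name instead of the
   exported one. */
size_t my_count_total(const char *a, const char *b)
{
    return my_strlen_internal(a) + my_strlen_internal(b);
}
```

Building this as a shared object (for example with `gcc -shared -fPIC`) and inspecting it with `readelf -r` should show no JUMP_SLOT relocation for the internal call, whereas calling `my_strlen` by its exported name from `my_count_total` would normally produce one. The exported name stays available, so interposition by other objects keeps working for external callers.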
The reduction in relocation cycles is directly visible in startup time improvements.

Before:

  minimum: 0.001447713 sec
  average: 0.001488027 sec

Now:

  minimum: 0.001389521 sec
  average: 0.001425523 sec

If you do the math you'll see that my machine runs at 1.7GHz. The time improvements are not that impressive in absolute terms, but it's a fast machine, there is more to come, and the percentage gains are impressive (about 8% overall, 12% if you exclude the time the kernel is loading some files).

And we are not through yet. So far I've concentrated on libio and RPC, both fairly closed sets of code whose files are not used individually. There are still 413 PLTs in use. Only a few (e.g., for the thread functions) are really needed. A few others, like malloc, will be kept for interposition. All the rest can go. This means speedup and size reduction (which by the way was about 2k so far).

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------