This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.
Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.
>>>>> On Sat, 20 Dec 2003 23:32:19 -0800, David Mosberger <davidm@linux.hpl.hp.com> said:

  David> The dynamic relocation count is now down from 747 to 142 (50
  David> of them are NONE relocs).

  David> I'm sure there is more tuning that can be done to minimize
  David> load-time overhead, but i'll look into those after finishing
  David> the DWARF unwinder.

I figured a way to split the local-only unwinder into a separate
library in a way that won't create API/ABI-incompatibilities (except
for rather esoteric corner-cases, which won't affect GCC, GDB, or other
major libunwind-users). With a separate local-only libunwind.so, the
dynamic relocation count shrinks to 72 (32 of which are NONE relocs).

If I use LD_DEBUG=statistics, I get the following dynamic reloc counts
("final number of relocations"):

  no-op program without libunwind:                      90
  no-op program with libunwind v0.96:                  112
  no-op program with separate, local-only libunwind:    93

To measure actual execution-time impact, I created a no-op program
"empty" whose main() function returns immediately. Then I created a
statically-linked "forker" program which spawns "empty" 10000 times.
I used LD_PRELOAD to add a dependency on libunwind as desired. The
results are below (numbers are execution time in seconds, as reported
by "time"):

                                                       real    user  system
  no-op program without libunwind:                    7.347   2.401   4.940
  no-op program with libunwind v0.96:                 8.253   2.858   5.345
  no-op program with separate, local-only libunwind:  7.878   2.627   5.250

So, with the local-only version of libunwind, the pretty much absolute
worst case overhead of always linking dynamically against libunwind
seems to be about 7%. Remember: this is a worst-case which applies
only for shared objects which do not link against anything other than
ld.so and libc.so. In my opinion, this is a reasonably small overhead
(if you really want minimal startup-times for such tiny programs,
static linking will give much better results anyhow).
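The measurement setup described above can be sketched roughly as follows. This is not the original "forker" source, which was not posted; the function name spawn_many() and its "preload" parameter are hypothetical, and setting LD_PRELOAD in the child (rather than inheriting it from the shell) is just one way to add the dependency.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` `count` times, one child at a time.
 * If `preload` is non-NULL, the child sets LD_PRELOAD to it before exec,
 * so only the spawned program (not the caller) picks up the extra library.
 * Returns the number of runs that exited with status 0, or -1 if fork()
 * itself fails. */
int spawn_many(const char *path, const char *preload, int count)
{
    int ok = 0;

    for (int i = 0; i < count; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return -1;
        }
        if (pid == 0) {
            /* Child: optionally request the preload, then exec the target. */
            if (preload != NULL)
                setenv("LD_PRELOAD", preload, 1);
            execl(path, path, (char *)NULL);
            _exit(127);              /* exec failed */
        }

        /* Parent: wait for this child before starting the next one. */
        int status;
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

A main() that calls spawn_many("./empty", NULL, 10000), linked with -static and run under "time", would approximate the baseline row; passing the path of a libunwind.so as the second argument would approximate the other two rows.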
For completeness, I attached the profile for the "no libunwind" and
the "local-only libunwind" cases below. The caveat for the profiles
is that they cover all 10,000 invocations of "empty" and that the
call-counts were obtained via sampling, so they're not 100% accurate.
Even so, you can see that the call counts are sensible. For example,
in the no-libunwind-case, _dl_relocate_object() gets called about 3
times per "empty" invocation (main program, ld.so, libc, I think) and
about 4 times for the libunwind-case.

I think the only way to essentially eliminate the overhead altogether
would be to use a scheme analogous to the one used in libpthread.
That is, provide stub-versions of _Unwind_*() which, when invoked,
will dlopen() libunwind.so and re-direct the calls to the appropriate
entry-points in libunwind.so. However, to avoid a dependency against
-ldl (which would defeat the entire purpose of the stubs), libgcc
would have to use __libc_dlopen_mode(), which is probably undesirable.

Comments/feedback welcome.
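As an illustration of the stub idea, here is a minimal sketch of a lazily-bound entry point. It uses the public dlopen()/dlsym() interface rather than __libc_dlopen_mode(), so on older glibc it needs -ldl, which is exactly the dependency discussed above. The names lazy_sym, lazy_resolve() and stub_cos() are hypothetical, and "libm.so.6"/"cos" merely stand in for libunwind.so and an _Unwind_*() routine. The sketch is not thread-safe.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical record for one lazily-resolved symbol. */
struct lazy_sym {
    const char *library;  /* shared object to dlopen() on first use */
    const char *symbol;   /* entry point to look up in it */
    void *addr;           /* cached address, NULL if unresolved */
    int tried;            /* non-zero once resolution was attempted */
};

/* Resolve the symbol on the first call only; later calls return the
 * cached result (which stays NULL if the library or symbol is missing). */
void *lazy_resolve(struct lazy_sym *s)
{
    if (!s->tried) {
        s->tried = 1;
        void *handle = dlopen(s->library, RTLD_LAZY);
        if (handle != NULL)
            s->addr = dlsym(handle, s->symbol);
    }
    return s->addr;
}

/* Stand-in target: libm's cos() plays the role of an _Unwind_*() routine. */
typedef double (*unary_fn)(double);

static struct lazy_sym cos_sym = { "libm.so.6", "cos", NULL, 0 };

/* Stub: costs nothing at load time, pays for dlopen() on first call,
 * then forwards to the real entry point. */
double stub_cos(double x)
{
    unary_fn fn = (unary_fn)lazy_resolve(&cos_sym);
    /* A real stub would report an error here instead of returning 0. */
    return fn != NULL ? fn(x) : 0.0;
}
```

With this shape, a program that never calls the stub never maps the target library, so its relocation count at startup is unaffected.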
	--david

Profile without libunwind.so:

Each histogram sample counts as 533.125u seconds
% time   self  cumul  calls  self/call  tot/call  name
 35.38   7.94   7.94   322k      24.7u     25.7u  _dl_relocate_object
 16.46   3.69  11.64  66.3M      55.7n     81.6n  _dl_make_fptr
 10.89   2.45  14.08  8.99M       272n      467n  do_lookup_versioned
  7.57   1.70  15.78  41.2M      41.2n     41.2n  make_fdesc
  4.80   1.08  16.86  42.5M      25.4n     25.4n  ld-2.3.2.so:strcmp
  4.25   0.95  17.81  21.8M      43.8n     43.8n  ld-2.3.2.so:__umoddi3
  3.20   0.72  18.53  9.89M      72.8n     72.8n  ld-2.3.2.so:_dl_elf_hash
  2.55   0.57  19.11   105k      5.44u     97.6u  ld-2.3.2.so:_dl_start
  2.06   0.46  19.57  8.99M      51.5n      592n  _dl_lookup_versioned_symbol
  1.25   0.28  19.85   870k       322n      322n  do_lookup
  1.19   0.27  20.12   195k      1.37u     2.16u  _dl_map_object_from_fd
  0.87   0.20  20.31   103k      1.90u     91.6u  dl_main

Profile when pre-loading separate, local-only libunwind.so:

% time   self  cumul  calls  self/call  tot/call  name
 32.61   8.21   8.21   445k      18.5u     19.4u  _dl_relocate_object
 14.91   3.76  11.97  67.6M      55.6n     81.6n  _dl_make_fptr
 13.01   3.28  15.25  9.39M       349n      596n  do_lookup_versioned
  6.86   1.73  16.97  42.1M      41.0n     41.0n  make_fdesc
  5.72   1.44  18.41  32.9M      43.7n     43.7n  ld-2.3.2.so:__umoddi3
  5.06   1.27  19.69  56.5M      22.5n     22.5n  ld-2.3.2.so:strcmp
  3.24   0.81  20.50  10.4M      78.7n     78.7n  ld-2.3.2.so:_dl_elf_hash
  2.30   0.58  21.08  99.5k      5.83u      111u  ld-2.3.2.so:_dl_start
  2.00   0.50  21.59  9.43M      53.4n      725n  _dl_lookup_versioned_symbol
  1.43   0.36  21.95   296k      1.22u     2.05u  _dl_map_object_from_fd
  1.41   0.35  22.30   879k       404n      404n  do_lookup
  0.87   0.22  22.52  88.0k      2.47u      116u  dl_main