This is the mail archive of the mailing list for the Archer project.

Re: FYI GDB on-disk .debug cache (mmapcache) [Re: Tasks]

On Sat, 09 Aug 2008 00:19:18 +0200, Tom Tromey wrote:
> >>>>> "Jan" == Jan Kratochvil <> writes:
> Jan> I would try dwarf2_build_psymtabs_easy() myself now, so far just
> Jan> with the public symbols (regressing GDB).  If it would be fast
> Jan> GCC can provide even indexes for the static symbols.
> I re-did my profiling on the pubnames case.  The do-nothing
> dwarf2_build_psymtabs_easy does cut down the CPU time a lot.
> It does still read the contents of debug_pubnames, so the mystery time
> does not disappear:
> /usr/bin/time using _hard: 45.73user 2.74system 1:18.35elapsed 61%CPU
> /usr/bin/time using _easy:  8.84user 3.01system 0:56.95elapsed 20%CPU
> Of course it is hard to say what the improvement would really look
> like when the _easy stuff is actually in place.  20 seconds maximum
> improvement here ... that is nice but hardly in "awesome" territory.
> It seems weird that the elapsed time does not vary as much as the user
> time.  I wonder what that means.

I still think you were bound by disk seeks, right?  I borrowed a big iron box
at RH with 32GB of RAM and ran F9.x86_64 ooffice with all the document types:

_easy(): (211 .so libraries symbols read)
9.52user 2.22system 0:11.75elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
88.7%: elf_symfile_read
43.2%: d_demangle (& the associated d_print* inefficiencies)

_hard(): (210 .so libraries symbols read)
29.54user 2.30system 0:31.87elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
non-cached (sync; echo 3 > /proc/sys/vm/drop_caches) - "small iron":
39.34user 4.47system 1:06.66elapsed 65%CPU (0avgtext+0avgdata 0maxresident)k
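
To make the seek argument concrete, here is a rough sketch that splits each
timing line above into on-CPU and off-CPU time.  It assumes the default GNU
time summary format with an m:ss.ss elapsed field; the helper name
parse_time_line and the run labels are made up for illustration.

```python
import re

# Matches e.g. "9.52user 2.22system 0:11.75elapsed 99%CPU ..."
# (only the m:ss.ss elapsed form is handled in this sketch).
TIME_RE = re.compile(r"([\d.]+)user\s+([\d.]+)system\s+(\d+):([\d.]+)elapsed")

def parse_time_line(line):
    """Return CPU, elapsed and off-CPU seconds for one time summary line."""
    m = TIME_RE.search(line)
    if m is None:
        raise ValueError("unrecognized time output: %r" % line)
    user, system = float(m.group(1)), float(m.group(2))
    elapsed = int(m.group(3)) * 60 + float(m.group(4))
    cpu = user + system
    # Off-CPU time is whatever the process spent waiting (disk I/O here).
    return {"cpu": cpu, "elapsed": elapsed, "off_cpu": elapsed - cpu}

# Figures copied from the runs quoted in this message; labels are hypothetical.
RUNS = {
    "big iron, _easy": "9.52user 2.22system 0:11.75elapsed 99%CPU",
    "big iron, _hard": "29.54user 2.30system 0:31.87elapsed 99%CPU",
    "small iron, _hard, non-cached": "39.34user 4.47system 1:06.66elapsed 65%CPU",
}

for label, line in RUNS.items():
    r = parse_time_line(line)
    print("%-30s cpu=%6.2fs elapsed=%6.2fs off-cpu=%6.2fs"
          % (label, r["cpu"], r["elapsed"], r["off_cpu"]))
```

On the cached big iron runs nearly all elapsed time is CPU time, while the
non-cached run spends a sizable part of its elapsed time off the CPU.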

This means for big iron (machines with many GB of RAM):
* We can save almost 63% of the time by implementing _easy() for gdb+gcc.
* Of the remaining 37% we can still save a lot by optimizing the symtab
  reading CPU overhead.
(both cases without introducing cache files)
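
The 63% / 37% split follows from the big iron elapsed times above.  A minimal
sketch of that arithmetic, assuming the percentages are taken from elapsed
time (the function name fraction_saved is made up for illustration):

```python
# Elapsed seconds from the big iron runs quoted above.
HARD_ELAPSED = 31.87   # _hard(): 0:31.87elapsed
EASY_ELAPSED = 11.75   # _easy(): 0:11.75elapsed

def fraction_saved(before, after):
    """Fraction of the 'before' time that disappears when going to 'after'."""
    return 1.0 - after / before

saved = fraction_saved(HARD_ELAPSED, EASY_ELAPSED)
remaining = 1.0 - saved

print("saved by _easy(): %.1f%%" % (saved * 100))
print("remaining:        %.1f%%" % (remaining * 100))
```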

With _easy() we will do a full read of only a negligible number of CUs.

For small iron (less than 1GB disk-cache + 1.4GB GDB):
* Cache files may improve it (like I attempted with mmapcache myself).
* SSDs (flash drives) - if we can assume them - have no seek time, making the
  small<->big iron difference less of a pain.

As for disk cache files (like my mmapcache) on non-SSD drives: for OOo with
about 200 shared libraries, a single seek to each premapped cache file adds up
to 22ms*200==4.4s (22ms per lseek() with all its ext3 overhead).  Therefore it
may make more sense to have a single cache file per execfile rather than one
cache file per objfile.
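
A back-of-the-envelope sketch of that seek cost, assuming exactly one seek per
cache file and ignoring the time to read the data itself (the function name
seek_cost_ms is made up for illustration):

```python
SEEK_MS = 22      # measured cost of one lseek() incl. ext3 overhead, in ms
OBJFILES = 200    # approximate number of shared libraries for OOo

def seek_cost_ms(cache_files, seek_ms=SEEK_MS):
    """Total seek time in ms, assuming one seek per cache file."""
    return cache_files * seek_ms

per_objfile_ms = seek_cost_ms(OBJFILES)   # one cache file per objfile
single_file_ms = seek_cost_ms(1)          # one cache file per execfile

print("cache file per objfile:  %.1fs" % (per_objfile_ms / 1000.0))
print("cache file per execfile: %.3fs" % (single_file_ms / 1000.0))
```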

> So -- at least to me it is not obvious what to do.  Hiding the time
> (not reading anything until needed) is nice for attach, but, I think,
> won't hugely improve the user experience (unless we can somehow also
> avoid reading a lot of the data in all cases).

_easy() should let us avoid reading almost all of the data.

> Maybe we could do something like your patch, but rather than mmap the
> data structures, store compressed data structures, on the theory that
> we would trade size on disk for some cpu.

Sure, even the _easy() and symtab reading could be optimized with some separate
cache files, but this should only be the next step afterwards.

> Another thing I am curious about is seeing how elfutils fares on this
> case.

Aside from elf_symtab_read(), I find BFD in general at under 2% with _easy().

