This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.


Re: Removal of demangled names from partial symbols

To: dberlin at redhat dot com (Daniel Berlin)
Subject: Re: Removal of demangled names from partial symbols
From: "Peter.Schauer" <Peter dot Schauer at regent dot e-technik dot tu-muenchen dot de>
Date: Mon, 4 Dec 2000 21:53:40 MET
Cc: gdb at sources dot redhat dot com, gdb-patches at sources dot redhat dot com

: Why do we not do a lookup_minimal_symbol in
: a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
: and if we get back a symbol, use the demangled name from that,
: otherwise, demangle it?

For some symbol formats (e.g. a.out) the linkage and debugging symbols are
intermixed. By the time you want to record a partial symbol, the minimal
symbol might not have been seen yet. Or the minimal symbol has been seen,
but the minimal symbols are not yet installed, so lookup_minimal_symbol
will fail.
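
A toy model of that ordering problem, as a sketch only. Every name below
(MinimalSymbolTable, toy_demangle, read_intermixed, the mangled strings) is
made up for illustration and is not GDB's API; the only assumption carried over
from the discussion is that a lookup can see installed minimal symbols and
nothing else.

```python
class MinimalSymbolTable:
    """Hypothetical table: symbols are searchable only after install()."""

    def __init__(self):
        self._pending = []    # recorded while reading, not yet searchable
        self._installed = {}  # mangled name -> demangled name

    def record(self, mangled):
        self._pending.append(mangled)

    def install(self):
        # Demangling happens here, once, for every recorded minimal symbol.
        for mangled in self._pending:
            self._installed[mangled] = toy_demangle(mangled)
        self._pending.clear()

    def lookup(self, mangled):
        # Sees installed symbols only; pending ones are invisible.
        return self._installed.get(mangled)


def toy_demangle(mangled):
    # Stand-in for a real demangler; it only tags the name.
    return "demangled(" + mangled + ")"


def demangled_name_for_psymbol(msyms, mangled, stats):
    """The proposed scheme: reuse the minimal symbol's name, else demangle."""
    name = msyms.lookup(mangled)
    if name is None:
        stats["demangle_calls"] += 1
        name = toy_demangle(mangled)
    return name


def read_intermixed(entries):
    """entries: ("minsym" | "psym", mangled_name) pairs in file order."""
    msyms = MinimalSymbolTable()
    stats = {"demangle_calls": 0}
    psyms = {}
    for kind, mangled in entries:
        if kind == "minsym":
            msyms.record(mangled)
        else:
            psyms[mangled] = demangled_name_for_psymbol(msyms, mangled, stats)
    msyms.install()  # too late for the partial symbols recorded above
    return psyms, stats
```

With entries in which one partial symbol precedes its minimal symbol and
another follows it, both lookups miss in this model, because nothing is
installed until the end of the read.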

You might be able to work around this by going over all psymbols and filling
in the demangled name via lookup_minimal_symbol _after_ the minimal symbols
are installed, but I am not yet convinced that you won't have to pay a
price on slower systems. After all, going from 5 secs to 6 secs is a 20% slowdown.
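
A minimal sketch of that workaround, again with hypothetical names and a plain
dict standing in for the installed minimal symbols. It also spells out the
slowdown arithmetic used above.

```python
def toy_demangle(mangled):
    # Stand-in for a real demangler; it only tags the name.
    return "demangled(" + mangled + ")"


def fill_in_demangled_names(psymbols, installed_minsyms):
    """Second pass, run after the minimal symbols are installed.

    psymbols: list of dicts with keys "mangled" and "demangled" (None so far).
    installed_minsyms: dict mapping mangled name -> demangled name.
    Returns how many names still had to be demangled from scratch.
    """
    fallback_count = 0
    for psym in psymbols:
        name = installed_minsyms.get(psym["mangled"])
        if name is None:
            name = toy_demangle(psym["mangled"])
            fallback_count += 1
        psym["demangled"] = name
    return fallback_count


def relative_slowdown(before_secs, after_secs):
    """Extra time as a fraction of the original time."""
    return (after_secs - before_secs) / before_secs
```

The extra pass touches every partial symbol once more, which is where the
cost on slower systems would come from in this model.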

> Demangled names were removed from partial symbols to speed start up
> times a few years ago.
> 
> However, with the minsym demangled hash table now around, we demangle
> all minimal symbols when we install minimal symbols (i.e. we init the
> demangled name on them unconditionally).
> 
> Since the minimal symbol table ends up including a large subset of the
> mangled partial symbols (if not all of them), this means we already have a large
> subset of the partial symbol names demangled for us at start up
> anyway.
> 
> Why do we not do a lookup_minimal_symbol in
> a new function, add_psymbol_and_dem_name_to_list, on the mangled name,
> and if we get back a symbol, use the demangled name from that,
> otherwise, demangle it?
> 
> Even tests on 100 meg of debug info show we barely add any startup
> time at all (5 seconds without, 6 seconds with).
> In fact, all of the added startup time is attributable to the
> fact that, to save memory, I had it bcache the demangled name in
> SYMBOL_INIT_DEMANGLED_NAME. If you don't bcache it (like right now),
> it's in memory in at least the full symbol and the minimal
> symbol (it's actually in memory once for every time
> SYMBOL_INIT_DEMANGLED_NAME is called on a symbol and the demangling succeeds).
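
The memory-versus-time trade-off described in the quoted paragraph can be
sketched with a toy string cache. StringCache is a hypothetical stand-in, not
GDB's bcache: it only shows that equal names end up sharing one stored copy, at
the price of a hash lookup on every call.

```python
class StringCache:
    """Toy cache: equal strings are stored once and shared afterwards."""

    def __init__(self):
        self._store = {}

    def intern(self, text):
        # Returns the copy already stored if an equal string was seen before;
        # otherwise stores this one and returns it.
        return self._store.setdefault(text, text)

    def __len__(self):
        return len(self._store)
```

Interning the same demangled name from a full symbol and from a minimal symbol
leaves a single copy in the cache, which is the saving described above.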
> 
> I think 1 second on 100 meg of debug info is worth it to not have to
> do a linear search on every symbol lookup, which is amazingly
> slow. If gdb is using swap at all because of the number of
> symbols, you are almost guaranteed to hit the swap
> hard on *every* single lookup, since we have to go through every
> single symbol.
> 
> This would solve the problem of not being able to look up partial
> symbols by demangled name, and allow us to binary search them without
> fear of missing a symbol.
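
To illustrate the difference between the two lookups in the quoted paragraph,
here is a sketch that models partial symbols as (mangled, demangled) pairs. The
function names and the pair layout are assumptions for illustration only.

```python
import bisect


def linear_lookup(psymbols, wanted):
    """Scan every (mangled, demangled) pair until the demangled name matches.

    Returns (mangled_name_or_None, number_of_comparisons).
    """
    comparisons = 0
    for mangled, demangled in psymbols:
        comparisons += 1
        if demangled == wanted:
            return mangled, comparisons
    return None, comparisons


def build_index(psymbols):
    """Sort once by demangled name so that later lookups can bisect."""
    ordered = sorted(psymbols, key=lambda pair: pair[1])
    keys = [pair[1] for pair in ordered]
    return keys, ordered


def binary_lookup(index, wanted):
    """Binary search on the demangled name; returns the mangled name or None."""
    keys, ordered = index
    i = bisect.bisect_left(keys, wanted)
    if i < len(keys) and keys[i] == wanted:
        return ordered[i][0]
    return None
```

The binary search is only safe because every entry carries a demangled name to
sort on; without one, an entry could sit anywhere in the list and be missed.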
> 
> Would this be acceptable?
> 
> My next trick after that would be to add a mangled->demangled mapping
> structure, if it's necessary to improve speed. In cases where we need
> to find a demangled name for a mangled one (i.e. SYMBOL_INIT_DEMANGLED_NAME),
> we would look the name up in that structure before demangling it over
> again, rather than using the minimal symbol table to try to find the name.
> The reason for this is that a hash table (in this case, we are using
> the minimal symbol demangled hash table as a lookup table) is the wrong
> structure for this. Demangled names can be *very* large (average of 82
> chars on my large C++ programs), and we always have to hash the entire
> string, then do a whole bunch of string compares, because the chains are
> long. This is okay when we hit (except for the long chains), but on
> misses we waste the same amount of time as on hits, if not more. The
> string compares on hits also cost a lot because of the length of the string.
> We really should use a ternary search tree or some structure like it,
> which on hits is actually faster (since we don't need multiple
> string compares), and on misses is a whole ton faster, since we abort
> much sooner.
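
For reference, a minimal ternary search tree, sketched under the assumption
that it maps mangled names to demangled names as proposed in the quoted text.
The class and the sample names are illustrative; this is not code from GDB.

```python
class _Node:
    __slots__ = ("ch", "lo", "eq", "hi", "value", "has_value")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.value = None
        self.has_value = False


class TernarySearchTree:
    """Maps non-empty string keys to values, one character per node."""

    def __init__(self):
        self._root = None

    def insert(self, key, value):
        if not key:
            raise ValueError("empty keys are not supported in this sketch")
        if self._root is None:
            self._root = _Node(key[0])
        node = self._root
        i = 0
        while True:
            ch = key[i]
            if ch < node.ch:
                if node.lo is None:
                    node.lo = _Node(ch)
                node = node.lo
            elif ch > node.ch:
                if node.hi is None:
                    node.hi = _Node(ch)
                node = node.hi
            elif i == len(key) - 1:
                # Last character matched: this node terminates the key.
                node.value = value
                node.has_value = True
                return
            else:
                i += 1
                if node.eq is None:
                    node.eq = _Node(key[i])
                node = node.eq

    def get(self, key):
        node = self._root
        i = 0
        while node is not None and i < len(key):
            ch = key[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            elif i == len(key) - 1:
                return node.value if node.has_value else None
            else:
                i += 1
                node = node.eq
        # Fell off the tree (or key was empty): a miss, detected without
        # hashing or comparing the whole string.
        return None
```

A miss ends at the first character for which no branch exists, so the whole
key is never hashed or compared, and a hit costs one pass over the key's
characters rather than repeated full-string compares along a hash chain.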
> 
> --Dan
> 
> 
> 


-- 
Peter Schauer			pes@regent.e-technik.tu-muenchen.de

Follow-Ups:
- Re: Removal of demangled names from partial symbols
  - From: Daniel Berlin
- Re: Removal of demangled names from partial symbols
  - From: Daniel Berlin

References:
- Removal of demangled names from partial symbols
  - From: Daniel Berlin
