This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.



Re: the global symbol table


On Wed, 25 Sep 2002 16:52:20 -0400, Elena Zannoni <ezannoni@redhat.com> said:

> I wonder if rewriting the symbol table is really necessary in order
> to provide namespace support.

I'm not sure if it's strictly necessary, but I think they share enough
in common to make it a good idea.  Namespaces, like global symbols,
have the property that they gather together symbols defined in many
different blocks (indeed, in many different files) and make them all
accessible in the same place.  To do either of them right, you (at
least currently) have to deal with not only symbols but also partial
symbols and perhaps minimal symbols.  It might even be the case that
unnamed namespaces are to named namespaces as static symbols are to
global symbols, though that analogy might be difficult to express in
code.

So, at the very least, I have to see how global symbol lookup
currently works, so I know what the possible pitfalls are.  But it
would be a shame if I couldn't find pieces of code that I could use in
both places.  I'd be pretty surprised if it ended up being the case that
_all_ of the global symbol code ended up being a special case of
namespace code - my guess is that one or both cases will have enough
quirks to require some special handling - but there should be chunks
of code that can be used in both cases.  Also, using the same code in
both places will help me a lot with testing: to the extent that
looking up global symbols is the same as looking up names within a
specified namespace, then the entire GDB testsuite will help test the
namespace code.

Of course, I don't want to change the global symbol code more than
necessary; and, in fact, one reasonable option would be to leave the
global symbol lookup code working almost exactly as it is, maybe
taking out some chunks of it and putting them in separate functions
but without changing the algorithms, and then build namespace support
on top of that, and only then to think about improving the algorithms.
One issue there is that the algorithm for global symbol lookup is
pretty bad (since it always examines each symtab), so it seems likely
to me that it will get changed in the not too distant future; if
changing it now would make adding namespace support easier, then
changing it now would be a good idea.

> One thing I found useful doing when I was trying to get a 500ft view
> of the symbol table, was to look at how the symbol readers interface with
> each other and the rest of gdb.

> I looked at the functions that each module provided and noticed that a
> few were a bit confused, as to where they belonged, which file
> exported what, etc.

> I think that an audit of the interfaces could help here as well.
> Just a suggestion. And I hope to have time on the weekend to catch
> up.

Thanks for the suggestions.

> The hppa stuff (I think, if my memory is not failing me) relates to
> the SOM stuff that is HP's proprietary format. Maybe look in somread.c.

Thanks for the tip.

>> * Even the simplest case of searching all the global blocks isn't
>> implemented very well: surely we need some sort of fast lookup
>> structure that won't require us to look at every single symtab.
>> Assuming that we have some sort of expandable data structure in
>> which lookups are fast, then exactly how should this work?  Should
>> we replace the current global blocks by one big global table,
>> should we have a global table that duplicates the information in
>> the global blocks (so struct symbols corresponding to global
>> symbols would be stored in two separate places), or should we have
>> a global table that quickly maps names to symtabs, but have the
>> actual symbols only stored in the global blocks of symtabs?  I'm
>> currently leaning towards the second solution, with the possible
>> longer-term goal of migrating towards the first solution, but it's
>> not clear to me exactly what the consequences of these choices are.

> I thought we were trying to make gdb footprint a bit smaller. So I
> would be against duplicating information. But maybe I am not
> understanding your proposal.

I certainly wouldn't want to make GDB's footprint any bigger without a
reason to do so.  But having the same symbols accessible via two
different ways wouldn't make GDB's footprint all that much bigger (I'm
not proposing having two copies of each global symbol, just having
each global symbol appear in two tables), and if it turned out that
there was a compelling reason for it to be easy to quickly get at the
global symbols within a given symtab as well as to quickly get at all
the global symbols at once, then having different tables for those
roles is one easy way to do it.
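
To make "each global symbol appears in two tables" a bit more
concrete, here is a rough sketch in C.  Everything in it is made up
for illustration: the toy_ types and functions are simplified
stand-ins, not GDB's real struct symbol or struct symtab, and the
fixed-size hash table is only an assumption about what a fast lookup
structure might look like.  The point is just that each symbol object
exists once and only pointers to it are stored twice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; these are not GDB's real types.  */
struct toy_symbol { const char *name; int value; };

#define TOY_MAX_SYMS 8
#define TOY_NBUCKETS 16

/* Per-file table: pointers to the global symbols defined in one file.  */
struct toy_symtab
{
  const char *filename;
  struct toy_symbol *globals[TOY_MAX_SYMS];
  size_t nglobals;
};

/* One slot of the program-wide table; it points at the same symbol.  */
struct toy_entry
{
  struct toy_symbol *sym;
  struct toy_symtab *owner;
  struct toy_entry *next;
};

struct toy_global_table
{
  struct toy_entry *buckets[TOY_NBUCKETS];
  struct toy_entry pool[4 * TOY_MAX_SYMS];
  size_t used;
};

static unsigned int
toy_hash (const char *s)
{
  unsigned int h = 5381;

  while (*s != '\0')
    h = h * 33 + (unsigned char) *s++;
  return h % TOY_NBUCKETS;
}

/* Enter SYM in both tables.  Only the pointer is stored twice.
   Returns 0 on success, -1 if either table is full.  */
static int
toy_add_global (struct toy_global_table *t, struct toy_symtab *st,
                struct toy_symbol *sym)
{
  struct toy_entry *e;
  unsigned int h;

  if (st->nglobals >= TOY_MAX_SYMS
      || t->used >= sizeof t->pool / sizeof t->pool[0])
    return -1;
  st->globals[st->nglobals++] = sym;
  h = toy_hash (sym->name);
  e = &t->pool[t->used++];
  e->sym = sym;
  e->owner = st;
  e->next = t->buckets[h];
  t->buckets[h] = e;
  return 0;
}

/* Find NAME without visiting any symtab; NULL if absent.  */
static struct toy_symbol *
toy_lookup_global (const struct toy_global_table *t, const char *name)
{
  const struct toy_entry *e;

  for (e = t->buckets[toy_hash (name)]; e != NULL; e = e->next)
    if (strcmp (e->sym->name, name) == 0)
      return e->sym;
  return NULL;
}

/* Small self-check: one symbol object, reachable through both tables.  */
static int
toy_demo (void)
{
  struct toy_global_table table;
  struct toy_symtab a, b;
  struct toy_symbol counter = { "counter", 42 };
  struct toy_symbol limit = { "limit", 7 };

  memset (&table, 0, sizeof table);
  memset (&a, 0, sizeof a);
  memset (&b, 0, sizeof b);
  a.filename = "a.c";
  b.filename = "b.c";
  if (toy_add_global (&table, &a, &counter) != 0
      || toy_add_global (&table, &b, &limit) != 0)
    return -1;
  assert (toy_lookup_global (&table, "counter") == a.globals[0]);
  assert (toy_lookup_global (&table, "limit") == b.globals[0]);
  assert (toy_lookup_global (&table, "missing") == NULL);
  return 0;
}
```

The extra cost in this sketch is one small entry per global symbol,
not a second copy of the symbol itself.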

>> * Which of the structs symbol, partial_symbol, and minimal_symbol can
>> be unified?  I tend to think that partial_symbols should go away,
>> to be replaced by special 'incomplete' sorts of struct symbols
>> combined with code in lookup_symbol (or wherever) that says that,
>> if you run into one of those kinds of incomplete symbols, read in
>> the entire symbol table for that file and flesh out all of its
>> partial symbols.  (This could open a door to a general notion of
>> incomplete symbols that could, depending on the debugging format,
>> be completed in a more efficient manner than the current
>> psymtab->symtab mechanisms.)  I was hoping that minimal_symbols
>> could also be turned into special kinds of symbols, but now I'm
>> more dubious about that; if they have to remain separate, I should
>> presumably tweak dictionaries to let them index any sort of struct
>> general_symbol_info, and change the minimal symbol table in ways
>> that correspond to the changes to the global symbol table.

> We could eliminate partial stabs for dwarf2, (I think that was the
> plan) but I am not sure you want to do that for other symbol
> readers.

At some point, I'll try to say a little more coherently what I mean to
do here.  Certainly I want to be able to work with the existing
psymtab_to_symtab code; it's just that I'd like to work with that in
such a way that I have to search in as few places as possible when
looking up a name.  So I want to bury the psymtab->symtab translation
down as deep as possible.

To put it another way: right now, lookup_symbol might look up names in
global symtabs, global psymtabs, static symtabs, static psymtabs, or
minimal symbols.  A name found in a global psymtab is just as good as
a name found in a global symtab (so whether you search the psymtabs or
the symtabs first should be an optimization issue, not a semantic
issue); but a name found in a static psymtab or symtab is worse than a
name found in a global one.  And a name found in minimal symbols is
its own strange thing; I'm not sure where it fits into this
hierarchy.  So that suggests that merging searching in psymtabs with
searching symtabs might be a reasonable idea to consider.
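
Here is a rough sketch in C of what such a merged search might look
like.  It is only a toy model under heavy assumptions: the lazy_
types and functions are invented for illustration, every symbol
(whether fully read in or still a psymtab-style stub) sits in one
flat table, and "expanding" a file just flips a flag.  It also ignores
the fact that static symbols are only visible from their own file.
What it tries to show is that a global match outranks a static match
whether or not it is complete, and that the psymtab->symtab step
happens in exactly one place, after the winner has been chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; none of these are GDB's real types.  */
enum lazy_scope { LAZY_STATIC, LAZY_GLOBAL };

struct lazy_file;

struct lazy_symbol
{
  const char *name;
  enum lazy_scope scope;
  int complete;             /* 0: psymtab-style stub, 1: full symbol.  */
  struct lazy_file *file;   /* File whose debug info defines it.  */
};

struct lazy_file
{
  struct lazy_symbol *syms;
  size_t nsyms;
  int expansions;           /* Times the full symbols were read in.  */
};

/* Stand-in for the psymtab->symtab step: flesh out the whole file.  */
static void
lazy_expand_file (struct lazy_file *f)
{
  size_t i;

  for (i = 0; i < f->nsyms; i++)
    f->syms[i].complete = 1;
  f->expansions++;
}

/* Look NAME up among the N symbols in TABLE.  A global match beats a
   static one regardless of completeness; only the winner's file is
   expanded, and only if the winner is still a stub.  */
static struct lazy_symbol *
lazy_lookup (struct lazy_symbol **table, size_t n, const char *name)
{
  struct lazy_symbol *best = NULL;
  size_t i;

  for (i = 0; i < n; i++)
    {
      struct lazy_symbol *s = table[i];

      if (strcmp (s->name, name) != 0)
        continue;
      if (best == NULL
          || (best->scope == LAZY_STATIC && s->scope == LAZY_GLOBAL))
        best = s;
    }
  if (best != NULL && !best->complete)
    lazy_expand_file (best->file);
  return best;
}

/* Self-check: a complete static "x" loses to an incomplete global "x".  */
static int
lazy_demo (void)
{
  struct lazy_file f1, f2;
  struct lazy_symbol s1[1] = { { "x", LAZY_STATIC, 1, NULL } };
  struct lazy_symbol s2[1] = { { "x", LAZY_GLOBAL, 0, NULL } };
  struct lazy_symbol *table[2];
  struct lazy_symbol *hit;

  s1[0].file = &f1;
  s2[0].file = &f2;
  f1.syms = s1; f1.nsyms = 1; f1.expansions = 0;
  f2.syms = s2; f2.nsyms = 1; f2.expansions = 0;
  table[0] = &s1[0];
  table[1] = &s2[0];

  hit = lazy_lookup (table, 2, "x");
  assert (hit == &s2[0]);
  assert (hit->complete == 1);
  assert (f1.expansions == 0 && f2.expansions == 1);
  /* A second lookup finds the symbol already complete.  */
  assert (lazy_lookup (table, 2, "x") == &s2[0] && f2.expansions == 1);
  assert (lazy_lookup (table, 2, "y") == NULL);
  return 0;
}
```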

Eliminating the use of psymtabs in DWARF 2 is really a side issue, for
future consideration.  I don't want to make doing that any harder than
it would be now, and my guess is that any work that I do to merge data
structures that naturally belong together will probably help that
effort, but that can definitely wait.

>> * So just what's going on with the usage of minimal symbols in
>> lookup_symbol?  Are there any situations where looking in the
>> minimal symbol is actually helpful?  If so, do those only have to
>> do with ickiness related to mangled names, or is there a more
>> serious issue that I'm missing?

> debugging something compiled w/o -g. 

Right, I was wondering about that.  I know that, if I run GDB on a
program without debugging info and do 'break main' then it still
works.  But I can't see how lookup_symbol ever actually produces a
struct symbol from a situation where there isn't debugging info.
(Does lookup_symbol ever get called in that situation?  I'll have to
try it and see what happens.)

Obviously I have more code browsing to do.  But, because of debugging
w/o -g, it certainly seems to me that minimal symbols can't go away.

>> * One issue that I didn't comment on above is that lookup_symbol has a
>> 'symtab' argument that stores the symtab in which the symbol is
>> found.  If we have to support that, that obviously affects the data
>> structures involved.  But it seems to me that we _don't_ have to
>> support that: I did a cursory look through GDB's sources and, as far
>> as I can tell, that argument is only used by linespec.c's
>> decode_line_1.  So it seems to me that we should remove that
>> argument from lookup_symbol and provide some sort of alternate
>> functionality to satisfy decode_line_1's needs.
>> 

> Oh our favorite function. not. in theory maybe, but tweaking with
> decode_line_1 has always provided quite unexpected interesting
> consequences.

Here's what I was thinking might be reasonable: delete the SYMTAB
argument to lookup_symbol, and add a new function

extern struct symtab *lookup_symtab_symbol (struct symbol *sym, struct block *block);

that returns the symtab where SYM is found if you look for its name in
BLOCK.  (Or, I guess, NULL if the first argument is NULL.)  Then
existing calls like

sym = lookup_symbol (name, block, namespace, is_a_field_of_this, &symtab);

could be converted to the two calls

sym = lookup_symbol (name, block, namespace, is_a_field_of_this);
symtab = lookup_symtab_symbol (sym, block);

And, of course, existing calls like

sym = lookup_symbol (name, block, namespace, is_a_field_of_this, NULL);

could just be replaced by

sym = lookup_symbol (name, block, namespace, is_a_field_of_this);

That way, the semantics of decode_line_1 would remain unchanged, but
if it turned out that lookup_symbol could be made faster when it
doesn't have to identify the symtab in question, then every other use
of lookup_symbol would benefit.
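
As a sanity check on the idea, here is a toy model in C of the
conversion.  None of it is real GDB code: the split_ types are
invented, a "block" is reduced to a list of symtabs, and the
namespace and is_a_field_of_this arguments are left out.  It only
tries to show that the two-call form can give the same answers as the
one-call form with the extra output argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; GDB's real struct symbol, struct symtab and
   struct block look nothing like this.  */
struct split_symbol { const char *name; };

struct split_symtab
{
  const char *filename;
  struct split_symbol *syms;
  size_t nsyms;
};

/* Stand-in for a block: just the symtabs that are searched from it.  */
struct split_block
{
  struct split_symtab *symtabs;
  size_t nsymtabs;
};

/* Old shape: one call that also reports the symtab through *SYMTAB
   (which may be NULL if the caller does not care).  */
static struct split_symbol *
split_lookup_symbol_old (const char *name, const struct split_block *block,
                         struct split_symtab **symtab)
{
  size_t i, j;

  for (i = 0; i < block->nsymtabs; i++)
    for (j = 0; j < block->symtabs[i].nsyms; j++)
      if (strcmp (block->symtabs[i].syms[j].name, name) == 0)
        {
          if (symtab != NULL)
            *symtab = &block->symtabs[i];
          return &block->symtabs[i].syms[j];
        }
  if (symtab != NULL)
    *symtab = NULL;
  return NULL;
}

/* New shape, part one: the lookup no longer mentions symtabs.  In this
   toy it simply reuses the old search.  */
static struct split_symbol *
split_lookup_symbol (const char *name, const struct split_block *block)
{
  return split_lookup_symbol_old (name, block, NULL);
}

/* New shape, part two: the symtab where SYM is found when its name is
   looked up in BLOCK, or NULL if SYM is NULL.  */
static struct split_symtab *
split_lookup_symtab_symbol (const struct split_symbol *sym,
                            const struct split_block *block)
{
  struct split_symtab *symtab = NULL;

  if (sym != NULL)
    split_lookup_symbol_old (sym->name, block, &symtab);
  return symtab;
}

/* Self-check: both call shapes agree on the symbol and the symtab.  */
static int
split_demo (void)
{
  struct split_symbol a_syms[] = { { "main" } };
  struct split_symbol b_syms[] = { { "helper" }, { "limit" } };
  struct split_symtab tabs[2];
  struct split_block block;
  struct split_symtab *old_tab, *new_tab;
  struct split_symbol *old_sym, *new_sym;

  tabs[0].filename = "a.c"; tabs[0].syms = a_syms; tabs[0].nsyms = 1;
  tabs[1].filename = "b.c"; tabs[1].syms = b_syms; tabs[1].nsyms = 2;
  block.symtabs = tabs;
  block.nsymtabs = 2;

  old_sym = split_lookup_symbol_old ("limit", &block, &old_tab);
  new_sym = split_lookup_symbol ("limit", &block);
  new_tab = split_lookup_symtab_symbol (new_sym, &block);
  assert (old_sym == new_sym && old_tab == new_tab);
  assert (new_tab == &tabs[1]);
  assert (split_lookup_symtab_symbol (NULL, &block) == NULL);
  return 0;
}
```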

Though the speed issue is, for me, not the only important issue: I
just get nervous when functions have arguments that are almost never
used.  So I don't like the IS_A_FIELD_OF_THIS argument to
lookup_symbol too much, either.

Anyways, thanks for the feedback, I appreciate it.

David Carlton
carlton@math.stanford.edu

