This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: struct environment


On Fri, Sep 06, 2002 at 10:20:38AM -0700, David Carlton wrote:
> Daniel Jacobowitz <drow@mvista.com> writes:
> > On Thu, Sep 05, 2002 at 01:50:40PM -0700, David Carlton wrote:
> 
> >> So I want to add a type 'struct environment' to GDB.  It will be
> >> completely opaque, with a variety of internal implementations
> >> available (similar to the way that 'struct block' allows you to use
> >> either a hash table or a simple list of symbols) sharing a common
> >> external interface.  I want to start by converting 'struct block' over
> >> to using this, then converting the global environment over to using
> >> this, then (probably) 'struct type', and finally getting namespaces to
> >> work well.
> 
> > Why do you want to have multiple available implementations?  I think
> > the overhead for always hashing is small enough.  There are only two
> > reasons struct block allows hash tables or linked lists:
> >   - overloading of the meaning of that list to represent a list of
> >     function arguments, which is ordered
> >   - warts in mdebugread that I was not patient enough to overcome when
> >     I finally merged in hash table support
> 
> > I suppose the first reason is a legitimate one for multiple
> > implementations; we could mark an environment as 'ordered'.  Or we
> > could stop overloading the meaning of the list that way.  I don't know
> > which is better.
> 
> The global block in jv-lang.c may also pose similar problems to those
> in mdebugread.c, for what that's worth.  (I'm not sure: it might be
> cleaner than mdebugread.c.)  But you've basically put your finger on
> it: some blocks currently have a reason to be ordered, and some pieces
> of code try to construct the environments on the fly, growing them as
> necessary.  Which, incidentally, I don't think we should discourage:
> instead of requiring code to calculate the size of environments before
> building them, we might as well support environments that can be built
> incrementally and then finalized.  That way the common bookkeeping
> code gets moved into the environment stuff, so people don't have to
> keep on redoing it.

The question is whether those partial environments should be treated as
environments as all - do we even need to do lookups on them?  I believe
we never should.

> And there's another, even more important reason: the global
> environment (and, in the future, namespaces).  The global environment
> spans multiple files, uses lazy evaluation (i.e. partial_symbols), and
> there's even minimal_symbols to shoehorn in there somehow.  So it's
> going to need its own special implementation (which will be much more
> complicated than the implementations for blocks).

This is why I don't like the environment == list-of-symbols thing.  An
environment may HAVE a list of symbols, but it is not its list of
symbols.  You shouldn't grow the list of symbols in the global
environment when a new file is added.  Instead you should associate a
new list of symbols with it.  Files can be removed as well as added,
remember.  We don't do that very well right now.

Search the archives of this list for:
  Date: 11 Jun 2001 12:55:42 -0400
  From: Daniel Berlin <dan@cgsoftware.com>
  Subject: [RFC] Symbol table improvements

for some other suggested reading on this topic.  I like this
architecture he describes, even if we're not quite ready for it yet.
We could be.

> At some point in the future, it might be possible to use a single
> implementation for everything; it would have to support:
> 
> * Growing as needed
> * Fast lookups
> * Being able to retrieve all entries in the order in which they were
>   added
> * Whatever extra cruft is necessary to support environments like the
>   global environment.
> 
> This is certainly possible (heaps solve the first two problems, for
> example), but I think moving over to a single implementation can and
> should be postponed until much later in the game.  It'll be easy
> enough to do, if desirable, once existing code gets converted over to
> using environments.

I don't see the benefit of maintaining order-added information in the
general case; we want it only for specific small lists.

> > I'm not sure if pointers to symbol entries are saved in the process
> > of constructing the list, which would make that hard; hopefully not.
> 
> > There is no reason to be concerned about performance of the finalize
> > method in this case.  If someone is concerned they can fix the way
> > mdebugread generates symbol lists so that it uses the same hash tables
> > everyone else does.
> 
> >> * There needs to be a way to add symbols to an environment while
> >> constructing it.  Maybe
> >> 
> >> void env_add_symbol(struct environment *, struct symbol *);
> 
> > Does there?  Right now most lists are constructed first into a pending
> > list, and only then hashed into an environment.
> 
> I was thinking of env_add_symbol as the way to hash the symbols into
> the environment.  I'll look at how the pending lists work; if all the
> code uses the same data structure, then of course there should be a
> constructor that takes the entire pending list rather than requiring
> each symbol to be added individually.

Look at buildsym.c:add_symbol_to_list, find_symbol_in_list, and
finish_block.

> > I'd rather not have to use growable hash tables.
> 
> Indeed.  On the other hand, as I mentioned above, it seems plausible
> to me that the responsibility for managing the lists of pending
> symbols should, in the future, be tranferred to environments rather
> than the code that builds the blocks.  So it seems plausible to me to

It already is centralized in buildsym, for all but the crufty readers.

> have a linked list implementation of environments that is great for
> adding new symbols and lousy for lookups, and then have finalize
> functions that convert linked list implementations to either hash
> tables or arrays (depending on whether or not you needed ordering),
> which could no longer grow.

We've pretty much got this already.  That's finish_block; you just
described its purpose.

> >> * There should be a shallow lookup function, i.e. one that looks up a
> >> symbol just in one environment, without going up to a parent
> >> environment.  (The latter will get added in once we get to
> >> converting over the global environment.)  Maybe
> >> 
> >> struct symbol *env_lookup_shallow(struct environment *,
> >> const char *);
> 
> > You may need more information than this.  Sometimes
> > lookup_block_symbol needs to look up the symbol (i.e. demangled)
> > which has a particular mangled name - this is a wart and not one I'm
> > fond of but it isn't quite ready to die yet.
> 
> I think that I'm tentatively planning to defer issues like mangled
> vs. demangled names until I convert over the global environment: that
> will be the time to think about exactly what sorts of deep lookup
> functions will be necessary, for example.  Having said that, I just
> looked at lookup_block_symbol more closely, and I'm not sure that my
> planned iterators would be quite enough to handle it.  (But I might be
> able to delete the relevant code, see below.)

Unlikely.  The case I'm talking about is tied to some of our C++
evilness; there are multiple global symbols with the same demangled
name, and it is vital that we know how to breakpoint the correct one,
or the breakpoint will not be hit.

That said this needs cleanups elsewhere.  We might be able to handle it
some other way...

> Right, that's what I was thinking of: this could replace that easily
> enough as long as there weren't two copies of ALL_BLOCK_SYMBOLS
> running on the same block simultaneously, which seems plausible to
> me.

Probably.  And you could add assertions to ensure this.

> > There are a couple of places where I could not use it for some
> > reason, though.
> 
> >> * It's vaguely possible that some code might need to count the number
> >> of symbols in an environment, but I don't think so; presumably
> >> that's only used by iterators or by code that could be moved to the
> >> inside of struct environment.
> >> 
> >> * When scanning GDB for uses of 'struct block', I noticed that some
> >> code wants to sort symbols in the block.  I'll have to look at where
> >> and why that's happening, but I don't think it's anything serious.
> >> (When it's necessary, it can presumably be added to a 'finalize'
> >> function as in the discussion of constructors above.)
> 
> > It's irrelevant - it was only for speeding up searching.  Note that
> > hashed blocks are never sorted and blocks with a function are never
> > sorted.  mdebugread-style blocks are simply legacy.
> 
> Great.  That probably means that I can just skip the sorting and
> delete code in functions like lookup_block_symbols that handles sorted
> linear blocks as a special case.

Eventually, I hope so.  Yes.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]