This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.
[rfc] C++ namespaces

From: David Carlton <carlton at math dot stanford dot edu>
To: gdb <gdb at sources dot redhat dot com>
Cc: Elena Zannoni <ezannoni at redhat dot com>, Daniel Jacobowitz <drow at mvista dot com>, Jim Blandy <jimb at redhat dot com>
Date: 18 Feb 2003 14:10:03 -0800
Subject: [rfc] C++ namespaces

I'm ready to start merging large chunks of my C++ namespace work into
mainline GDB.  The code is, in my opinion, useable now (I've been
using it when debugging C++ code for a couple of months); there are
things that still should be added, which I'll outline at the end of
this message, but it's an appropriate time to start the merging
process.


For those of you who aren't familiar with C++, here are the basics of
namespaces.  They're either a weak version of classes or a fancy
version of global scope, depending on your point of view.  A
declaration of the form

namespace X {
  class XClass {
   ...
  };

  int Xvar;

  void Xfunc(int x)
  {
    ...
  }
}

defines new members XClass, Xvar, and Xfunc of the namespace X.  This
differs in two ways from defining XClass, Xvar, and Xfunc as regular
globals:

1) From within Xfunc (or other functions within namespace X), the name
   lookup rules are different, in that names are searched for within X
   before they're searched for in the global namespace.  (This is just
   like how, in a member function of a class, you first try to resolve
   names by seeing if they match other member functions, before
   looking to see if they match global variables.)  This is called
   'namespace scope'.

2) If you're not within namespace X, you can't just use 'XClass' to
   refer to XClass; you have to type 'X::XClass'.  (Thus, referring to
   a member of a namespace is like referring to a static member of a
   class.)
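
The two rules above can be seen in a small, self-contained C++
fragment.  This is only an illustrative sketch: the names value,
from_inside, and from_outside are made up for the example and are not
part of the XClass/Xvar/Xfunc code above.

```cpp
int value = 1;              // a regular global

namespace X {
  int value = 2;            // X::value, a different object from ::value

  int from_inside()
  {
    return value;           // rule 1: X is searched first, so this is X::value
  }
}

int from_outside()
{
  return value;             // outside X, the unqualified name is ::value
}

int qualified_from_outside()
{
  return X::value;          // rule 2: outside X, the X:: qualifier is required
}
```

Here from_inside() returns 2 while from_outside() returns 1, even
though both bodies say 'return value;'.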

Those are the main issues associated with namespaces, though there are
others (anonymous namespaces, using directives and declarations,
Koenig lookup, ...).  It's impossible to avoid namespaces: the entire
C++ standard library lives within namespace 'std'.


The above suggests that we have to modify GDB in two separate ways:

1) We have to teach GDB about namespace scope.  In every place where
   GDB searches global variables and static variables, we should
   consider searching through appropriate namespaces.  (This means
   that we need to know what namespace is in scope at every block!)

2) We have to teach GDB about the :: operator.

Right now, GDB stores variables in namespaces as symbols whose name
includes '::': so 'Xfunc' above would be stored as a symbol whose name
is 'X::Xfunc'.  This is a partial solution to problem 2, but isn't a
solution to problem 1.  I preserve this way of naming symbols in
namespaces, but it's more of an implementation detail: I try to handle
:: directly and correctly to the extent that is possible.
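
To make the limitation concrete, here is a toy model of a symbol table
keyed by qualified name.  It is not GDB code: FlatTable, make_table,
and flat_lookup are hypothetical names invented for this sketch, and
the integer values are placeholders for real symbol data.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a symbol table keyed by the full name,
// including any '::' qualifiers.
using FlatTable = std::map<std::string, int>;

FlatTable make_table()
{
  return {{"X::Xfunc", 1}, {"X::Xvar", 2}, {"main", 3}};
}

// Exact-match lookup: it knows nothing about which namespace is in scope.
bool flat_lookup(const FlatTable &table, const std::string &name)
{
  return table.count(name) != 0;
}
```

With this table, flat_lookup finds "X::Xfunc" (problem 2 is partly
covered), but a lookup of plain "Xvar" fails even if the request comes
from inside Xfunc, where C++ says it should succeed (problem 1).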


One issue that impacts this is that no released version of GCC
generates debugging information related to namespaces.  This means
that we have to be prepared to deduce namespace information from other
sources (e.g. mangled names) if we don't have as much debugging
information as we'd like.  Fortunately, most key uses of namespaces
leave enough footprints around that we can get away with this pretty
well; it's not perfect, though.  I've been testing my work both on a
released version of GCC (usually 3.1) and on a version that has been
modified to output some namespace info (though not as much as I'd
like).  I believe that GCC 3.3 will include some namespace debugging
info and GCC 3.4 will include more; I might be wrong about that.


Here is a rough guide to patches to come:

* Annotate blocks.  For each block (or at least for each function), we
  need to know what namespace is in scope.  We might need to deduce
  this from linkage names if we don't have enough debug info;
  fortunately, that's easy.  There's more information that we'd like
  to add to blocks (using directives/declarations, for example); I've
  added some of that extra info, but not all of it, since I don't have
  access to a version of GCC that supports debug info for all of it.

* Teach lookup_symbol about namespace scope.  We need to replace the
  part of lookup_symbol that looks for static and global variables
  with code to search through the namespaces that are in scope.  (This
  is why I've been cleaning up lookup_symbol: I needed to be able to
  understand the code that searches through static and global
  variables in the first place!)  Also, add a separate function
  lookup_symbol_namespace that searches for a name in a specific
  namespace.

* Generate symbols associated with namespaces.  The reason that we need
  this is that, to correctly look up 'A::x', we first need to look up
  'A', then, if it's a namespace, look up 'x' inside that namespace.
  (Otherwise, we'll get the semantics subtly wrong.)  Here we again
  run into a problem if we don't have enough debug info; we can try to
  deduce the existence of namespaces from linkage names, though there
  are cases where we won't have enough info.  I've done this for
  DWARF-2, though not yet for stabs.

* Actually implement the :: operator.  Here there is a problem that
  the parser is a mess, that parsing C++ correctly is inherently a
  difficult problem, and that we use one parser for a few different
  contexts (either to parse an expression or to parse a type).  I have
  some hacks to deal with this; I'm not completely happy with them.

* Teach symbol lookup functions other than lookup_symbol about
  namespaces.  I've taught the function overloading stuff and
  decode_line_1 about namespaces.  These run into different,
  interesting problems: the parser and function overloading stuff are
  currently structured in such a way as to make it impossible to get
  enough information for GDB to implement C++'s function call
  semantics, and decode_line_1 already has some HP hack to deal with
  '::'.  (Which is why you can type "break X::func" but have to type
  "print 'X::var'" (note the single quotes) currently.)

* Make sure that all symbols associated with objects in namespaces
  actually have their names set correctly.  Currently, we get classes
  and other types wrong.  With correct debug info, we'll be able to
  get all of that right.  With incomplete debug info, we can get
  classes right but not other types.  I have code to handle the
  incomplete debug info case for DWARF-2 (though I haven't yet
  gotten around to handling stabs); I waver as to how much I like it.
  A user did recently send me a private e-mail asking about the status
  of this (or he might have been asking about nested classes, which is
  a similar problem), which suggests that we should try to get it
  right.

* Make the test suite accurately reflect what we get right and what we
  get wrong.  (This should be handled in parallel with the above
  steps.)
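
As a rough illustration of the first two items in the list above
(finding the enclosing scope of a qualified name, then searching the
scopes from innermost to outermost), here is a small sketch.  It is
not the code on my branch and not GDB's API: SymbolNames,
enclosing_scope, and lookup_in_scope are hypothetical names, and the
sketch assumes names with no template arguments or parameter lists.
It is also purely string-based, so it can't tell a namespace prefix
from a class prefix, and it doesn't look up 'A' as a symbol before
looking inside it, which the third item says is needed to get the
semantics right.

```cpp
#include <set>
#include <string>

// Hypothetical: the set of qualified names we have symbols for.
using SymbolNames = std::set<std::string>;

// Scope prefix of a qualified name: "X::Y::f" -> "X::Y", "f" -> "".
// Assumes the name contains no template arguments or parameter list.
std::string enclosing_scope(const std::string &qualified)
{
  std::string::size_type pos = qualified.rfind("::");
  return pos == std::string::npos ? std::string() : qualified.substr(0, pos);
}

// Try scope::name, then each enclosing scope in turn, then the global
// name.  Returns the qualified name that matched, or "" if none did.
std::string lookup_in_scope(const SymbolNames &symbols, std::string scope,
                            const std::string &name)
{
  for (;;)
    {
      std::string candidate = scope.empty() ? name : scope + "::" + name;
      if (symbols.count(candidate) != 0)
        return candidate;
      if (scope.empty())
        return std::string();
      scope = enclosing_scope(scope);
    }
}
```

For a function named "X::Y::f", enclosing_scope gives "X::Y", and
a lookup of "Xvar" from that scope tries "X::Y::Xvar", then "X::Xvar",
then "Xvar".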


That's what I've got implemented on a branch.  Some things that I
haven't implemented yet:

* Support for deducing namespace info from linkage names within
  stabs.  This should be easy; the main reason why I haven't done it
  is that I've studiously avoided learning anything about stabs.

* The parser has minor problems.  Right now, I have some hacks that
  mostly get lookup correct but sometimes give weird error messages
  when lookup correctly fails.  (I.e. if a user refers to a variable
  that doesn't exist, the error message I print is sometimes not at
  all helpful.)  I know how to fix this: I had it working better
  before, it just introduced more reduce/reduce conflicts, and I had
  it drilled into me at a young age that those are a Bad Thing.  But
  if you already have 20 reduce/reduce conflicts, what's one or two
  more?

* The parser has major problems.  There's a lot about the parser that
  has to be rethought.  (Possibly in ways that would affect languages
  other than C and C++.)  The division of responsibility between the
  lexer, the parser, and the evaluator is bizarre, leading to
  incorrect parses and situations where code doesn't have information
  that it needs.

* I haven't implemented namespace aliases, even though I have access
  to a compiler that generates debug info for them.  This should be
  easy.

* I haven't implemented using directives and using declarations.
  (Other than using directives for anonymous namespaces.)  I don't
  have access to a compiler that generates debug info for them (and
  there's no way to fake it without debug info); once I do,
  implementing it should be easy.

* I recently noticed that breakpoints aren't always getting reset
  properly after a file gets reloaded.  I have to look into what the
  code in question is doing; my guess is that it should be possible to
  handle this easily enough once I've cleaned up the interface to
  symbols' linkage names and source code names.
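
For readers who haven't met the features named in the list above,
here is what namespace aliases, using declarations, and using
directives look like in C++ source.  The namespaces Outer and Inner
and the function names are invented for this example.

```cpp
namespace Outer {
  namespace Inner {
    int a = 1;
    int b = 2;
  }
}

namespace OI = Outer::Inner;      // namespace alias: OI names Outer::Inner

int via_alias()
{
  return OI::a;
}

int via_using_declaration()
{
  using Outer::Inner::b;          // using declaration: brings in one name
  return b;
}

int via_using_directive()
{
  using namespace Outer::Inner;   // using directive: all members are visible
  return a + b;
}
```

In each case the debugger would need to know about the alias,
declaration, or directive in order to resolve the unqualified or
aliased name the same way the compiler did.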


I'll send out the first patch in a day or two.  Any suggestions about
this undertaking would be greatly appreciated.

David Carlton
carlton@math.stanford.edu