This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.



Re: suggestion for dictionary representation


On this topic, see the following thread. DanielJ and I had the same discussion
then.

http://sources.redhat.com/ml/gdb-patches/2001-11/msg00130.html

Elena


Jim Blandy writes:
 > 
 > David Carlton <carlton@math.stanford.edu> writes:
 > > > I'm tempted to whack the block special case for function arguments.  It
 > > > may make name lookup a little more complicated but I think it will make
 > > > everything clearer.  We could, of course, try this on the branch and
 > > > see if we like the results :)
 > > 
 > > Would it be reasonable to break up function blocks into two separate
 > > blocks: a linear block that only defines the parameters for the
 > > function and a non-linear block that contains the actual local
 > > variables?  Not that I think Jim's scheme is a bad one - I agree that
 > > it's better than the current scheme - but given the possibility of
 > > local variables shadowing function parameters, it seems to me to be
 > > conceptually cleaner to have two separate blocks appear anyways, and
 > > it also solves this problem.
 > 
 > The issue is a bit more tangled than you think, I think.  Splitting
 > the function's body and its formals into two separate blocks is a good
 > idea, but it isn't going to get rid of all your duplicates.  A single
 > formal parameter can have two symbols in a function's block that
 > describe it.  Try this out on a Pentium.  (The `-O2' and `-gstabs+'
 > are required.)
 > 
 >   $ cat func.c
 >   #include <stdio.h>
 > 
 >   int
 >   main (int argc, char **argv)
 >   {
 >     static int local = 3;
 >     printf ("%d\n", argc * local);
 >   }
 >   $ gcc -O2 -gstabs+ func.c -o func
 > 
 > Then start up GDB under GDB, debugging `func':
 > 
 >   (top-gdb) run
 >   The program being debugged has been started already.
 >   Start it from the beginning? (y or n) y
 > 
 >   Starting program: gdb -nw func
 >   GNU gdb 2002-09-16-cvs
 >   Copyright 2002 Free Software Foundation, Inc.
 >   GDB is free software, covered by the GNU General Public License, and you are
 >   welcome to change it and/or distribute copies of it under certain conditions.
 >   Type "show copying" to see the conditions.
 >   There is absolutely no warranty for GDB.  Type "show warranty" for details.
 >   This GDB was configured as "i686-pc-linux-gnu"...
 >   (gdb)
 > 
 > Set a breakpoint in main, just to get the symbols read:
 > 
 >   (gdb) break main
 >   Breakpoint 1 at 0x804834c: file func.c, line 7.
 >   (gdb)
 > 
 > Drop out to the enclosing GDB:
 > 
 >   (gdb) info
 >   (top-gdb)
 > 
 > It just so happens that `func.c' is the first compilation unit of the
 > first executable file in GDB's list:
 > 
 >   (top-gdb) print object_files->symtabs->filename
 >   $177 = 0x82fdbf8 "func.c"
 >   (top-gdb)
 > 
 > If that's not so for you, you'll need to walk `symtabs' to find the
 > right symtab.  Anyway, let's check out this symtab's blockvector.  I'm
 > just using [0] as a postfix dereferencing operator here:
 > 
 >   (top-gdb) print object_files->symtabs->blockvector[0]
 >   $178 = {nblocks = 3, block = {0x82f8b74}}
 >   (top-gdb)
 > 
 > The first and second blocks are the global and static blocks, so the
 > third one is probably for `main':
 > 
 >   (top-gdb) print object_files->symtabs->blockvector->block[2]
 >   $179 = (struct block *) 0x82f8ab4
 >   (top-gdb) p *$179
 >   $180 = {startaddr = 134513472, endaddr = 134513513, function = 0x82f8988, 
 >     superblock = 0x82f8ae4, gcc_compile_flag = 2 '\002', hashtable = 0 '\0', 
 >     nsyms = 4, sym = {0x82f89c4}}
 >   (top-gdb) p *$179->function
 >   $181 = {ginfo = {name = 0x82f89bc "main", value = {ivalue = 137333428, 
 >         block = 0x82f8ab4, 
 >         bytes = 0x82f8ab4 "@\203\004\bi\203\004\b\210\211/\bä\212/\b\002", 
 >         address = 137333428, chain = 0x82f8ab4}, language_specific = {
 >         cplus_specific = {demangled_name = 0x0}}, language = language_c, 
 >       section = 11, bfd_section = 0x82d4fc0}, type = 0x82faaa8, 
 >     namespace = VAR_NAMESPACE, aclass = LOC_BLOCK, line = 5, aux_value = {
 >       basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
 >   (top-gdb)
 > 
 > And it was!  Let's look at those four symbols:
 > 
 >   (top-gdb) p *$179->sym[0]
 >   $182 = {ginfo = {name = 0x82f89f8 "argc", value = {ivalue = 8, block = 0x8, 
 >         bytes = 0x8 <Address 0x8 out of bounds>, address = 8, chain = 0x8}, 
 >       language_specific = {cplus_specific = {demangled_name = 0x0}}, 
 >       language = language_c, section = 0, bfd_section = 0x0}, type = 0x82df828,
 >     namespace = VAR_NAMESPACE, aclass = LOC_ARG, line = 4, aux_value = {
 >       basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
 >   (top-gdb) p *$179->sym[1]
 >   $183 = {ginfo = {name = 0x82f8a34 "argv", value = {ivalue = 12, block = 0xc, 
 >         bytes = 0xc <Address 0xc out of bounds>, address = 12, chain = 0xc}, 
 >       language_specific = {cplus_specific = {demangled_name = 0x0}}, 
 >       language = language_c, section = 0, bfd_section = 0x0}, type = 0x82faaf4,
 >     namespace = VAR_NAMESPACE, aclass = LOC_ARG, line = 4, aux_value = {
 >       basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
 >   (top-gdb) p *$179->sym[2]
 >   $184 = {ginfo = {name = 0x82f8a70 "argc", value = {ivalue = 0, block = 0x0, 
 >         bytes = 0x0, address = 0, chain = 0x0}, language_specific = {
 >         cplus_specific = {demangled_name = 0x0}}, language = language_c, 
 >       section = 0, bfd_section = 0x0}, type = 0x82df828, 
 >     namespace = VAR_NAMESPACE, aclass = LOC_REGISTER, line = 4, aux_value = {
 >       basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
 >   (top-gdb) p *$179->sym[3]
 >   $185 = {ginfo = {name = 0x82f8aac "local", value = {ivalue = 134517720, 
 >         block = 0x80493d8, bytes = 0x80493d8 "É\f", address = 134517720, 
 >         chain = 0x80493d8}, language_specific = {cplus_specific = {
 >           demangled_name = 0x0}}, language = language_c, section = 14, 
 >       bfd_section = 0x0}, type = 0x82df828, namespace = VAR_NAMESPACE, 
 >     aclass = LOC_STATIC, line = 6, aux_value = {basereg = 0}, aliases = 0x0, 
 >     ranges = 0x0, hash_next = 0x0}
 >   (top-gdb) 
 > 
 > Hey!  Why are there two entries for argc?  (This is the extra tangle I
 > was referring to.  If you know all about this, you can stop reading
 > now.)
 > 
 > The two `argc' symbols have different address classes: one has an
 > address class that indicates it's an argument, and the other doesn't.
 > The argument symbol describes where the variable is passed on the
 > stack (eight bytes after %ebp), whereas the non-argument symbol
 > describes where the variable lives in the block of the function:
 > register zero, or %eax.
 > 
 > As a sanity check, let's look at the IA-32 code for main:
 > 
 >     (top-gdb) c
 >     Continuing.
 >     (gdb) disass main
 >     Dump of assembler code for function main:
 >     0x8048340 <main>:	push   %ebp
 >     0x8048341 <main+1>:	mov    %esp,%ebp
 >     0x8048343 <main+3>:	sub    $0x8,%esp
 >     0x8048346 <main+6>:	mov    0x8(%ebp),%eax
 >     0x8048349 <main+9>:	and    $0xfffffff0,%esp
 >     0x804834c <main+12>:	mov    0x80493d8,%edx
 >     0x8048352 <main+18>:	movl   $0x80483c8,(%esp,1)
 >     0x8048359 <main+25>:	imul   %edx,%eax
 >     0x804835c <main+28>:	mov    %eax,0x4(%esp,1)
 >     0x8048360 <main+32>:	call   0x8048268 <printf>
 >     0x8048365 <main+37>:	mov    %ebp,%esp
 >     0x8048367 <main+39>:	pop    %ebp
 >     0x8048368 <main+40>:	ret    
 >     End of assembler dump.
 >     (gdb) 
 > 
 > So, yes, the compiler did copy `argc' from the stack into %eax.
 > Check.
 > 
 > But *why* does GDB do this?  I have no idea.  It seems to me that,
 > with prologue skipping et al, simply having a single LOC_REGPARM would
 > be the Right Thing.  I don't really know when GDB will prefer the
 > argument entry, and when it'll prefer the non-argument entry.
 > 
 > I suspect it's historical.  If you look at the stabs spec, you'll see
 > that the compiler actually emits two stabs for arguments that are
 > passed in one place, but get moved somewhere else:
 > 
 >   $ objdump --stabs func
 >   ...
 >   329    FUN    0      5      08048340 12145  main:F(0,1)
 >   330    PSYM   0      4      00000008 12157  argc:p(0,1)
 >   331    PSYM   0      4      0000000c 12169  argv:p(1,1)=*(7,36)
 >   332    SLINE  0      5      00000000 0      
 >   333    SLINE  0      7      0000000c 0      
 >   334    SLINE  0      8      00000025 0      
 >   335    RSYM   0      4      00000000 12189  argc:r(0,1)
 >   336    STSYM  0      6      080493d8 12201  local:V(0,1)
 >   337    LBRAC  0      0      0000000c 0      
 >   338    RBRAC  0      0      00000029 0      
 >   339    FUN    0      0      00000029 0      
 >   ...
 >   $ 
 > 
 > The PSYM accounts for the argument symbol, and the RSYM accounts for
 > the internal symbol.  A lot of GDB's data structures very closely
 > match what's provided in STABS.  (The partial symbol tables are a
 > good example of this: they correspond exactly to the EXCL links.)
 > 
 > But anyway, all this could be handled much better nowadays using Dwarf
 > 2 CFI and location lists.  I've been saying that for years, but it
 > hasn't happened yet.  Andrew has the CFI done now (I think?), and
 > Daniel B. has submitted a patch for location expressions (but not
 > location lists, tho they would be easy to add), but it's awaiting
 > revision while he works on law school.
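
For anyone who wants to see the duplicate-name problem without running GDB
on GDB, here is a minimal sketch in C.  It is not GDB's code: `toy_symbol',
`toy_aclass', `count_named' and `find_named_class' are hypothetical names
made up for illustration, and the struct keeps only the name, address class
and value fields.  The table copies the four symbols from the dump of main's
block quoted above.  It shows that a lookup by name alone matches two
entries for `argc', so a caller has to pick by address class as well.

```c
#include <stddef.h>
#include <string.h>

/* Address classes seen in the dump above; a toy subset, not GDB's enum.  */
enum toy_aclass { LOC_STATIC, LOC_REGISTER, LOC_ARG };

/* Hypothetical, simplified stand-in for a symbol table entry.  */
struct toy_symbol
{
  const char *name;
  enum toy_aclass aclass;
  long value;			/* frame offset, register number, or address */
};

/* The four symbols of main's block, in the order the dump shows them.  */
const struct toy_symbol main_block[] = {
  { "argc",  LOC_ARG,      8 },		/* from the PSYM stab */
  { "argv",  LOC_ARG,      12 },
  { "argc",  LOC_REGISTER, 0 },		/* from the RSYM stab */
  { "local", LOC_STATIC,   0x80493d8 },
};
const size_t main_nsyms = sizeof main_block / sizeof main_block[0];

/* Count how many entries of a linear block carry the given name.  */
size_t
count_named (const struct toy_symbol *syms, size_t nsyms, const char *name)
{
  size_t i, hits = 0;

  for (i = 0; i < nsyms; i++)
    if (strcmp (syms[i].name, name) == 0)
      hits++;
  return hits;
}

/* Return the first entry matching both name and address class, or NULL.  */
const struct toy_symbol *
find_named_class (const struct toy_symbol *syms, size_t nsyms,
		  const char *name, enum toy_aclass aclass)
{
  size_t i;

  for (i = 0; i < nsyms; i++)
    if (syms[i].aclass == aclass && strcmp (syms[i].name, name) == 0)
      return &syms[i];
  return NULL;
}
```

Calling count_named (main_block, main_nsyms, "argc") counts both entries,
while find_named_class with LOC_ARG or LOC_REGISTER picks out one of them.
Which of the two GDB itself prefers in a given context is exactly the open
question in Jim's message; the sketch takes no position on that.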

