This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.
Re: suggestion for dictionary representation
On this topic, see the following thread. DanielJ and I had the same discussion
then.
http://sources.redhat.com/ml/gdb-patches/2001-11/msg00130.html
Elena
Jim Blandy writes:
>
> David Carlton <carlton@math.stanford.edu> writes:
> > > I'm tempted to whack the block special case for function arguments. It
> > > may make name lookup a little more complicated but I think it will make
> > > everything clearer. We could, of course, try this on the branch and
> > > see if we like the results :)
> >
> > Would it be reasonable to break up function blocks into two separate
> > blocks: a linear block that only defines the parameters for the
> > function and a non-linear block that contains the actual local
> > variables? Not that I think Jim's scheme is a bad one - I agree that
> > it's better than the current scheme - but given the possibility of
> > local variables shadowing function parameters, it seems to me to be
> > conceptually cleaner to have two separate blocks appear anyways, and
> > it also solves this problem.
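A minimal sketch of the two-block idea, under stated assumptions: `mini_symbol`, `mini_block`, and `lookup` are hypothetical names, not GDB's real structures, and both blocks are scanned linearly here for brevity even though the proposal would make the body block non-linear. It only shows the lookup order: the body block is searched first, then the parameter block it is nested in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down symbol: a name plus a flag saying whether
   it came from the parameter block. */
struct mini_symbol {
  const char *name;
  int is_param;
};

/* Hypothetical block: a symbol array and a link to the enclosing block. */
struct mini_block {
  const struct mini_symbol *syms;
  size_t nsyms;
  const struct mini_block *superblock;
};

/* Search BLOCK and then each enclosing block for NAME.  Because the
   body block is visited before the parameter block, a local with the
   same name as a parameter is found first. */
static const struct mini_symbol *
lookup(const struct mini_block *block, const char *name)
{
  assert(name != NULL);
  for (; block != NULL; block = block->superblock) {
    size_t i;
    for (i = 0; i < block->nsyms; i++)
      if (strcmp(block->syms[i].name, name) == 0)
        return &block->syms[i];
  }
  return NULL;
}
```

With a parameter block holding `x` and `y` and a body block holding its own `x`, looking up `x` from the body returns the local, while `y` falls through to the parameter block.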
>
> The issue is a bit more tangled than you think, I think. Splitting
> the function's body and its formals into two separate blocks is a good
> idea, but it isn't going to get rid of all your duplicates. A single
> formal parameter can have two symbols in a function's block that
> describe it. Try this out on a Pentium. (The `-O2' and `-gstabs+'
> are required.)
>
> $ cat func.c
> #include <stdio.h>
>
> int
> main (int argc, char **argv)
> {
> static int local = 3;
> printf ("%d\n", argc * local);
> }
> $ gcc -O2 -gstabs+ func.c -o func
>
> Then run GDB under GDB, with the inner GDB debugging `func':
>
> (top-gdb) run
> The program being debugged has been started already.
> Start it from the beginning? (y or n) y
>
> Starting program: gdb -nw func
> GNU gdb 2002-09-16-cvs
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i686-pc-linux-gnu"...
> (gdb)
>
> Set a breakpoint in main, just to get the symbols read:
>
> (gdb) break main
> Breakpoint 1 at 0x804834c: file func.c, line 7.
> (gdb)
>
> Drop out to the enclosing GDB:
>
> (gdb) info
> (top-gdb)
>
> It just so happens that `func.c' is the first compilation unit of the
> first executable file in GDB's list:
>
> (top-gdb) print object_files->symtabs->filename
> $177 = 0x82fdbf8 "func.c"
> (top-gdb)
>
> If that's not so for you, you'll need to walk `symtabs' to find the
> right symtab. Anyway, let's check out this symtab's blockvector. I'm
> just using [0] as a postfix dereferencing operator here:
>
> (top-gdb) print object_files->symtabs->blockvector[0]
> $178 = {nblocks = 3, block = {0x82f8b74}}
> (top-gdb)
>
> The first and second blocks are the global and static blocks, so the
> third one is probably for `main':
>
> (top-gdb) print object_files->symtabs->blockvector->block[2]
> $179 = (struct block *) 0x82f8ab4
> (top-gdb) p *$179
> $180 = {startaddr = 134513472, endaddr = 134513513, function = 0x82f8988,
> superblock = 0x82f8ae4, gcc_compile_flag = 2 '\002', hashtable = 0 '\0',
> nsyms = 4, sym = {0x82f89c4}}
> (top-gdb) p *$179->function
> $181 = {ginfo = {name = 0x82f89bc "main", value = {ivalue = 137333428,
> block = 0x82f8ab4,
> bytes = 0x82f8ab4 "@\203\004\bi\203\004\b\210\211/\bä\212/\b\002",
> address = 137333428, chain = 0x82f8ab4}, language_specific = {
> cplus_specific = {demangled_name = 0x0}}, language = language_c,
> section = 11, bfd_section = 0x82d4fc0}, type = 0x82faaa8,
> namespace = VAR_NAMESPACE, aclass = LOC_BLOCK, line = 5, aux_value = {
> basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
> (top-gdb)
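The indexing convention used above (global block first, static block second, function blocks after that) can be sketched as follows. This is only an illustration: `mini_block` and `find_local_block` are hypothetical, the real blockvector carries much more, and this version returns the first matching block rather than the innermost one.

```c
#include <assert.h>
#include <stddef.h>

/* Index convention from the walkthrough: global, static, then the
   function and nested blocks. */
enum { GLOBAL_BLOCK = 0, STATIC_BLOCK = 1, FIRST_LOCAL_BLOCK = 2 };

/* Hypothetical cut-down block: just the PC range it covers. */
struct mini_block {
  unsigned long startaddr;
  unsigned long endaddr;   /* treated here as one past the last byte */
};

/* Return the index of the first local block whose range contains PC,
   or -1 if none does.  The global and static blocks are skipped. */
static int
find_local_block(const struct mini_block *blocks, int nblocks,
                 unsigned long pc)
{
  int i;
  assert(blocks != NULL);
  for (i = FIRST_LOCAL_BLOCK; i < nblocks; i++)
    if (pc >= blocks[i].startaddr && pc < blocks[i].endaddr)
      return i;
  return -1;
}
```

Fed the `startaddr` and `endaddr` printed for `main`'s block, a PC at the breakpoint address 0x804834c would resolve to index 2.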
>
> And it was! Let's look at those four symbols:
>
> (top-gdb) p *$179->sym[0]
> $182 = {ginfo = {name = 0x82f89f8 "argc", value = {ivalue = 8, block = 0x8,
> bytes = 0x8 <Address 0x8 out of bounds>, address = 8, chain = 0x8},
> language_specific = {cplus_specific = {demangled_name = 0x0}},
> language = language_c, section = 0, bfd_section = 0x0}, type = 0x82df828,
> namespace = VAR_NAMESPACE, aclass = LOC_ARG, line = 4, aux_value = {
> basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
> (top-gdb) p *$179->sym[1]
> $183 = {ginfo = {name = 0x82f8a34 "argv", value = {ivalue = 12, block = 0xc,
> bytes = 0xc <Address 0xc out of bounds>, address = 12, chain = 0xc},
> language_specific = {cplus_specific = {demangled_name = 0x0}},
> language = language_c, section = 0, bfd_section = 0x0}, type = 0x82faaf4,
> namespace = VAR_NAMESPACE, aclass = LOC_ARG, line = 4, aux_value = {
> basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
> (top-gdb) p *$179->sym[2]
> $184 = {ginfo = {name = 0x82f8a70 "argc", value = {ivalue = 0, block = 0x0,
> bytes = 0x0, address = 0, chain = 0x0}, language_specific = {
> cplus_specific = {demangled_name = 0x0}}, language = language_c,
> section = 0, bfd_section = 0x0}, type = 0x82df828,
> namespace = VAR_NAMESPACE, aclass = LOC_REGISTER, line = 4, aux_value = {
> basereg = 0}, aliases = 0x0, ranges = 0x0, hash_next = 0x0}
> (top-gdb) p *$179->sym[3]
> $185 = {ginfo = {name = 0x82f8aac "local", value = {ivalue = 134517720,
> block = 0x80493d8, bytes = 0x80493d8 "É\f", address = 134517720,
> chain = 0x80493d8}, language_specific = {cplus_specific = {
> demangled_name = 0x0}}, language = language_c, section = 14,
> bfd_section = 0x0}, type = 0x82df828, namespace = VAR_NAMESPACE,
> aclass = LOC_STATIC, line = 6, aux_value = {basereg = 0}, aliases = 0x0,
> ranges = 0x0, hash_next = 0x0}
> (top-gdb)
>
> Hey! Why are there two entries for argc? (This is the extra tangle I
> was referring to. If you know all about this, you can stop reading
> now.)
>
> The two `argc' symbols have different address classes: one has an
> address class that indicates it's an argument, and the other doesn't.
> The argument symbol describes where the variable is passed on the
> stack (eight bytes above %ebp), whereas the non-argument symbol
> describes where the variable lives during the body of the function:
> register zero, or %eax.
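To make the duplication concrete, here is a small sketch that models the four symbols from the dump above and counts entries per name. The `mini_symbol` struct and `count_by_name` helper are hypothetical and far simpler than GDB's `struct symbol`; only the names, address classes, and values are taken from the session.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the three address classes that appear in the dump. */
enum address_class { LOC_ARG, LOC_REGISTER, LOC_STATIC };

/* Hypothetical, cut-down symbol. */
struct mini_symbol {
  const char *name;
  enum address_class aclass;
  long value;   /* frame offset, register number, or address */
};

/* The four symbols of main's block, as printed in the session. */
static const struct mini_symbol main_syms[] = {
  { "argc",  LOC_ARG,      8 },
  { "argv",  LOC_ARG,      12 },
  { "argc",  LOC_REGISTER, 0 },
  { "local", LOC_STATIC,   0x80493d8 },
};

/* Count how many symbols in a linear block share NAME. */
static size_t
count_by_name(const struct mini_symbol *syms, size_t nsyms,
              const char *name)
{
  size_t i, hits = 0;
  assert(syms != NULL && name != NULL);
  for (i = 0; i < nsyms; i++)
    if (strcmp(syms[i].name, name) == 0)
      hits++;
  return hits;
}
```

Any dictionary keyed on name alone has to cope with `argc` matching twice here, which is the tangle described above.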
>
> As a sanity check, let's look at the IA-32 code for main:
>
> (top-gdb) c
> Continuing.
> (gdb) disass main
> Dump of assembler code for function main:
> 0x8048340 <main>: push %ebp
> 0x8048341 <main+1>: mov %esp,%ebp
> 0x8048343 <main+3>: sub $0x8,%esp
> 0x8048346 <main+6>: mov 0x8(%ebp),%eax
> 0x8048349 <main+9>: and $0xfffffff0,%esp
> 0x804834c <main+12>: mov 0x80493d8,%edx
> 0x8048352 <main+18>: movl $0x80483c8,(%esp,1)
> 0x8048359 <main+25>: imul %edx,%eax
> 0x804835c <main+28>: mov %eax,0x4(%esp,1)
> 0x8048360 <main+32>: call 0x8048268 <printf>
> 0x8048365 <main+37>: mov %ebp,%esp
> 0x8048367 <main+39>: pop %ebp
> 0x8048368 <main+40>: ret
> End of assembler dump.
> (gdb)
>
> So, yes, the compiler did copy `argc' from the stack into %eax.
> Check.
>
> But *why* does GDB do this? I have no idea. It seems to me that,
> with prologue skipping et al, simply having a single LOC_REGPARM would
> be the Right Thing. I don't really know when GDB will prefer the
> argument entry, and when it'll prefer the non-argument entry.
>
> I suspect it's historical. If you look at the stabs, you'll see
> that the compiler actually emits two stabs for an argument that is
> passed in one place but then moved somewhere else:
>
> $ objdump --stabs func
> ...
> 329 FUN 0 5 08048340 12145 main:F(0,1)
> 330 PSYM 0 4 00000008 12157 argc:p(0,1)
> 331 PSYM 0 4 0000000c 12169 argv:p(1,1)=*(7,36)
> 332 SLINE 0 5 00000000 0
> 333 SLINE 0 7 0000000c 0
> 334 SLINE 0 8 00000025 0
> 335 RSYM 0 4 00000000 12189 argc:r(0,1)
> 336 STSYM 0 6 080493d8 12201 local:V(0,1)
> 337 LBRAC 0 0 0000000c 0
> 338 RBRAC 0 0 00000029 0
> 339 FUN 0 0 00000029 0
> ...
> $
>
> The PSYM accounts for the argument symbol, and the RSYM accounts for
> the internal symbol. A lot of GDB's data structures very closely
> match what's provided in STABS. (The partial symbol tables are a
> good example of this: they correspond exactly to the EXCL links.)
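The correspondence between the stab entries and the symbols seen earlier can be sketched as a simple mapping. This is an assumption-laden illustration, not GDB's stabs reader: `stab_kind` and `class_for_stab` are hypothetical names, and only the three stab types from the objdump listing are covered.

```c
#include <assert.h>

/* The three stab types that describe variables in the listing. */
enum stab_kind { STAB_PSYM, STAB_RSYM, STAB_STSYM };

/* The address classes that showed up on the resulting symbols. */
enum address_class { LOC_ARG, LOC_REGISTER, LOC_STATIC };

/* Map one stab to the address class of the symbol it produces,
   matching what the session shows: PSYM -> LOC_ARG,
   RSYM -> LOC_REGISTER, STSYM -> LOC_STATIC. */
static enum address_class
class_for_stab(enum stab_kind kind)
{
  switch (kind) {
  case STAB_PSYM:  return LOC_ARG;
  case STAB_RSYM:  return LOC_REGISTER;
  case STAB_STSYM: return LOC_STATIC;
  }
  assert(!"unknown stab kind");
  return LOC_STATIC;
}
```

Since each stab becomes one symbol, the PSYM and RSYM entries for `argc` yield two symbols with the same name.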
>
> But anyway, all this could be handled much better nowadays using Dwarf
> 2 CFI and location lists. I've been saying that for years, but it
> hasn't happened yet. Andrew has the CFI done now (I think?), and
> Daniel B. has submitted a patch for location expressions (but not
> location lists, tho they would be easy to add), but it's awaiting
> revision while he works on law school.
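As a rough illustration of why location lists would help: a single `argc` symbol could carry a list of PC ranges, each with its own location, instead of two same-named symbols. The sketch below is hypothetical throughout. The types and `locate` are invented for illustration, and the ranges are read off the disassembly by hand, not taken from anything a compiler emitted.

```c
#include <assert.h>
#include <stddef.h>

enum loc_kind { LOC_NONE, LOC_FRAME_OFFSET, LOC_IN_REGISTER };

/* Where a variable lives: an offset from the frame pointer or a
   register number, depending on KIND. */
struct location {
  enum loc_kind kind;
  long value;
};

/* One location-list entry: valid for START <= pc < END. */
struct loc_range {
  unsigned long start;
  unsigned long end;
  struct location loc;
};

/* Illustrative ranges for argc: at 8(%ebp) once the frame pointer
   is set up, then in register 0 (%eax) from after the load until
   the imul overwrites it. */
static const struct loc_range argc_locs[] = {
  { 0x8048343UL, 0x8048349UL, { LOC_FRAME_OFFSET, 8 } },
  { 0x8048349UL, 0x804835cUL, { LOC_IN_REGISTER, 0 } },
};

/* Return the location that applies at PC, or LOC_NONE if the list
   has no entry covering it. */
static struct location
locate(const struct loc_range *list, size_t n, unsigned long pc)
{
  struct location none = { LOC_NONE, 0 };
  size_t i;
  assert(list != NULL);
  for (i = 0; i < n; i++)
    if (pc >= list[i].start && pc < list[i].end)
      return list[i].loc;
  return none;
}
```

With one symbol and a list like this, name lookup returns a single `argc`, and the choice between stack slot and register is made by PC rather than by picking between duplicate symbols.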