This is the mail archive of the
gdb@sourceware.cygnus.com
mailing list for the GDB project.
Re: problems with gdb
- To: blizzard at mozilla dot org
- Subject: Re: problems with gdb
- From: Mark Kettenis <kettenis at wins dot uva dot nl>
- Date: Sat, 12 Feb 2000 18:59:22 +0100
- CC: gdb at sourceware dot cygnus dot com
- References: <38A47E89.3F4674B3@mozilla.org>
> Date: Fri, 11 Feb 2000 16:26:33 -0500
> From: Chris Blizzard <blizzard@mozilla.org>
> Hi, folks. I've been talking about some problems that I've been
> suffering through with gdb and mozilla which people on this mailing
> list may or may not be aware of. Jason Molenda suggested that I
> start flushing these out in the open to get some feedback on them.
> I'm interested in getting my hands dirty and trying to get these
> problems fixed. I'm not a debugger hacker though, so I might end up
> asking some silly questions. :)
No problem!
May I ask some questions first? What version of GDB are you using?
What version of GCC are you using?
> Here's the blurb, slightly edited for content.
> ...My problems are mostly related to how well gdb scales to handle
> large shared libraries and large numbers of shared libraries. At
> last count, there were 111 .so files in mozilla, the largest of
> which is about 27 meg with debugging symbols. If you don't use
> "set auto-solib-add 0" in your .gdbinit file, gdb will easily grow
> to over 200 meg in size when starting the debugger. Someone once
> did some estimates and it seems to use 5 times the size of a .so
> after loading a shared library to debug. A lot of times, gdb won't
> be able to load some of the larger .so files. It just hangs.
Let me first say that Mozilla seems to stretch things to the limit.
The huge number of shared libraries that you guys are using has
already uncovered several bugs in the dynamic linker, and the Mozilla
developers have uncovered more than a few bugs in the LinuxThreads
library. That's mostly because you have gone where no one's gone
before :-)
From a quick glance at the output of `ps' on my system when loading a
program that uses about 10 shared libraries, it seems that
the GDB memory usage is approximately equal to the size of the shared
libraries on disk. I guess the "factor 5" estimate is referring to
the space used for debugging symbols as compared to the actual
code-size of the shared library. So it seems that your biggest
problem is the size of your shared libraries and the amount of
debugging information that's generated (which is basically
proportional to the amount of code in the libraries). I think that
using C++ is largely responsible for the `code bloat'. Maybe
an intelligent use of C++ features (check for compiler switches like
-fno-rtti and use them if appropriate) can reduce the size of the
resulting code. Also, playing around with the options that control the
way debugging information is generated might help.
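To check the "roughly equal to the on-disk size" observation against a
real build tree, something like the following sketch could be used. It
only sums the sizes of files ending in `.so' under a directory; the
function names and the `factor' parameter are made up for illustration,
and the result is a crude estimate, not a measurement of what GDB
actually allocates.

```python
import os


def total_solib_size(root):
    """Sum the on-disk size, in bytes, of every regular *.so file under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only count regular files whose name ends in ".so";
            # versioned names such as libfoo.so.1 are deliberately ignored.
            if name.endswith(".so") and os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def estimate_gdb_memory_mb(root, factor=1.0):
    """Very rough estimate in MiB of GDB's memory use for the libraries in root.

    factor=1.0 assumes memory use is about equal to the on-disk size;
    pass a different factor to try other guesses (hypothetical knob).
    """
    return total_solib_size(root) * factor / (1024 * 1024)
```

Calling `estimate_gdb_memory_mb' on the directory that holds the
Mozilla components, and comparing with what `ps' reports for GDB, would
show whether the estimate holds for a build that large.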
In principle the large amounts of debugging info shouldn't be a
problem. GDB can simply mmap the relevant sections, such that only
the debugging info that's really needed is actually paged in. I don't
know how the BFD library (the part of GDB that is responsible for
reading the sections containing debugging info) and the code that
actually interprets this information implement these things. There
might be room for improvement there. Of course if all pages
containing debugging info are touched, you lose :-(.
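To illustrate the mmap idea in isolation, here is a minimal sketch. It
is not how GDB or BFD read debugging info; the function name
`read_section' and its parameters are invented for the example. The
file is mapped read-only and only the requested byte range is touched,
so the operating system only needs to page in the pages that cover
that range.

```python
import mmap


def read_section(path, offset, size):
    """Return `size` bytes starting at `offset`, using a read-only mapping.

    The whole file is mapped (length 0 means "entire file"), but only the
    pages covering the requested slice are actually accessed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Slicing a mapping copies just this byte range out of it.
            return m[offset:offset + size]
```

Reading every section this way still touches every page in the end,
which is the "you lose" case mentioned above.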
> A lot of times, trying to use "step" to step into a C++ method that
> happens to be part of the same class just skips as if you had used
> "next." That means that any time you want to step into a method
> you have to set a temporary breakpoint by name on the method and
> then allow the breakpoint to get you into that method. Doing that
> to step into a dozen or so classes gets a little tedious. This is
> hard to reproduce and I'm trying to build a test case.
It is a known problem that GDB has problems with the debugging output
generated by recent GCC compilers. Help in resolving those problems
would certainly be appreciated, and a (small) test case is really
essential if you want to get somebody else to look into it.
Compiling without optimization might circumvent these problems.
> There are other much-needed features, like being able to
> preload a .so and set a breakpoint in the library before it
> loads. Mozilla is entirely component based and this makes
> debugging very, very difficult. I usually break on _dl_open in
> glibc and wait until my library gets loaded before trying to set
> the breakpoint that I need. That gets pretty bad after 27
> libraries are loaded.
I think that the way GDB looks up symbols differs from the way
the dynamic linker does. That means that overriding symbols in
shared libraries probably doesn't work properly. Since the primary
use of preloaded shared libraries is overriding symbols you're likely
to experience problems. I don't think this problem is easy to solve.
Setting breakpoints in not-yet loaded shared libraries should not be
difficult to implement. Just make sure they start out as
`shlib_disabled' (see breakpoint.h) if the symbol cannot be found. It
is necessary to introduce a new command to do this (suggested name
`shlib-break' or `solib-break'). Reusing the guts of the ordinary
breakpoint setting command should be possible.
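The bookkeeping described above can be sketched as a toy model. This is
not GDB's code: the class names, the `solib_break' and
`on_library_loaded' methods and the symbol table are all hypothetical,
and only the `shlib_disabled' state name is borrowed from breakpoint.h.
A breakpoint on an unknown symbol starts out disabled and is retried
each time a shared library is loaded.

```python
class Breakpoint:
    """A breakpoint on a symbol that may not be loaded yet."""

    def __init__(self, symbol):
        self.symbol = symbol
        self.state = "shlib_disabled"  # stays disabled until the symbol resolves
        self.address = None


class BreakpointTable:
    """Toy model: known symbols plus the breakpoints waiting on them."""

    def __init__(self):
        self.symbols = {}      # symbol name -> address, for loaded libraries
        self.breakpoints = []

    def solib_break(self, symbol):
        """Set a breakpoint even if no loaded library defines the symbol yet."""
        bp = Breakpoint(symbol)
        self.breakpoints.append(bp)
        self._try_resolve(bp)
        return bp

    def on_library_loaded(self, exported):
        """Record a new library's symbols and retry all disabled breakpoints."""
        self.symbols.update(exported)
        for bp in self.breakpoints:
            if bp.state == "shlib_disabled":
                self._try_resolve(bp)

    def _try_resolve(self, bp):
        address = self.symbols.get(bp.symbol)
        if address is not None:
            bp.address = address
            bp.state = "enabled"
```

With this model, a breakpoint set before its library is loaded stays
`shlib_disabled' through unrelated library loads and becomes enabled as
soon as a library exporting the symbol shows up, which would replace
the manual break-on-_dl_open routine.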
> There are also various problems with threads. A lot of times gdb
> won't exit after the last thread exits because it keeps trying to
> kill a process which doesn't exist any more.
This is probably caused by the strange way threads interact with
signals on Linux. It's very likely that the real bug is in the
LinuxThreads library and not in GDB. A lot of LinuxThreads problems
have recently been solved. You might want to try the latest glibc 2.1.3
pre-release. Or have a little patience: glibc 2.1.3 is supposed to be
released real soon now.
Hope this helps, and I'm looking forward to your contributions :-),
Mark