This is the mail archive of the
gdb@sourceware.org
mailing list for the GDB project.
Re: Multi-threaded dwarf parsing
- From: Simon Marchi <simon dot marchi at polymtl dot ca>
- To: Tom Tromey <tom at tromey dot com>
- Cc: Pedro Alves <palves at redhat dot com>, gdb at sourceware dot org
- Date: Wed, 24 Feb 2016 11:43:03 -0500
- Subject: Re: Multi-threaded dwarf parsing
- References: <2c38d5c574de28faa9fc94fe4ed17d45 at simark dot ca> <56CD8EC0 dot 3010304 at redhat dot com> <87lh6a6s8s dot fsf at tromey dot com>
On 2016-02-24 10:30, Tom Tromey wrote:
It's been a while since I thought about that branch.
I think it helps some scenarios, but maybe not as many as you'd like.
In fact, I think it doesn't help two of the three most typical ways
I debug Firefox. (I realize this may not apply directly to your idea of
reading each CU independently; this is just the state of that branch.)
1. Run Firefox, then attach.
Here it is pretty normal for the attach to interrupt Firefox
somewhere in libxul.so -- the largest library (so much larger that it
is the only one that causes a noticeable pause at gdb startup).
But, it seems to me that stopping somewhere in libxul.so should
probably cause its debuginfo to be read.
2. Start gdb, set a breakpoint, then run Firefox.
Here debuginfo for every library must be read in order to set the
breakpoint correctly.
The third scenario, which would be helped, is:
3. Start gdb, run Firefox, and try to reproduce a crash. In this
situation gdb could read the debuginfo in the background and
everything would work nicely.
That said, I think my branch might have helped a tiny bit with scenario
#1, because it prioritized the largest files when reading debuginfo.
So, libxul.so would generally be read a bit earlier than it is now.
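A minimal sketch of the largest-first ordering mentioned above, not code from the branch. The `pending_objfile` struct, its `debuginfo_size` field, and `read_order` are hypothetical names, and the sketch assumes the debuginfo size of each file is known before reading starts.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an objfile whose debuginfo is still unread.
struct pending_objfile
{
  std::string name;
  std::size_t debuginfo_size;  // bytes of debug info (assumed known up front)
};

// Return the file names in the order they would be handed to a reader:
// largest debuginfo first, ties kept in their original order.
std::vector<std::string>
read_order (std::vector<pending_objfile> files)
{
  std::stable_sort (files.begin (), files.end (),
                    [] (const pending_objfile &a, const pending_objfile &b)
                    { return a.debuginfo_size > b.debuginfo_size; });

  std::vector<std::string> names;
  for (const pending_objfile &f : files)
    names.push_back (f.name);
  return names;
}
```

With this ordering a file like libxul.so goes to the front of the queue, so its read starts as early as possible, though the read itself takes just as long.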
Reading each CU independently seems like a good idea to me. I think it
will stumble into various problems inside gdb, but I'd guess they are
all surmountable with enough work.
Indeed, we probably had different, but not incompatible, ideas of
"threaded".
Just to make sure I understand correctly: instead of blocking on the psymtabs
creation at startup (in elf_symfile_read), you offload that to worker threads
and carry on. If you happen to need the information and it's not ready yet,
then the main code will have to block until the corresponding task is complete
(dwarf2_require_psymtabs). However, in each worker thread, each objfile is
still processed sequentially. So if you are waiting for libxul.so's debug info
to be ready (such as in #1), it won't be ready any faster. Is that right?
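The "offload and block only when needed" pattern described above can be sketched with standard C++ futures. This is an illustration under stated assumptions, not gdb's code: `background_psymtabs`, `scan_objfile`, and `psymtab_list` are hypothetical names, and `require ()` only plays the role the message attributes to dwarf2_require_psymtabs.

```cpp
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the partial symbols of one objfile.
using psymtab_list = std::vector<std::string>;

// Hypothetical stand-in for the expensive, sequential scan of one objfile.
psymtab_list
scan_objfile (const std::string &objfile_name)
{
  return { objfile_name + ":cu0", objfile_name + ":cu1" };
}

class background_psymtabs
{
public:
  // Start the scan on a worker thread and return immediately.
  explicit background_psymtabs (std::string objfile_name)
    : m_result (std::async (std::launch::async, scan_objfile,
                            std::move (objfile_name)).share ())
  {}

  // Block until the worker has finished; returns at once if it already has.
  const psymtab_list &require () const
  { return m_result.get (); }

private:
  std::shared_future<psymtab_list> m_result;
};
```

The caller reaches a prompt without waiting, but the first `require ()` on a large objfile still waits for the whole sequential scan, which is the limitation raised for scenario #1.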
My view of the parallelism was that when reading an objfile's debug info, the
main thread would offload chunks of work (a chunk == a CU) to the worker
threads, but wait for all of them to be done before continuing. So it would
still be blocking on the psymtab creation, but it would block for a shorter
time (divided by the number of threads/cores, in an ideal world). It's just
replacing a serial algorithm by a parallel one, but it would be mostly
transparent to the rest of gdb.
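A rough sketch of that fork/join scheme, one chunk per CU, assuming each CU can be scanned without touching shared state (the hard part in real gdb, given the globals). `comp_unit`, `psymtab`, `scan_cu`, and `build_psymtabs_parallel` are hypothetical names for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

struct comp_unit { std::string name; };   // hypothetical CU descriptor
struct psymtab { std::string cu_name; };  // hypothetical per-CU result

// Hypothetical stand-in for scanning one CU; assumed to use no shared state.
psymtab
scan_cu (const comp_unit &cu)
{
  return psymtab { cu.name };
}

std::vector<psymtab>
build_psymtabs_parallel (const std::vector<comp_unit> &cus, unsigned n_workers)
{
  if (n_workers == 0)
    n_workers = 1;

  // One output slot per CU, so no two workers write to the same element.
  std::vector<psymtab> result (cus.size ());
  std::vector<std::thread> workers;

  for (unsigned w = 0; w < n_workers; ++w)
    workers.emplace_back ([&cus, &result, w, n_workers] ()
      {
        // Worker w handles CUs w, w + n_workers, w + 2 * n_workers, ...
        for (std::size_t i = w; i < cus.size (); i += n_workers)
          result[i] = scan_cu (cus[i]);
      });

  // The caller still blocks here until every CU has been scanned.
  for (std::thread &t : workers)
    t.join ();

  return result;
}
```

Because the function returns only after every worker has joined, callers see the same blocking interface as a serial reader, which is what would keep the change mostly transparent to the rest of the program.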
I hadn't thought of reading the info in the background, but I like the fact
that it can get the user to a prompt faster. And I think these two forms of
parallelism are not mutually exclusive; we could very well read CUs in
parallel, in the background.
I think this could help with scenario #1. The ideal situation here
would be to read just the CU (or CUs?) covering the stop address; then
lazily read more as needed for types and such.
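Reading only the CU that covers the stop address presupposes a cheap address-to-CU lookup. A minimal sketch, assuming a table of sorted, non-overlapping address ranges is available before any CU is fully read (DWARF's .debug_aranges section is meant to supply this kind of mapping when the producer emits it). `cu_range` and `find_cu_for_pc` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct cu_range
{
  std::uint64_t low;   // first address covered (inclusive)
  std::uint64_t high;  // one past the last address covered (exclusive)
  int cu_index;        // which CU owns this range
};

// RANGES must be sorted by `low` and must not overlap (an assumption).
// Returns the index of the CU covering PC, or -1 if no range covers it.
int
find_cu_for_pc (const std::vector<cu_range> &ranges, std::uint64_t pc)
{
  // Find the first range that starts after PC; the candidate is the one
  // just before it.
  auto it = std::upper_bound (ranges.begin (), ranges.end (), pc,
                              [] (std::uint64_t addr, const cu_range &r)
                              { return addr < r.low; });
  if (it == ranges.begin ())
    return -1;
  --it;
  return pc < it->high ? it->cu_index : -1;
}
```

Only the CU returned here would need to be read right away; the others could be read lazily or in the background.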
I suppose it could also help #2 if enough parallelism is there to be
had, though I'm a bit skeptical.
I think that reading CUs in parallel would help pretty much any use case where
you are waiting for psymtabs to be created, since it could reduce that wait
time.
So, in a word, are there any gotchas or good reasons not to take this path?
Pedro> The obvious gotchas are of course all the globals, and coming up with
Pedro> fine enough locking granularity that threads actually do run in parallel.
I think the gotcha situation got worse since I wrote my patch.
Now the DWARF reader can call into the type-printing system, which it
didn't before. It wasn't clear to me that this was safe. ISTR there
was some other change along these lines -- the DWARF reader calling out
to some gdb module that it previously did not -- but I can't remember
what it was any more.
The DWARF reader also has many more modes (debug_types, dwz, dwo/dwp)
than it did back then. So, this will require some careful auditing.
Yes, I'm sure the reality is way more complicated than the image I have
in my head at the moment :).
FWIW my threading patches were written during my time at Red Hat and so
you can use any part of that series without needing any paperwork from
me.
Great, thanks!
Simon