This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



Re: Multi-threaded dwarf parsing


On 2016-02-24 10:30, Tom Tromey wrote:
It's been a while since I thought about that branch.

I think it helps some scenarios, but maybe not as many as you'd like.
In fact, I think it doesn't help two of the three most typical ways
I debug Firefox. (I realize this may not apply directly to your idea of
reading each CU independently; this is just the state of that branch.)

1. Run Firefox, then attach.

   Here it is pretty normal for the attach to interrupt Firefox
   somewhere in libxul.so -- the largest library (so much larger that it
   is the only one that causes a noticeable pause at gdb startup).

   But, it seems to me that stopping somewhere in libxul.so should
   probably cause its debuginfo to be read.

2. Start gdb, set a breakpoint, then run Firefox.

   Here debuginfo for every library must be read in order to set the
   breakpoint correctly.


The third scenario, which would be helped, is:

3. Start gdb, run Firefox, and try to reproduce a crash.  In this
   situation gdb could read the debuginfo in the background and
   everything would work nicely.


That said, I think my branch might have helped a tiny bit with scenario
#1, because it prioritized the largest files when reading debuginfo.
So, libxul.so would generally be read a bit earlier than it is now.

Reading each CU independently seems like a good idea to me.  I think it
will stumble into various problems inside gdb, but I'd guess they are
all surmountable with enough work.

Indeed, we probably had different, but not incompatible ideas of "threaded". Just to make sure I understand correctly: instead of blocking on the psymtabs creation at startup (in elf_symfile_read), you offload that to worker threads and carry on. If you happen to need the information and it's not ready yet, then the main code will have to block until the corresponding task is complete (dwarf2_require_psymtabs). However, in each worker thread, each objfile is still processed sequentially. So if you are waiting for libxul.so's debug info to be ready (such as in #1), it won't be ready any faster. Is that right?
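To make the "offload, carry on, block only when needed" model concrete, here is a minimal sketch in Python rather than gdb's own code. `LazyObjfile` and `fake_build_psymtabs` are hypothetical names made up for illustration; only the roles of `elf_symfile_read` and `dwarf2_require_psymtabs` come from the description above.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_build_psymtabs(objfile_name):
    # Stand-in for the real (expensive) partial symbol table scan.
    return [objfile_name + ":psymtab"]


class LazyObjfile:
    """Hypothetical objfile whose psymtabs are built in the background."""

    def __init__(self, name, pool):
        self.name = name
        # Plays the role of elf_symfile_read: queue the work and return at once.
        self._future = pool.submit(fake_build_psymtabs, name)

    def require_psymtabs(self):
        # Plays the role of dwarf2_require_psymtabs: block only if the
        # background task has not finished yet.
        return self._future.result()


if __name__ == "__main__":
    pool = ThreadPoolExecutor(max_workers=2)
    libxul = LazyObjfile("libxul.so", pool)
    # ... the main thread could reach the prompt here ...
    print(libxul.require_psymtabs())
    pool.shutdown()
```

Each objfile is still one task, so in this sketch a caller waiting on libxul.so waits for the whole sequential scan of that file.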

My view of the parallelism was that when reading an objfile's debug info, the
main thread would offload chunks of work (a chunk == a CU) to the worker
threads, but wait for all of them to be done before continuing. So it would
still be blocking on the psymtab creation, but it would block for a shorter
time (divided by the number of threads/cores, in an ideal world). It's just
replacing a serial algorithm by a parallel one, but it would be mostly
transparent to the rest of gdb.
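As a rough illustration of that fork/join shape, here is a sketch in Python, not gdb code. `scan_cu`, `build_psymtabs_parallel` and the dictionary layout of a CU are assumptions invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_cu(cu):
    # Stand-in for scanning one compilation unit into a partial symtab.
    return (cu["offset"], sorted(cu["symbols"]))


def build_psymtabs_parallel(cus, workers=4):
    # One task per CU; the caller still blocks until every CU is done,
    # so the rest of the program sees the same result as a serial loop.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scan_cu, cus))  # map() preserves input order


if __name__ == "__main__":
    cus = [
        {"offset": 0, "symbols": ["b", "a"]},
        {"offset": 64, "symbols": ["main"]},
    ]
    print(build_psymtabs_parallel(cus))
```

This shows the structure only: in standard CPython the global interpreter lock keeps CPU-bound threads from running truly in parallel, so the "divided by the number of cores" speedup would need native threads, as it would in gdb itself.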

I hadn't thought of reading the info in the background, but I like the fact that it can get the user to a prompt faster. And I think these two forms of parallelism are not mutually exclusive: we could very well read CUs in parallel, in the background.

I think this could help with scenario #1.  The ideal situation here
would be to read just the CU (or CUs?) covering the stop address; then
lazily read more as needed for types and such.
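A sketch of what "read just the CU covering the stop address" could look like, assuming some address-to-CU range table is available up front (for instance one built from a section such as .debug_aranges). This is Python for illustration only; `CuIndex` and its fields are hypothetical and do not correspond to gdb data structures.

```python
import bisect


class CuIndex:
    """Hypothetical map from address ranges to CUs, reading CUs on demand."""

    def __init__(self, ranges):
        # ranges: non-overlapping (low_pc, high_pc, cu_name) tuples,
        # with high_pc exclusive.
        self._ranges = sorted(ranges)
        self._lows = [low for low, _high, _name in self._ranges]
        self.loaded = set()  # CUs that have actually been read so far

    def cu_for_pc(self, pc):
        # Find the last range starting at or below pc, then check its end.
        i = bisect.bisect_right(self._lows, pc) - 1
        if i >= 0:
            low, high, name = self._ranges[i]
            if low <= pc < high:
                self.loaded.add(name)  # only now would this CU be read
                return name
        return None


if __name__ == "__main__":
    index = CuIndex([(0x1000, 0x2000, "a.c"), (0x2000, 0x3000, "b.c")])
    print(index.cu_for_pc(0x2ABC), index.loaded)
```

Everything else (types, other CUs) would stay unread until something asks for it.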

I suppose it could also help #2 if enough parallelism is there to be
had, though I'm a bit skeptical.

I think that reading CUs in parallel would help pretty much any use case where you are waiting for psymtabs to be created, since it could reduce that wait time.

So, in a word, are there any gotchas or good reasons not to take this
path?

Pedro> The obvious gotchas are of course all the globals, and coming up with
Pedro> fine enough locking granularity that threads actually do run in parallel.

I think the gotcha situation got worse since I wrote my patch.

Now the DWARF reader can call into the type-printing system, which it
didn't before.  It wasn't clear to me that this was safe.  ISTR there
was some other change along these lines -- the DWARF reader calling out
to some gdb module that it previously did not -- but I can't remember
what it was any more.

The DWARF reader also has many more modes (debug_types, dwz, dwo/dwp)
than it did back then.  So, this will require some careful auditing.

Yes, I'm sure the reality is way more complicated than the image I have
in my head at the moment :).

FWIW my threading patches were written during my time at Red Hat and so
you can use any part of that series without needing any paperwork from
me.

Great, thanks!

Simon

