This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [patch] Disassembly improvements


Pedro Alves writes:
 > On 10/11/2013 10:34 PM, Doug Evans wrote:
 > 
 > > This is a specific fix to a general problem.
 > 
 > I don't know that this is a general problem.

The general problem I'm referring to is efficient access of target memory.
[Otherwise we wouldn't have things like the dcache,
trust-readonly, explicit caching support for stack requests, etc.]

 >  It may look like one,
 > but it's not super clear to me.  Yes, we might have a similar problem
 > caused by lots of tiny reads from the target during prologue analysis.
 > But the approach there might be different from the right approach for
 > disassembly, or we could also come to the conclusion the problem
 > there is not exactly the same.
 >
 > > Question: How much more of the general problem can we fix without
 > > having a fix baked into the disassembler?
 > 
 > The disassembly use case is one where GDB is being
 > told by the user "treat this range of addresses that I'll be
 > reading sequentially, as code".  If that happens to trip on some
 > memory mapped registers or some such, then it's garbage-in,
 > garbage-out, it was the user's fault.

Though if gdb doesn't provide a range to constrain the caching,
the caching doesn't come into play in the current version of the patch
(the patch still avoids trying to prefetch too much).
In the case of, e.g., "disas main" gdb does provide a range.
The patch makes "disas main" efficient but doesn't help "x/5i main".
[No claim is made that improving the latter case is necessarily as easy,
but I think there is a case to be made that this patch fixes
a specific case (disas) of a specific case (reading code memory
for disassembly) of a general problem (reading target memory) :-).]

Presumably gdb can use function bounds or something else from the
debug info to constrain the affected memory space for other requests
so those can be sped up too.

"b main" on amd64 is instructive.
The stack align machinery blindly fetches 18 bytes,
and then prologue skipping ignores that and fetches a piece at a time.
And we do that twice (once for main from dwarf, once for main from elf).

(gdb) b main
Sending packet: $m4007b4,12#5d...Packet received: 554889e5be1c094000bf40106000e8e1feff
Sending packet: $m4007b4,1#2b...Packet received: 55
Sending packet: $m4007b5,3#2e...Packet received: 4889e5
Sending packet: $m4007b4,12#5d...Packet received: 554889e5be1c094000bf40106000e8e1feff
Sending packet: $m4007b4,1#2b...Packet received: 55
Sending packet: $m4007b5,3#2e...Packet received: 4889e5
Sending packet: $m4007b8,1#2f...Packet received: be
Breakpoint 1 at 0x4007b8: file hello.cc, line 6.

There's only a handful of calls to gdbarch_skip_prologue.
They could all be updated to employ whatever caching/prefetching
is appropriate.

 > As I mentioned in <https://sourceware.org/ml/gdb-patches/2013-09/msg01013.html>,
 > I'd rather we analyze the use cases independently (I'm not saying
 > they're not the same).
 > 
 > If we find commonalities, we can certainly factor things out and
 > come up with more general abstractions then.

IME this community frowns on such approaches.
I may have a biased data set though, and perhaps the general case is different.
It would be good to get some clarity here.

 > If I were to try one, I think it would be along the lines of
 > a new TARGET_OBJECT_DISASM_MEMORY, and somehow pass more info down
 > the target_xfer interface so that the core memory reading code
 > handles the caching.  Probably, that'd be done with a new pair of
 > 'begin/end code caching' functions that would be called at the
 > appropriate places.  The new code in dis_asm_read_memory would
 > then be pushed to target.c, close to where stack cache is handled.

How hard would it be to do that now?

 > The main point there should be consensus on, is that a caching
 > scheme is a better solution for the disassembly use case, than trusting
 > read-only sections is, for the latter doesn't have the problem with
 > self-modifying code, and, in addition, it also speeds up disassembling
 > when there is _no_ corresponding binary/'text section'.

How often do we see bug reports of slow disassembly when there is no
corresponding binary/text section? Another thing the community
frowns on IME is adding code to fix theoretical performance problems.
[but again, I may have a biased data set]

Plus self-modifying code won't always provide the bounds necessary
to trigger the prefetching this patch does (not all JITs use
gdb's jit interface to register all instances of self-modified code).
"x/10i $random_address" is still slow.

 > I don't think there's any need to hold considering this quite
 > localized fix for slow disassembling as is while we investigate
 > other use cases.  IOW, we should let the code start small, and
 > grow/evolve from there.

I'm certainly willing to agree, but I just want to make sure
we're following established community rules.
Plus, how hard would it be to fix prologue skipping too?
It doesn't seem that hard, and if we go the TARGET_OBJECT_DISASM_MEMORY
route most of this patch will get tossed anyway.

Also, I feel I need to point out that we rejected an early version
of Yao's varobj patch because it used casting to effect baseclass/subclassing.
Perhaps there's less of it here, and there's some minimal amount that we
allow.  But again, I want to make sure we employ the rules consistently.
[And get some clarity. :-)]

