This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: RFA: general prologue analysis framework
- From: Eli Zaretskii <eliz at gnu dot org>
- To: Jim Blandy <jimb at redhat dot com>
- Cc: gdb-patches at sourceware dot org
- Date: Sat, 15 Oct 2005 14:12:01 +0200
- Subject: Re: RFA: general prologue analysis framework
- References: <m3y8568as0.fsf@alligator.red-bean.com>
- Reply-to: Eli Zaretskii <eliz at gnu dot org>
> From: Jim Blandy <jimb@redhat.com>
> Date: Thu, 06 Oct 2005 16:51:11 -0700
>
> + /* When we analyze a prologue, we're really doing 'abstract
> + interpretation' or 'pseudo-evaluation': running the function's code
> + in simulation, but using conservative approximations of the values
> + it would have when it actually runs. For example, if our function
> + starts with the instruction:
> +
> + addi r1, 42 # add 42 to r1
> +
> + we don't know exactly what value will be in r1 after executing this
> + instruction, but we do know it'll be 42 greater than its original
> + value.
> +
> + If we then see an instruction like:
> +
> + addi r1, 22 # add 22 to r1
> +
> + we still don't know what r1's value is, but again, we can say it is
> + now 64 greater than its original value.
> +
> + If the next instruction were:
> +
> + mov r2, r1 # set r2 to r1's value
> +
> + then we can say that r2's value is now the original value of r1
> + plus 64.
> +
> + It's common for prologues to save registers on the stack, so we'll
> + need to track the values of stack frame slots, as well as the
> + registers. So after an instruction like this:
> +
> + mov (fp+4), r2
> +
> + we'd know that the stack slot four bytes above the frame
> + pointer holds the original value of r1 plus 64.
> +
> + And so on.
> +
> + Of course, this can only go so far before it gets unreasonable. If
> + we wanted to be able to say anything about the value of r1 after
> + the instruction:
> +
> + xor r1, r3 # exclusive-or r1 and r3, place result in r1
> +
> + then things would get pretty complex. But remember, we're just
> + doing a conservative approximation; if exclusive-or instructions
> + aren't relevant to prologues, we can just say r1's value is now
> + 'unknown'. We can ignore things that are too complex, if that loss
> + of information is acceptable for our application.
> +
> + So when I say "conservative approximation" here, what I mean is an
> + approximation that is either accurate, or marked "unknown", but
> + never inaccurate.
> +
> + Once you've reached the current PC, or an instruction that you
> + don't know how to simulate, you stop. Now you can examine the
> + state of the registers and stack slots you've kept track of.
> +
> + - To see how large your stack frame is, just check the value of the
> + stack pointer register; if it's the original value of the SP
> + minus a constant, then that constant is the stack frame's size.
> + If the SP's value has been marked as 'unknown', then that means
> + the prologue has done something too complex for us to track, and
> + we don't know the frame size.
> +
> + - To see where we've saved the previous frame's registers, we just
> + search the values we've tracked --- stack slots, usually, but
> + registers, too, if you want --- for something equal to the
> + register's original value. If the ABI suggests a standard place
> + to save a given register, then we can check there first, but
> + really, anything that will get us back the original value will
> + probably work.
> +
> + Sure, this takes some work. But prologue analyzers aren't
> + quick-and-simple pattern matching to recognize a few fixed prologue
> + forms any more; they're big, hairy functions. Along with inferior
> + function calls, prologue analysis accounts for a substantial
> + portion of the time needed to stabilize a GDB port. So I think
> + it's worthwhile to look for an approach that will be easier to
> + understand and maintain. In the approach used here:
> +
> + - It's easier to see that the analyzer is correct: you just see
> + whether the analyzer properly (albeit conservatively) simulates
> + the effect of each instruction.
> +
> + - It's easier to extend the analyzer: you can add support for new
> + instructions, and know that you haven't broken anything that
> + wasn't already broken before.
> +
> + - It's orthogonal: to gather new information, you don't need to
> + complicate the code for each instruction. As long as your domain
> + of conservative values is already detailed enough to tell you
> + what you need, then all the existing instruction simulations are
> + already gathering the right data for you.
> +
> + A 'struct prologue_value' is a conservative approximation of the
> + real value the register or stack slot will have. */
Jim, I'd be thrilled to see this text in gdbint.texinfo (if and when
the patch is committed), perhaps with a few more general words about
prologue analysis, which is currently completely undocumented.
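
The technique in the quoted comment can be illustrated with a minimal C sketch. Everything in it is an assumption made for illustration: the names (`struct pv`, `sim_addi`, `sim_store`, `frame_size`, `find_saved`), the five-register machine, the fixed-size slot table, word-sized aligned stores, and a downward-growing stack. It is not the layout of the patch's 'struct prologue_value'. Each value is either "unknown" or "the original value of some register plus a constant", and anything the sketch cannot track exactly is forgotten rather than guessed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical, chosen for this sketch only. */
enum reg { R1, R2, R3, SP, FP, NUM_REGS };
enum { NUM_SLOTS = 8 };

enum pv_kind {
  PV_UNKNOWN,   /* nothing is known about the value */
  PV_REGISTER   /* original value of register REG, plus constant K */
};

struct pv { enum pv_kind kind; enum reg reg; long k; };
struct slot { bool used; struct pv addr; struct pv value; };
struct state { struct pv regs[NUM_REGS]; struct slot slots[NUM_SLOTS]; };

/* On entry, every register holds its own original value plus zero. */
void state_init(struct state *s)
{
  for (int r = 0; r < NUM_REGS; r++) {
    s->regs[r].kind = PV_REGISTER;
    s->regs[r].reg = (enum reg) r;
    s->regs[r].k = 0;
  }
  for (int i = 0; i < NUM_SLOTS; i++)
    s->slots[i].used = false;
}

/* addi r, imm: a known value moves by imm; unknown stays unknown. */
void sim_addi(struct state *s, enum reg r, long imm)
{
  assert(r < NUM_REGS);
  if (s->regs[r].kind == PV_REGISTER)
    s->regs[r].k += imm;
}

/* mov dst, src: copy the approximation. */
void sim_mov(struct state *s, enum reg dst, enum reg src)
{
  s->regs[dst] = s->regs[src];
}

/* xor dst, src: too complex to track, so mark the result unknown. */
void sim_xor(struct state *s, enum reg dst, enum reg src)
{
  (void) src;
  s->regs[dst].kind = PV_UNKNOWN;
}

/* mov (base+off), src: record a word-sized store.  Slots that might
   alias the target are forgotten, so the table is never inaccurate. */
void sim_store(struct state *s, enum reg base, long off, enum reg src)
{
  struct pv addr = s->regs[base];
  struct slot *target = NULL;

  if (addr.kind != PV_REGISTER) {
    /* Unknown address: the store could have hit any tracked slot. */
    for (int i = 0; i < NUM_SLOTS; i++)
      s->slots[i].used = false;
    return;
  }
  addr.k += off;
  for (int i = 0; i < NUM_SLOTS; i++) {
    struct slot *sl = &s->slots[i];
    /* Addresses based on a different original register may alias. */
    if (sl->used && sl->addr.reg != addr.reg)
      sl->used = false;
    if (sl->used && sl->addr.k == addr.k)
      target = sl;                 /* overwrite the same address */
    else if (!sl->used && target == NULL)
      target = sl;                 /* first free entry */
  }
  if (target != NULL) {
    target->used = true;
    target->addr = addr;
    target->value = s->regs[src];
  }
}

/* Frame size is known only if SP is "original SP minus a constant". */
bool frame_size(const struct state *s, long *size)
{
  const struct pv *sp = &s->regs[SP];
  if (sp->kind != PV_REGISTER || sp->reg != SP)
    return false;
  *size = -sp->k;
  return true;
}

/* Find a slot holding the unmodified original value of register r. */
bool find_saved(const struct state *s, enum reg r, struct pv *addr)
{
  for (int i = 0; i < NUM_SLOTS; i++) {
    const struct slot *sl = &s->slots[i];
    if (sl->used && sl->value.kind == PV_REGISTER
        && sl->value.reg == r && sl->value.k == 0) {
      *addr = sl->addr;
      return true;
    }
  }
  return false;
}
```

Feeding this the instructions from the comment (addi r1, 42; addi r1, 22; mov r2, r1) leaves r2 recorded as "original r1 plus 64". The alias rule in `sim_store` is deliberately crude: it throws away more information than a real analyzer would need to, which keeps the sketch short while staying on the "accurate or unknown, never inaccurate" side.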