This is the mail archive of the
gdb-patches@sources.redhat.com
mailing list for the GDB project.
Re: [RFA] rs6000-tdep.c: Improve prologue handling with code motion.
- To: kevinb at cygnus dot com (Kevin Buettner)
- Subject: Re: [RFA] rs6000-tdep.c: Improve prologue handling with code motion.
- From: "Peter.Schauer" <Peter dot Schauer at regent dot e-technik dot tu-muenchen dot de>
- Date: Fri, 10 Nov 2000 13:26:50 MET
- Cc: gdb-patches at sourceware dot cygnus dot com
Based on experience, I am convinced that correct prologue analysis via the
generated machine code is impossible with highly optimizing compilers.
You will always find examples where you either scan too far or where you stop
before the real end of the prologue.
If you want to get it right, you definitely need help from the compiler,
using live ranges or similar methods.
The proposed patch doesn't try to solve the problem of correct prologue
analysis; it only tries to get proper backtraces from highly optimized
functions without any help from compiler debug info (as is the case with
native system libraries).
For backtraces, we only need the correct recognition of the current state of
the frame setup and the saved caller pc.
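
To make that concrete, here is a minimal sketch (not code from the patch) of
how those two facts are enough to find the caller. The struct, the toy stack
and both function names are made up for illustration; the frame size 32 and
LR offset 36 used in the comments are taken from the skip_two disassembly
quoted below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what a prologue scan has established about the
   instructions that have already executed before the current pc.  */
struct frame_state
{
  int frame_set_up;    /* "stwu r1,-N(r1)" has executed.  */
  int lr_saved;        /* "stw r0,M(r1)" has executed.  */
  int32_t frame_size;  /* N in bytes, e.g. 32 for skip_two.  */
  int32_t lr_offset;   /* M relative to the new sp, e.g. 36.  */
};

/* Toy stand-in for target memory: an array of words starting at `base'.  */
struct toy_stack
{
  uint32_t base;
  const uint32_t *words;
  size_t nwords;
};

static uint32_t
read_word (const struct toy_stack *stk, uint32_t addr)
{
  size_t idx = (addr - stk->base) / 4;

  assert (addr >= stk->base && idx < stk->nwords);
  return stk->words[idx];
}

/* Stack pointer of the caller: unchanged until the new frame exists.  */
uint32_t
caller_sp (const struct frame_state *st, uint32_t sp)
{
  return st->frame_set_up ? sp + (uint32_t) st->frame_size : sp;
}

/* Return address into the caller: still live in the link register
   until it has been stored in the frame.  */
uint32_t
caller_pc (const struct frame_state *st, const struct toy_stack *stk,
           uint32_t sp, uint32_t lr)
{
  if (!st->lr_saved)
    return lr;
  return read_word (stk, sp + (uint32_t) st->lr_offset);
}
```

Anything the scanner learns beyond these two facts (for instance where the
stfd of f31 lands) is not needed to step out to the caller.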
I was also trying to minimize the risk of overscan and performance impacts.
The proposed patch will not be able to correctly determine the end of
the prologue in your example (with or without the find_pc_line test),
as the prologue sequence has already set up the required information
(new frame and saved pc) in skip_two+8, and you will not reach the
find_pc_line test anyway.
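
As a rough illustration of why the scan stops at skip_two+8, here is a
simplified stand-in for the scanner, not the code from the patch. The
instruction words are hand-encoded from the disassembly quoted below, and the
struct, the function name and the max_skipped parameter are assumptions made
for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First seven instructions of skip_two, hand-encoded from the quoted
   disassembly (the encodings are my own, not taken from the thread).  */
const uint32_t skip_two_insns[] = {
  0x9421ffe0u, /* +0   stwu r1,-32(r1)  */
  0x7c0802a6u, /* +4   mflr r0          */
  0x90010024u, /* +8   stw  r0,36(r1)   */
  0x81230004u, /* +12  lwz  r9,4(r3)    */
  0xdbe10018u, /* +16  stfd f31,24(r1)  */
  0x80690004u, /* +20  lwz  r3,4(r9)    */
  0x4bffff75u, /* +24  bl   print_nodes */
};

struct scan_result
{
  int frame_set_up;            /* Saw "stwu r1,-N(r1)".  */
  int lr_saved;                /* Saw "mflr r0" followed by "stw r0,M(r1)".  */
  size_t last_prologue_offset; /* Byte offset of the last recognized insn.  */
  size_t insns_examined;       /* Number of instructions looked at.  */
};

/* Scan until both facts needed for a backtrace are known, tolerating at
   most MAX_SKIPPED instructions that were moved into the prologue.  */
struct scan_result
scan_for_backtrace (const uint32_t *insns, size_t count, unsigned max_skipped)
{
  struct scan_result r = { 0, 0, 0, 0 };
  unsigned skipped = 0;
  int lr_in_r0 = 0;
  size_t i;

  assert (insns != NULL);
  for (i = 0; i < count; i++)
    {
      uint32_t op = insns[i];

      r.insns_examined = i + 1;
      if ((op & 0xffff0000u) == 0x94210000u)        /* stwu r1,-N(r1) */
        {
          r.frame_set_up = 1;
          r.last_prologue_offset = i * 4;
        }
      else if (op == 0x7c0802a6u)                   /* mflr r0 */
        {
          lr_in_r0 = 1;
          r.last_prologue_offset = i * 4;
        }
      else if ((op & 0xffff0000u) == 0x90010000u && lr_in_r0)
        {                                           /* stw r0,M(r1) */
          r.lr_saved = 1;
          r.last_prologue_offset = i * 4;
        }
      else if (++skipped > max_skipped)
        break;                                      /* Too much foreign code.  */

      if (r.frame_set_up && r.lr_saved)
        break;                                      /* Enough for a backtrace.  */
    }
  return r;
}
```

Run over skip_two_insns, this sketch stops after three instructions and
reports offset 8; the stfd at skip_two+16 is never examined, so no test
placed after a non-prologue instruction can come into play.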
I don't know if we should include the find_pc_line test or not.
Would you be willing to run the GDB testsuite with your version of the
compiler, using -O2?
It will almost certainly cause many failures, but it would be interesting
to see whether you get better results with my original patch, or even better
results when you leave out the find_pc_line test.
With stock gcc-2.95.2 and -O2 the results were disappointing.
gcc-2.95.2 doesn't seem to move instructions ahead of the `setup new frame' or
`save pc' instructions, so the testsuite results are identical with or
without my patch, and with or without the find_pc_line test.
On the other hand, we have proof that the patch doesn't introduce any
testsuite regressions, even with -O2.
> On Nov 6, 8:15am, Peter.Schauer wrote:
>
> > It does work without the section.
> > I just added it to avoid unnecessary additional slow remote target reads,
> > which had always been a concern in the past.
> >
> > I wanted to make sure that my patches don't add extra overhead in the
> > usual case (have line number info), and that we don't try to second
> > guess the information from the compiler. In addition it minimizes the
> > impact of the change (the old code did unconditionally break out of the
> > loop as well if a non-prologue insn was encountered).
> >
> > I could leave the section out, but I'd appreciate if you could provide
> > an example to prove that it should be left out.
>
> Consider the following program:
>
> --- rev.c ---
> #include <stdio.h>
> #include <stdlib.h>
>
> struct s
> {
>   char *str;
>   struct s *next;
> };
>
> struct s n3 = { "c", 0 };
> struct s n2 = { "b", &n3 };
> struct s n1 = { "a", &n2 };
> struct s *nodes = &n1;
>
> void print_nodes (struct s *s);
>
> int
> main (int argc, char **argv)
> {
>   print_nodes (nodes);
>   exit (0);
> }
>
> void
> print_nodes (struct s *s)
> {
>   if (s)
>     {
>       print_nodes (s->next);
>       printf ("%s\n", s->str);
>     }
> }
>
> void
> p (double *d1, double *d2)
> {
>   *d1 = 2.0;
>   *d2 = 3.0;
> }
>
> void q (double d)
> {
> }
>
> void r (void)
> {
> }
>
> void
> skip_two (struct s *s)
> {
>   print_nodes (s->next->next);
>   {
>     double d1, d2, d3, d4;
>     p (&d1,&d2);
>     d3 = d1 + d2;
>     q (d3);
>     d4 = d3 + d1 + d2;
>     q (d3);
>   }
> }
> --- end rev.c ---
>
> I compiled this with a version of gcc built from an internal Red Hat
> repository. I think you'll be able to get similar results if you
> use a recent development snapshot from sourceware. I used
>
> gcc -O2 -g rev.c -o rev
>
> to do the build.
>
> Here's the beginning of print_nodes():
>
> (gdb) x/10i print_nodes
> 0x100004e0 <print_nodes>: stwu r1,-16(r1)
> 0x100004e4 <print_nodes+4>: mflr r0
> 0x100004e8 <print_nodes+8>: stw r31,12(r1)
> 0x100004ec <print_nodes+12>: mr. r31,r3
> 0x100004f0 <print_nodes+16>: stw r0,20(r1)
> 0x100004f4 <print_nodes+20>: beq 0x10000514 <print_nodes+52>
> 0x100004f8 <print_nodes+24>: lwz r3,4(r31)
> 0x100004fc <print_nodes+28>: bl 0x100004e0 <print_nodes>
> 0x10000500 <print_nodes+32>: lis r3,4096
> 0x10000504 <print_nodes+36>: lwz r4,0(r31)
>
> Note that the compiler has moved the test (the "mr." instruction) into
> the prologue. GDB is aware of this too.
>
> (gdb) info line *print_nodes
> Line 26 of "rev.c" starts at address 0x100004e0 <print_nodes>
> and ends at 0x100004ec <print_nodes+12>.
> (gdb) info line *print_nodes+12
> Line 27 of "rev.c" starts at address 0x100004ec <print_nodes+12>
> and ends at 0x100004f0 <print_nodes+16>.
> (gdb) info line *print_nodes+16
> Line 26 of "rev.c" starts at address 0x100004f0 <print_nodes+16>
> and ends at 0x100004f4 <print_nodes+20>.
> (gdb) info line *print_nodes+20
> Line 27 of "rev.c" starts at address 0x100004f4 <print_nodes+20>
> and ends at 0x100004f8 <print_nodes+24>.
>
> This was my original example, but I remembered that we have some
> special purpose code in the prologue scanner in rs6000-tdep.c which
> accounts for this case. I needed a more convincing example, so I
> wrote skip_two(). The idea is that if we have to do a double
> dereference early on in the function body, the compiler may choose to
> move one or more of the dereference instructions into the prologue.
> The additional junk is to make sure that we have a reasonably sized
> prologue to move instructions into.
>
> Here is the beginning of skip_two():
> (gdb) x/10i skip_two
> 0x10000554 <skip_two>: stwu r1,-32(r1)
> 0x10000558 <skip_two+4>: mflr r0
> 0x1000055c <skip_two+8>: stw r0,36(r1)
> 0x10000560 <skip_two+12>: lwz r9,4(r3)
> 0x10000564 <skip_two+16>: stfd f31,24(r1)
> 0x10000568 <skip_two+20>: lwz r3,4(r9)
> 0x1000056c <skip_two+24>: bl 0x100004e0 <print_nodes>
> 0x10000570 <skip_two+28>: addi r3,r1,8
> 0x10000574 <skip_two+32>: addi r4,r1,16
> 0x10000578 <skip_two+36>: bl 0x10000528 <p>
>
> Note that the lwz instruction is a part of the first dereference and
> that it occurs before the prologue is complete. (The stfd instruction
> is the last instruction of the prologue.)
>
> Here, again, is what GDB knows about the lines associated with these
> instructions:
>
> (gdb) info line *skip_two
> Line 51 of "rev.c" starts at address 0x10000554 <skip_two>
> and ends at 0x10000560 <skip_two+12>.
> (gdb) info line *skip_two+12
> Line 52 of "rev.c" starts at address 0x10000560 <skip_two+12>
> and ends at 0x10000564 <skip_two+16>.
> (gdb) info line *skip_two+16
> Line 51 of "rev.c" starts at address 0x10000564 <skip_two+16>
> and ends at 0x10000568 <skip_two+20>.
> (gdb) info line *skip_two+20
> Line 52 of "rev.c" starts at address 0x10000568 <skip_two+20>
> and ends at 0x10000570 <skip_two+28>.
>
> It is my contention that the following section from your proposed
> patch...
>
> !       if (num_skip_non_prologue_insns == 0 && lim_pc == 0)
> !         {
> !           /* Stop scan if we are looking for the end of the prologue
> !              and we have line numbers for the function
> !              The current result is good enough, and the compiler will
> !              hopefully help us to get better results via the line number
> !              info. */
> !           struct symtab_and_line sal;
> !           sal = find_pc_line (pc, 0);
> !           if (sal.line != 0)
> !             break;
> !         }
>
> ...would cause the prologue scanner to stop too soon on skip_two(). I.e.,
> it would incorrectly indicate that the stw instruction at skip_two+8 is
> the last prologue instruction when in fact it is the stfd at
> skip_two+16. (If you wish, I can apply your patch to verify this.)
>
> Kevin
>
>
--
Peter Schauer pes@regent.e-technik.tu-muenchen.de