This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Re: [RFC] Move the frame zero PC check earlier


> Date: Sat, 13 May 2006 11:13:38 -0400
> From: Daniel Jacobowitz <drow@false.org>
> 
> On Sat, May 13, 2006 at 11:46:35AM +0200, Mark Kettenis wrote:
> > > Tested on x86_64-pc-linux-gnu, and by hand against SymbianOS,
> > > where it gives much nicer looking backtraces.
> > 
> > Our goal shouldn't be nicer looking backtraces.  It should be
> > providing the user with all information needed to fix bugs in their
> > programs.  Your patch is removing such a bit of information, and
> > therefore unacceptable to me.  Sorry :(.
> 
> Sorry, Mark, I completely disagree with you on this issue.  Let's at
> least discuss it, please?

No problem.

> You said that removing the 0x00000000 frame removed information.  I
> disagree.  It's not a valid frame, "up"'ing into it isn't going to give
> you anything sensible for saved registers unless the return address was
> the only thing on the stack that got clobbered (fairly rare).

Sure, and I wasn't arguing that the frame itself was of any use.  But
the fact that it gets printed in the backtrace is useful, since it
indicates that GDB fell off the stack while doing the backtrace.

> Instead, with the patch, the backtrace will appear to just suddenly
> stop.

Yes, and that's exactly my problem.  It will be much more difficult to
spot that GDB just fell off the stack.  Another problem is that this
makes the PC == 0 case even more of a special case: for a bogus
PC != 0 we will still print the bogus frame in the backtrace.

> If the function at the bottom of the backtrace isn't an entry
> point, the fact that the backtrace has just suddenly stopped is a
> pretty big clue that the stack is horked.

Sure, but you won't notice until you start actually looking at the
function names in the backtrace.  At first sight the backtrace will
look perfectly ok.

> Explanatory output ("why did that backtrace stop?") is available in
> "set debug frame 1".  If you think it's routinely useful, then we can
> make it available in some prettier form, perhaps in "info frame" for
> the outermost frame.

If we can reliably tell that a frame is the outermost frame, we might
indeed print that as part of "info frame".

> Also, I don't think that "gdb is confused" errors are as desirable as
> you think they are.  This extra frame has been reported to me as a bug
> at least three times that I can think of (twice for RTOSes and once for
> Linux KGDB).

I can imagine you'd like to get these people off your back.  And
perhaps they're right that the extra frame is caused by a bug in GDB.
But that bug is not the printing of the extra frame itself.  The bug
is GDB not being able to determine that it is at the end of the stack,
which might actually be a bug in the compiler or system libraries
they're using.

> Such messages upset users when their stack is _not_ horked.  For
> example, when GDB's prologue unwinder can't handle a prologue for a
> non-leaf function on the stack, often you'll get this "friendly"
> message:
> 
>   error (_("Previous frame identical to this frame (corrupt stack?)"));
> 
> I've had users come up to me and say that they wasted hours looking for
> the stack corruption GDB was complaining about and in fact it was just
> a weakness in the unwinder.

Then we should improve the unwinder.  If we didn't error out with that
error, the backtrace would never end.
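
To make the termination argument concrete, here is a toy model of a
backtrace loop.  It is only a sketch, not GDB's code: every name in it
(toy_frame, toy_stop, toy_backtrace) is hypothetical, and an array
stands in for the unwinder, with frames[i + 1] being whatever unwinding
frames[i] would produce.  It shows that a walk can record why it
stopped, and that the identical-frame check is what halts a walk over
an unwinder that keeps returning the same frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a frame: only the fields this sketch needs. */
struct toy_frame {
    uint64_t pc;        /* resume address in this frame */
    uint64_t cfa;       /* stack address that identifies the frame */
    int      outermost; /* nonzero if the frame is marked as outermost */
};

/* Why the walk stopped (names invented for this sketch). */
enum toy_stop {
    TOY_STOP_OUTERMOST,  /* clean, marked end of stack */
    TOY_STOP_ZERO_PC,    /* the caller's PC unwound to 0 */
    TOY_STOP_SAME_FRAME, /* the caller is identical to this frame */
    TOY_STOP_LIMIT       /* ran out of frames to examine */
};

/* Walk frames[0..n) from innermost to outermost.  Stores in *printed
   the number of frames a backtrace would have listed before stopping.  */
static enum toy_stop
toy_backtrace(const struct toy_frame *frames, size_t n, size_t *printed)
{
    assert(frames != NULL && printed != NULL);
    *printed = 0;
    for (size_t i = 0; i < n; i++) {
        (*printed)++;
        if (frames[i].outermost)
            return TOY_STOP_OUTERMOST;
        if (i + 1 == n)
            break;
        /* In this sketch the zero-PC caller is not listed; only the
           stop reason records that the walk ran into it.  */
        if (frames[i + 1].pc == 0)
            return TOY_STOP_ZERO_PC;
        /* Without this check, an unwinder that keeps producing the
           same frame would keep the walk going until the limit.  */
        if (frames[i + 1].pc == frames[i].pc
            && frames[i + 1].cfa == frames[i].cfa)
            return TOY_STOP_SAME_FRAME;
    }
    return TOY_STOP_LIMIT;
}
```

Whether a stop reason like the zero-PC one should appear in the
backtrace itself, or only somewhere like "info frame", is exactly the
point under discussion; the sketch takes no side on that.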

> And Joel recently reported that Ada tasking generates this message
> on at least one platform, and users are unhappy about that, too.

IIRC this is a case where the outermost frame wasn't marked properly,
or at least not detected as such by GDB.  That's the problem that
needs to be fixed.

> I think that determining the end of stack cleanly is one of the more
> important things for GDB to get right.

Yes indeed.  And one of your other mails (to which I haven't replied yet)
tries to address that, and we certainly should do something like you
wrote there.  But the patch we're discussing here is just papering
over the problems.

> And when we've run out of useful information, the stack appears to
> end, and we're quite justified in reporting that the stack ended.
> It's quite complex enough already without reporting "but the end of
> the stack looks a little funny to me...".

No, if a stack doesn't end properly on a platform where it should end
properly, that's useful information that should be reported to the
user.

Mark

