This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: Fix a crash when stepping and unwinding fails
> Date: Tue, 21 Feb 2006 16:52:16 -0500
> From: Daniel Jacobowitz <drow@false.org>
>
> > yes, it seems that step_frame_id can end up as null_frame_id, if
> > get_current_frame() is also the outermost frame at the same time.
>
> ... so the outermost frame will generally have null_frame_id. Not sure
> if that's true when we stop because we encounter main.
If we can unwind from main, its frame ID will not be null_frame_id.
> > > > How can we hit the last frame? If we're hitting the last frame, where
> > > > did we come from?
> > > >
> > > > It may very well be that there are GDB bugs that make step_frame_id
> > > > equal to null_frame_id. If we can't trace those bugs right now, we
> > > > should probably sprinkle a few gdb_assert()'s around and try to solve
> > > > the issues when we hit those.
> > >
> > > We use the null frame ID to represent the outermost frame. If we can't
> > > find another frame outer to this one, then we assume this one is the
> > > outermost.
> >
> > Yes, it seems there are issues here. The frame ID is supposed to be
> > unique for a particular frame, yet there's a possibility that two
> > distinct frames both end up with the null frame ID.
>
> Are there? I think there's only one - the outermost. We've only got
> that and the sentinel frame, and the sentinel frame I think doesn't
> have an ID.
If we can't unwind from a frame for some reason, then its frame ID
will also be null_frame_id. And in multi-threaded programs there can
be multiple outermost frames.
> Perhaps part of the problem here is that step_frame_id is set to
> null_frame_id when it is invalid; maybe we should keep that separate.
Not sure; it might not be necessary if we generate a proper frame ID
for the outermost frame.
> I don't think it would help me though. Perhaps the real problem is the
> use of null_frame_id for both the outermost frame and completely
> unknown frames. It would be nice if we could tell here:
>
> if (frame_id_eq (frame_unwind_id (get_current_frame ()), step_frame_id))
>
> that frame_unwind_id has returned something completely invalid instead
> of the outermost frame.
Indeed.
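
To make the distinction concrete, here is a minimal sketch in plain C. It
is a toy model, not GDB code: the names toy_id_kind, toy_frame_id,
toy_id_valid and toy_id_eq are all hypothetical and do not correspond to
GDB's struct frame_id or frame_id_eq. The assumption is that an ID
carries an explicit "kind", so that "could not unwind" and "outermost
frame" stop sharing one value, and that the outermost frame still gets
real addresses so that two threads' outermost frames compare unequal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; GDB's real frame ID is laid out differently.  */
enum toy_id_kind
{
  TOY_ID_INVALID,    /* unwinding failed, nothing is known */
  TOY_ID_OUTERMOST,  /* a real frame that has no caller */
  TOY_ID_NORMAL      /* a real frame that has a caller */
};

struct toy_frame_id
{
  enum toy_id_kind kind;
  unsigned long stack_addr;
  unsigned long code_addr;
};

static bool
toy_id_valid (struct toy_frame_id id)
{
  return id.kind != TOY_ID_INVALID;
}

/* An invalid ID never compares equal to anything, itself included.
   Valid IDs must match in kind and in both addresses.  */
static bool
toy_id_eq (struct toy_frame_id a, struct toy_frame_id b)
{
  if (!toy_id_valid (a) || !toy_id_valid (b))
    return false;
  return (a.kind == b.kind
          && a.stack_addr == b.stack_addr
          && a.code_addr == b.code_addr);
}

/* Small self-check with made-up addresses.  */
static void
toy_demo (void)
{
  struct toy_frame_id unknown = { TOY_ID_INVALID, 0, 0 };
  struct toy_frame_id outer_t1 = { TOY_ID_OUTERMOST, 0x7000, 0x100 };
  struct toy_frame_id outer_t2 = { TOY_ID_OUTERMOST, 0x9000, 0x100 };

  assert (!toy_id_eq (unknown, unknown));
  assert (toy_id_eq (outer_t1, outer_t1));
  assert (!toy_id_eq (outer_t1, outer_t2));
}
```

With a representation along these lines, a comparison against
step_frame_id could not succeed merely because both sides happen to be
"unknown".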
> One way to do that in our current representation would be to check that
> the frame ID for the current frame is not null_frame_id.
Yes, I think it makes sense to punt trying to insert a step-resume
breakpoint earlier than you do in your patch.
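
Roughly what I have in mind, as a self-contained sketch rather than a
patch: sketch_id, sketch_action and sketch_decide are hypothetical names,
and what "punting" should actually do (stop, keep stepping, something
else) is an assumption left open here. The only point is the ordering:
test the current frame's ID first, and only then look at what unwinding
from it produced.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a frame ID; "valid == false" plays the
   role of the null ID.  */
struct sketch_id
{
  bool valid;
  unsigned long stack_addr;
  unsigned long code_addr;
};

enum sketch_action
{
  SKETCH_PUNT,                /* current frame unknown: give up early */
  SKETCH_INSERT_STEP_RESUME,  /* we stepped into a callee of the step frame */
  SKETCH_KEEP_STEPPING        /* nothing special to do */
};

static bool
sketch_id_eq (struct sketch_id a, struct sketch_id b)
{
  return (a.valid && b.valid
          && a.stack_addr == b.stack_addr
          && a.code_addr == b.code_addr);
}

/* CURRENT is the ID of the current frame, CALLER the ID obtained by
   unwinding one level from it, STEP_FRAME the frame in which "step"
   was issued.  */
static enum sketch_action
sketch_decide (struct sketch_id current, struct sketch_id caller,
               struct sketch_id step_frame)
{
  /* If the current frame could not be identified, its unwound caller
     is meaningless, so do not even attempt the comparison.  */
  if (!current.valid)
    return SKETCH_PUNT;

  if (sketch_id_eq (caller, step_frame))
    return SKETCH_INSERT_STEP_RESUME;

  return SKETCH_KEEP_STEPPING;
}
```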
> > > Just to sketch out my example a bit more: the embedded OS I'm debugging
> > > lives in ROM. The application I've supplied to GDB lives in RAM. In
> > > some later stage of the project, hopefully, I will have GDB magically
> > > load some other ELF files (that I don't have yet) to cover the ROM
> > > code; but right now I can't do that and there's no guarantee I'll have
> > > debug info covering all of it anyway. So we're executing code way
> > > out in the boondocks. GDB doesn't have any way on this platform
> > > (ARM Thumb) to guess where the start of a function is if it doesn't
> > > have a symbol table; so it can't be sure that we've really reached the
> > > first instruction of a function, so it has no idea whether $lr is valid
> > > or not.
> >
> > But that really means that we shouldn't be messing with step-resume
> > breakpoints here. The whole notion of functions that can be stepped
> > into isn't there.
>
> Yes, it is. I've executed "step" in a place where I do have symbol
> information (and working unwinders). It's taken me into a place where
> I don't (a DLL in ROM). Since I don't have debug information any more
> GDB would like to step back out to the call site, except it fails
> because we've moved out of its known area.
Ah, ok. But that means that step_frame_id really shouldn't be equal
to null_frame_id. It will certainly not be null_frame_id if it isn't
the outermost frame. And whether the place where you issued "step" is
the outermost frame or not should not influence GDB's behaviour here.
Mark