This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [patch] Fix for internal-error: linux_nat_post_attach_wait: Assertion `pid == new_pid && WIFSTOPPED (status)' failed.
On Wednesday 14 October 2009 19:21:03, Paul Pluzhnikov wrote:
> It's very hard to test, since it only happens once in about 50
> attaches, but I did catch this several times, and GDB appears to
> have done the right thing.
Ugh. Happened much more often for me, like 1 in 5...
On Wednesday 14 October 2009 19:21:03, Paul Pluzhnikov wrote:
> /* Wait for the LWP specified by LP, which we have just attached to.
> - Returns a wait status for that LWP, to cache. */
> + Returns 1 if LWP has been stopped, else 0. */
>
> static int
> -linux_nat_post_attach_wait (ptid_t ptid, int first, int *cloned,
> - int *signalled)
> +linux_nat_post_attach_wait (ptid_t ptid, int first, int *status,
> + int *cloned, int *signalled)
> {
I don't think you needed the interface change. The new interface
has two ways of saying the LWP exited, the 0 return, and
the !WIFSTOPPED(status), which looks strange. With such an
interface, I would expect the return code to indicate a real failure,
but an exit indication isn't really a failure. Any reason you'd not
just do ...
> - status = linux_nat_post_attach_wait (ptid, 0, &cloned, &signalled);
> + if (!linux_nat_post_attach_wait (ptid, 0, &status, &cloned, &signalled))
> + return -1;
> +
status = linux_nat_post_attach_wait (ptid, 0, &cloned, &signalled);
if (!WIFSTOPPED (status))
return -1;
? Anyway, I was just surprised. Either way is fine really.
> > I would believe that we can also see the main LWP exit right
> > after attach (in linux_nat_attach). I'm not sure what exactly
> > is the best to do UI wise in that case. If we want to store
> > the event pending to report later, we'll have to use
> > the lwp->waitstatus field, not lwp->status, due to the
> > fact that lwp->status == 0 is ambiguous with
> > "no-stored-pending-event" (see status_callback).
>
> I believe I did this now. Not sure whether lp->resumed should also
> be set here.
Yes, it should. Otherwise, the pending event is ignored (see
status_callback), and worse, in this case, you'll hit this
assertion:
linux_nat_wait_1:
/* Make sure there is at least one LWP that has been resumed. */
gdb_assert (iterate_over_lwps (ptid, resumed_callback, NULL));
Oh, my bad. I now took a look at what
infcmd.c:attach_command|attach_command_post_wait would do
in the case of the target_wait right after attach returning a
process exit, and it seems it would misbehave, and try to poke
at the inferior's memory... Best not to go there for such a rare
use case. So, leaving the exit event pending ends up
fixing nothing afterall...
>
> > The simple
> > alternative is to again just pretend that the process had
> > exited before we managed to attach to it, get rid of it,
> > and error out like we would if the process didn't exist
> > at all when we tried to attach.
>
> This appears harder to do, since to_attach doesn't return anything.
Back to the "simple" variant: The inferior just added is
always inferior_ptid/current_inferior(). If you look
at fork-child.c:startup_inferior, you'll see bits of
code doing exactly what you'd need to do. E.g.:
target_terminal_ours ();
target_mourn_inferior ();
if (WIFEXITED (status))
error (_("During startup program exited with code %d."),
WIFEXITCODE (status));
else if (WIFSIGNALED (status))
error (_("During startup program exited with signal ..."),
...);
But if this looks too much trouble for what it's worth, that's
probably right :-). I'd be quite fine to ignore that use case
for now.
--
Pedro Alves