This is the mail archive of the
gdb-prs@sourceware.org
mailing list for the GDB project.
[Bug threads/20743] can't usefully "continue" due to "ptrace: No such process" after gdb switches thread (gdb7.11.1 on FreeBSD 11)
- From: "misc-sourceware at talk2dom dot com" <sourceware-bugzilla at sourceware dot org>
- To: gdb-prs at sourceware dot org
- Date: Mon, 31 Oct 2016 18:50:20 +0000
- Auto-submitted: auto-generated
- References: <bug-20743-4717@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=20743
--- Comment #1 from misc-sourceware at talk2dom dot com ---
[also notifying FreeBSD port maintainer of this bug]
The crux of the issue seems to be resume_all_threads_cb() in fbsd-nat.c trying
to resume a thread that has already exited, which causes ptrace(PT_RESUME) to
fail with "No such process". (As a side note, it doesn't matter which thread is
current before the "continue" command, as gdb seems to switch to any newly
spawned thread. Why is that?)
Exited threads can still be in the thread list when resume_all_threads_cb() is
called, e.g. if the thread that exited was the current thread (inferior_ptid).
To demonstrate this, add debugging output to resume_all_threads_cb() as follows,
so that it shows which thread it is about to resume and confirms which call to
ptrace() returns an error:
static int
resume_all_threads_cb (struct thread_info *tp, void *data)
{
  ptid_t *filter = (ptid_t *) data;

  if (!ptid_match (tp->ptid, *filter))
    return 0;

  if (debug_fbsd_lwp)
    fprintf_unfiltered (gdb_stdlog,
                        "FLWP: PT_RESUME for ptid (%d, %ld, %ld)\n",
                        ptid_get_pid (tp->ptid), ptid_get_lwp (tp->ptid),
                        ptid_get_tid (tp->ptid));

  if (ptrace (PT_RESUME, ptid_get_lwp (tp->ptid), NULL, 0) == -1)
    perror_with_name (("ptrace PT_RESUME"));
  return 0;
}
Now the debugging output looks like this:
(gdb) set debug infrun 3
(gdb) set debug fbsd-lwp on
(gdb) c
Continuing.
infrun: clear_proceed_status_thread (LWP 101201 of process 35559)
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 101201 of process 35559] at 0x8032880da
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
infrun: prepare_to_wait
FLWP: adding thread for LWP 101576
[New LWP 101576 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun: 35559.101576.0 [LWP 101576 of process 35559],
infrun: status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 101201 of process 35559 to LWP 101576 of
process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 101576 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 101576, 0)
infrun: prepare_to_wait
FLWP: deleting thread for LWP 101576
[LWP 101576 of process 35559 exited]
FLWP: adding thread for LWP 101586
[New LWP 101586 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun: 35559.101586.0 [LWP 101586 of process 35559],
infrun: status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 101576 of process 35559 to LWP 101586 of
process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 101586 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 101576, 0)
[Switching to LWP 101586 of process 35559]
0x000000080265aa10 in ?? () from /lib/libthr.so.3
ptrace PT_RESUME: No such process.
(gdb)
Note the last few lines, which show a PT_RESUME call for LWP 101576, a thread
that has already exited.
This may not be the ideal fix, but as a workaround, change the top of
resume_all_threads_cb() to:
resume_all_threads_cb (struct thread_info *tp, void *data)
{
  ptid_t *filter = (ptid_t *) data;

  /* don't resume an exited thread */
  if (tp->state == THREAD_EXITED)
    return 0;

  [existing code, starting with if() continues from here]
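For anyone who wants to see the effect of that check outside of gdb, here is a
minimal standalone sketch. It is not gdb code: the sketch_* names, the state
enum and the fake PT_RESUME are all made up for illustration, and the fake
simply assumes that resuming an exited LWP fails with ESRCH ("No such process"),
as seen in the output above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for gdb's thread states; not gdb's real enum.  */
enum sketch_state { SKETCH_STOPPED, SKETCH_EXITED };

/* Hypothetical stand-in for struct thread_info.  */
struct sketch_thread
{
  long lwp;
  enum sketch_state state;
};

/* Fake ptrace (PT_RESUME): assumed to fail with ESRCH when the LWP
   has already exited.  */
static int
sketch_pt_resume (const struct sketch_thread *tp)
{
  if (tp->state == SKETCH_EXITED)
    {
      errno = ESRCH;
      return -1;
    }
  return 0;
}

/* Walk the thread list and resume each entry.  Returns the number of
   threads resumed, or -1 as soon as one resume fails.  If SKIP_EXITED
   is nonzero, exited entries are skipped, which mirrors the
   work-around.  */
static int
sketch_resume_all (const struct sketch_thread *threads, size_t n,
                   int skip_exited)
{
  int resumed = 0;

  assert (threads != NULL);
  for (size_t i = 0; i < n; i++)
    {
      if (skip_exited && threads[i].state == SKETCH_EXITED)
        continue;
      if (sketch_pt_resume (&threads[i]) == -1)
        return -1;
      resumed++;
    }
  return resumed;
}
```

With a list holding one exited entry between two stopped ones, calling
sketch_resume_all() with skip_exited set to 0 fails with ESRCH, while calling it
with skip_exited set to 1 resumes the two remaining threads.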
Output showing that the issue is worked around:
(gdb) set debug infrun 3
(gdb) set debug fbsd-lwp on
(gdb) c
Continuing.
infrun: clear_proceed_status_thread (LWP 101201 of process 35559)
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 101201 of process 35559] at 0x8032880da
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
infrun: prepare_to_wait
FLWP: adding thread for LWP 100444
[New LWP 100444 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun: 35559.100444.0 [LWP 100444 of process 35559],
infrun: status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 101201 of process 35559 to LWP 100444 of
process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 100444 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 100444, 0)
infrun: prepare_to_wait
FLWP: deleting thread for LWP 100444
[LWP 100444 of process 35559 exited]
FLWP: adding thread for LWP 101642
[New LWP 101642 of process 35559]
infrun: target_wait (-1.0.0, status) =
infrun: 35559.101642.0 [LWP 101642 of process 35559],
infrun: status->kind = spurious
infrun: TARGET_WAITKIND_SPURIOUS
infrun: Switching context from LWP 100444 of process 35559 to LWP 101642 of
process 35559
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread
[LWP 101642 of process 35559] at 0x80265aa10
FLWP: fbsd_resume for ptid (-1, 0, 0)
FLWP: PT_RESUME for ptid (35559, 101201, 0)
FLWP: PT_RESUME for ptid (35559, 101642, 0)
[...and so on...]