This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: make sure we don't resume the stepped thread by accident
- From: Pedro Alves <pedro at codesourcery dot com>
- To: gdb-patches at sourceware dot org
- Cc: "Ulrich Weigand" <uweigand at de dot ibm dot com>
- Date: Mon, 23 May 2011 16:27:12 +0100
- Subject: Re: make sure we don't resume the stepped thread by accident
- References: <201105191628.55077.pedro@codesourcery.com>
On Thursday 19 May 2011 16:28:54, Pedro Alves wrote:
> Say:
>
> <stopped at a breakpoint in thread 2>
> (gdb) thread 3
> (gdb) step
>
> The above triggers the prepare_to_proceed/deferred_step_ptid
> process, which switches back to thread 2, to step over its
> breakpoint before getting back to thread 3 and "step" it.
>
> If while stepping over the breakpoint in thread 2, a signal
> arrives, and it is set to pass/nostop, we'll set a step-resume
> breakpoint at the supposed signal-handler resume address,
> and call keep_going. The problem is that we were supposely
> stepping thread 3, and that keep_going delivers
> a signal to thread 2, and due to scheduler-locking off,
> resumes everything else, _including_ thread 3, the thread
> we want stepping. This means that we lose control of thread 3
> until the next event, when we stop everything. The end result
> for the user, is that GDB lost control of the "step".
>
> Here's the current infrun debug output of the above, with the testcase
> in the patch below:
>
> infrun: clear_proceed_status_thread (Thread 0x2aaaab8f5700 (LWP 11663))
> infrun: clear_proceed_status_thread (Thread 0x2aaaab6f4700 (LWP 11662))
> infrun: clear_proceed_status_thread (Thread 0x2aaaab4f2b20 (LWP 11659))
> infrun: proceed (addr=0xffffffffffffffff, signal=144, step=1)
> infrun: prepare_to_proceed (step=1), switched to [Thread 0x2aaaab6f4700 (LWP 11662)]
> infrun: resume (step=1, signal=0), trap_expected=1, current thread [Thread 0x2aaaab6f4700 (LWP 11662)] at 0x40098f
> infrun: wait_for_inferior ()
> infrun: target_wait (-1, status) =
> infrun: 11659 [Thread 0x2aaaab6f4700 (LWP 11662)],
> infrun: status->kind = stopped, signal = SIGUSR1
> infrun: infwait_normal_state
> infrun: TARGET_WAITKIND_STOPPED
> infrun: stop_pc = 0x40098f
> infrun: random signal 30
>
> Program received signal SIGUSR1, User defined signal 1.
> infrun: signal arrived while stepping over breakpoint
> infrun: inserting step-resume breakpoint at 0x40098f
> infrun: resume (step=0, signal=30), trap_expected=0, current thread [Thread 0x2aaaab6f4700 (LWP 11662)] at 0x40098f
>
> ^^^ this is a wildcard resume.
>
> infrun: prepare_to_wait
> infrun: target_wait (-1, status) =
> infrun: 11659 [Thread 0x2aaaab6f4700 (LWP 11662)],
> infrun: status->kind = stopped, signal = SIGTRAP
> infrun: infwait_normal_state
> infrun: TARGET_WAITKIND_STOPPED
> infrun: stop_pc = 0x40098f
> infrun: BPSTAT_WHAT_STEP_RESUME
> infrun: resume (step=1, signal=0), trap_expected=1, current thread [Thread 0x2aaaab6f4700 (LWP 11662)] at 0x40098f
>
> ^^^ step-resume hit, meaning the handler returned, so we go back to stepping thread 3.
>
>
> infrun: prepare_to_wait
> infrun: target_wait (-1, status) =
> infrun: 11659 [Thread 0x2aaaab6f4700 (LWP 11662)],
> infrun: status->kind = stopped, signal = SIGTRAP
> infrun: infwait_normal_state
> infrun: TARGET_WAITKIND_STOPPED
>
> infrun: stop_pc = 0x40088b
> infrun: switching back to stepped thread
> infrun: Switching context from Thread 0x2aaaab6f4700 (LWP 11662) to Thread 0x2aaaab8f5700 (LWP 11663)
> infrun: resume (step=1, signal=0), trap_expected=0, current thread [Thread 0x2aaaab8f5700 (LWP 11663)] at 0x400938
> infrun: prepare_to_wait
> infrun: target_wait (-1, status) =
> infrun: 11659 [Thread 0x2aaaab8f5700 (LWP 11663)],
> infrun: status->kind = stopped, signal = SIGTRAP
> infrun: infwait_normal_state
> infrun: TARGET_WAITKIND_STOPPED
> infrun: stop_pc = 0x40093a
> infrun: keep going
> infrun: resume (step=1, signal=0), trap_expected=0, current thread [Thread 0x2aaaab8f5700 (LWP 11663)] at 0x40093a
> infrun: prepare_to_wait
> infrun: target_wait (-1, status) =
> infrun: 11659 [Thread 0x2aaaab8f5700 (LWP 11663)],
> infrun: status->kind = stopped, signal = SIGTRAP
> infrun: infwait_normal_state
> infrun: TARGET_WAITKIND_STOPPED
> infrun: stop_pc = 0x40091e
> infrun: stepped to a different line
> infrun: stop_stepping
> [Switching to Thread 0x2aaaab8f5700 (LWP 11663)]
> 69 (*myp) ++; /* set breakpoint child_two here */
>
> ^^^ we stopped at the wrong line. We still stepped a bit
> because the test is running in a loop, and when we got
> back to stepping thread 3, it happened to be in the
> stepping range. (The loop increments a counter, and the
> test makes sure it increments exactly once. Without the
> fix, the counter increments a bunch, since the user-stepped
> thread runs free without GDB noticing.)
>
> The fix is to lock the scheduler until we finish the
> original step-off (in practice, while running a signal handler
> that runs while we were trying to step over a breakpoint, or while
> stepping over any breakpoint that hits that may not cause a
> stop and thus may need stepping over itself), so that the
> user-stepped thread doesn't run free.
>
> The current handling of deferred_step_ptid ignores the
> fact that we were supposed to switch back if we stop
> for any other reason than stepping, which doesn't work
> correctly if the signal we receive is pass/nostop
> (and print, if the target needs that to even show us
> the signal). To make this work, we need to leave
> deferred_step_ptid set until either the original step-off is
> fully finished, or something causes a stop to the user
> (in practice, the latter is done on next proceed).
> We don't actually need to handle switching back to the
> deferred_step_ptid specially, since we now have the "switching back
> to stepped thread" code.
>
> The current code however makes sure that we also skip a
> breakpoint at the PC of the deferred_step_ptid thread, by virtue
> of issuing an immediate resume/step once the original step-off
> is finished, without inserting breakpoints. The change in
> proceed makes sure we preserve that behavior (the "switching
> back to stepped thread" code calls keep_going, which
> makes sure breakpoints are removed if stepping_over_breakpoints
> is set.
>
> Finaly, deferring the thread switching back to the
> "switching back to stepped thread" code, makes it so
> that that step-off handling also passes by the regular
> breakpoint handling, which paves the way for making GDB
> leave watchpoints (and perhaps non-breakpoint catchpoints)
> inserted while stepping-off a breakpoint, so to not
> miss them.
>
> WDYT?
>
> (PR10227 is similar and could take a similar fix. We'd
> need to also set a step-resume breakpoint at the signal
> handler's return when we detect some _other_ thread
> is stepped/nexted, and we'd need a new flag to make
> sure `resume' forces a shedlock in that situation
> as well.)
>
>
The previous patch no longer applies cleanly. Here's a refresh.
Pedro Alves
2011-05-23 Pedro Alves <pedro@codesourcery.com>
gdb/
* infrun.c (resume): Lock the scheduler if deferred_step_ptid is
set.
(proceed): Remember to step over the breakpoint at PC, in case we
need to switch to the old thread and step-off a breakpoint there
first.
(handle_inferior_event) <step_off_bkpt_other_thread>: New local.
Rewrite deferred_step_ptid handling, but letting it fall-through
to the "switching back to stepped thread" handling. Ignore
breakpoints if we just finished stepping-off breakpoint in the old
thread.
gdb/testsuite/
* gdb.threads/step-after-sr-lock.c: New.
* gdb.threads/step-after-sr-lock.exp: New.
---
gdb/infrun.c | 95 ++++++++-------
gdb/testsuite/gdb.threads/step-after-sr-lock.c | 145 +++++++++++++++++++++++
gdb/testsuite/gdb.threads/step-after-sr-lock.exp | 122 +++++++++++++++++++
3 files changed, 321 insertions(+), 41 deletions(-)
Index: src/gdb/infrun.c
===================================================================
--- src.orig/gdb/infrun.c 2011-05-23 16:21:26.000000000 +0100
+++ src/gdb/infrun.c 2011-05-23 16:23:43.318753681 +0100
@@ -1844,6 +1844,23 @@ a command like `return' or `jump' to con
if (step && breakpoint_inserted_here_p (aspace, pc))
step = 0;
}
+ else if (!ptid_equal (deferred_step_ptid, null_ptid))
+ {
+ /* If for some reason we find ourselves resuming the
+ inferior while trying to step-off the breakpoint in the
+ thread that had last reported an event to the user, make
+ sure not to resume the deferred_step_ptid thread, since
+ we want to step it under control.
+
+ E.g., a signal arrives while stepping-off the breakpoint,
+ and we set a step-resume breakpoint at the signal
+ handler's return address, so that once hit, we go back to
+ stepping off the first breakpoint. The resume that
+ continues with the signal until the sr breakpoint would
+ resume the deferred_step_ptid thread as well if it
+ weren't for this check. */
+ resume_ptid = inferior_ptid;
+ }
if (debug_displaced
&& use_displaced_stepping (gdbarch)
@@ -2133,6 +2150,12 @@ proceed (CORE_ADDR addr, enum target_sig
thread that reported the most recent event. If a step-over
is required it returns TRUE and sets the current thread to
the old thread. */
+
+ /* If we do switch threads, remember to go back to stepping-off
+ the breakpoint at the current thread's PC. */
+ if (oneproc)
+ inferior_thread ()->stepping_over_breakpoint = 1;
+
if (prepare_to_proceed (step))
oneproc = 1;
}
@@ -3088,6 +3111,7 @@ handle_inferior_event (struct execution_
int stepped_after_stopped_by_watchpoint = 0;
struct symtab_and_line stop_pc_sal;
enum stop_kind stop_soon;
+ int step_off_bkpt_other_thread = 0;
if (ecs->ws.kind == TARGET_WAITKIND_IGNORE)
{
@@ -3651,43 +3675,6 @@ handle_inferior_event (struct execution_
}
}
- if (!ptid_equal (deferred_step_ptid, null_ptid))
- {
- /* In non-stop mode, there's never a deferred_step_ptid set. */
- gdb_assert (!non_stop);
-
- /* If we stopped for some other reason than single-stepping, ignore
- the fact that we were supposed to switch back. */
- if (ecs->event_thread->suspend.stop_signal == TARGET_SIGNAL_TRAP)
- {
- if (debug_infrun)
- fprintf_unfiltered (gdb_stdlog,
- "infrun: handling deferred step\n");
-
- /* Pull the single step breakpoints out of the target. */
- if (singlestep_breakpoints_inserted_p)
- {
- remove_single_step_breakpoints ();
- singlestep_breakpoints_inserted_p = 0;
- }
-
- ecs->event_thread->control.trap_expected = 0;
-
- /* Note: We do not call context_switch at this point, as the
- context is already set up for stepping the original thread. */
- switch_to_thread (deferred_step_ptid);
- deferred_step_ptid = null_ptid;
- /* Suppress spurious "Switching to ..." message. */
- previous_inferior_ptid = inferior_ptid;
-
- resume (1, TARGET_SIGNAL_0);
- prepare_to_wait (ecs);
- return;
- }
-
- deferred_step_ptid = null_ptid;
- }
-
/* See if a thread hit a thread-specific breakpoint that was meant for
another thread. If so, then step that thread past the breakpoint,
and continue it. */
@@ -3974,6 +3961,24 @@ handle_inferior_event (struct execution_
}
}
+ if (ecs->event_thread->suspend.stop_signal == TARGET_SIGNAL_TRAP
+ && !ptid_equal (deferred_step_ptid, null_ptid)
+ && ecs->event_thread->control.trap_expected
+ && ecs->event_thread->control.step_resume_breakpoint == NULL)
+ {
+ /* We wanted to switch to the stepped thread once we've
+ stepped-off this breakpoint. Set this flag so we can know to
+ ignore breakpoints at the current PC. The actual switching
+ is done after deciding no breakpoint caused a stop. */
+ step_off_bkpt_other_thread = 1;
+
+ if (debug_infrun)
+ fprintf_unfiltered (gdb_stdlog,
+ "infrun: handling deferred step\n");
+
+ deferred_step_ptid = null_ptid;
+ }
+
/* Look at the cause of the stop, and decide what to do.
The alternatives are:
1) stop_stepping and return; to really stop and return to the debugger,
@@ -4036,10 +4041,18 @@ handle_inferior_event (struct execution_
return;
}
- /* See if there is a breakpoint at the current PC. */
- ecs->event_thread->control.stop_bpstat
- = bpstat_stop_status (get_regcache_aspace (get_current_regcache ()),
- stop_pc, ecs->ptid);
+ /* See if there is a breakpoint at the current PC. But ignore
+ breakpoints if we just finished stepping over a breakpoint
+ and we want to switch to the stepped thread once that's done.
+ FIXME: GDB removes all breakpoints when stepping over a
+ breakpoint, including watchpoints. Obviously, watchpoints
+ can go missed that way. Once that's fixed, this will have to
+ be adjusted to pass a "ignore breakpoints" flag down to
+ bpstat_stop_status instead of not calling it at all. */
+ if (!step_off_bkpt_other_thread)
+ ecs->event_thread->control.stop_bpstat
+ = bpstat_stop_status (get_regcache_aspace (get_current_regcache ()),
+ stop_pc, ecs->ptid);
/* Following in case break condition called a
function. */
Index: src/gdb/testsuite/gdb.threads/step-after-sr-lock.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ src/gdb/testsuite/gdb.threads/step-after-sr-lock.c 2011-05-23 16:22:17.608753678 +0100
@@ -0,0 +1,145 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+ Copyright 2009, 2010, 2011 Free Software Foundation, Inc.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#include <pthread.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <signal.h>
+
+unsigned int args[2];
+
+pid_t pid;
+pthread_barrier_t barrier;
+pthread_t child_thread_2, child_thread_3;
+
+void
+handler (int signo)
+{
+ /* so that thread 3 is sure to run, in case the bug is present. */
+ usleep (10);
+}
+
+void
+callme (void)
+{
+}
+
+void
+block_signals (void)
+{
+ sigset_t mask;
+
+ sigfillset (&mask);
+ sigprocmask (SIG_BLOCK, &mask, NULL);
+}
+
+void
+unblock_signals (void)
+{
+ sigset_t mask;
+
+ sigfillset (&mask);
+ sigprocmask (SIG_UNBLOCK, &mask, NULL);
+}
+
+void *
+child_function_3 (void *arg)
+{
+ int my_number = (long) arg;
+ volatile int *myp = (int *) &args[my_number];
+
+ pthread_barrier_wait (&barrier);
+
+ while (*myp > 0)
+ {
+ (*myp) ++; /* set breakpoint child_two here */
+ callme ();
+ }
+
+ pthread_exit (NULL);
+}
+
+void *
+child_function_2 (void *arg)
+{
+ int my_number = (long) arg;
+ volatile int *myp = (int *) &args[my_number];
+
+ unblock_signals ();
+
+ pthread_barrier_wait (&barrier);
+
+ while (*myp > 0)
+ {
+ (*myp) ++;
+ callme (); /* set breakpoint child_one here */
+ }
+
+ *myp = 1;
+ while (*myp > 0)
+ {
+ (*myp) ++;
+ callme ();
+ }
+
+ pthread_exit (NULL);
+}
+
+
+int
+main ()
+{
+ int res;
+ long i;
+
+ /* Block signals in all threads but one, so that we're sure which
+ thread gets the signal we send from the command line. */
+ block_signals ();
+
+ signal (SIGUSR1, handler);
+
+ /* Call these early so that PLTs for these are resolved soon,
+ instead of in the threads. RTLD_NOW should work as well. */
+ usleep (0);
+ pthread_barrier_init (&barrier, NULL, 1);
+ pthread_barrier_wait (&barrier);
+
+ pthread_barrier_init (&barrier, NULL, 2);
+
+ /* The test uses this global to know where to send the signal
+ to. */
+ pid = getpid ();
+
+ i = 0;
+ args[i] = 1;
+ res = pthread_create (&child_thread_2,
+ NULL, child_function_2, (void *) i);
+ pthread_barrier_wait (&barrier);
+ callme (); /* set wait-thread-2 breakpoint here */
+
+ i = 1;
+ args[i] = 1;
+ res = pthread_create (&child_thread_3,
+ NULL, child_function_3, (void *) i);
+ pthread_barrier_wait (&barrier);
+ callme (); /* set wait-thread-3 breakpoint here */
+
+ pthread_join (child_thread_2, NULL);
+ pthread_join (child_thread_3, NULL);
+
+ exit(EXIT_SUCCESS);
+}
Index: src/gdb/testsuite/gdb.threads/step-after-sr-lock.exp
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ src/gdb/testsuite/gdb.threads/step-after-sr-lock.exp 2011-05-23 16:22:17.608753678 +0100
@@ -0,0 +1,122 @@
+# Copyright (C) 2011 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+# Test that GDB doesn't inadvertently resume the stepped thread when a
+# signal arrives while stepping over the breakpoint that last caused a
+# stop, when the thread that hit that breakpoint is not the stepped
+# thread.
+
+set testfile "step-after-sr-lock"
+set srcfile ${testfile}.c
+set binfile ${objdir}/${subdir}/${testfile}
+
+if [target_info exists gdb,nosignals] {
+ verbose "Skipping ${testfile}.exp because of nosignals."
+ return -1
+}
+
+if { [is_remote target] } {
+ return -1
+}
+
+if {[gdb_compile_pthreads "${srcdir}/${subdir}/${srcfile}" "${binfile}" \
+ executable [list debug "incdir=${objdir}"]] != "" } {
+ return -1
+}
+
+proc get_value {var test} {
+ global expect_out
+ global gdb_prompt
+ global decimal
+
+ set value -1
+ gdb_test_multiple "print $var" "$test" {
+ -re ".*= ($decimal).*\r\n$gdb_prompt $" {
+ set value $expect_out(1,string)
+ pass "$test"
+ }
+ }
+ return ${value}
+}
+
+# Start with a fresh gdb.
+
+gdb_exit
+gdb_start
+gdb_reinitialize_dir $srcdir/$subdir
+
+gdb_load ${binfile}
+
+runto_main
+
+gdb_breakpoint [gdb_get_line_number "set wait-thread-2 breakpoint here"]
+gdb_continue_to_breakpoint "run to wait-thread-2 breakpoint"
+gdb_test "info threads" "" "info threads with thread 2"
+
+gdb_breakpoint [gdb_get_line_number "set wait-thread-3 breakpoint here"]
+gdb_continue_to_breakpoint "run to breakpoint"
+gdb_test "info threads" "" "info threads with thread 3"
+
+set testpid [get_value "pid" "get pid of inferior"]
+
+gdb_test "set scheduler-locking on"
+
+gdb_breakpoint [gdb_get_line_number "set breakpoint child_two here"]
+gdb_breakpoint [gdb_get_line_number "set breakpoint child_one here"]
+
+gdb_test "thread 3" "" "switch to thread 3 to run to its breakpoint"
+gdb_continue_to_breakpoint "run to breakpoint in thread 3"
+
+gdb_test "thread 2" "" "switch to thread 2 to run to its breakpoint"
+gdb_continue_to_breakpoint "run to breakpoint in thread 2"
+
+delete_breakpoints
+
+gdb_test "b *\$pc" "" "set breakpoint to be stepped over"
+# Make sure the first loop breaks without hitting the breakpoint
+# again.
+gdb_test "p *myp = 0" " = 0" "force loop break in thread 2"
+
+# We want "print" to make sure the target reports the signal to the
+# core.
+gdb_test "handle SIGUSR1 print nostop pass" "" ""
+
+# Queue a signal in thread 2.
+remote_exec host "kill -SIGUSR1 ${testpid}"
+
+gdb_test "thread 3" "" "switch to thread 3 for stepping"
+set my_number [get_value "my_number" "get my_number"]
+set cnt_before [get_value "args\[$my_number\]" "get count before step"]
+gdb_test "set scheduler-locking off"
+
+# Make sure we're exercising the paths we want to.
+gdb_test "set debug infrun 1"
+
+gdb_test \
+ "step" \
+ ".*prepare_to_proceed \\(step=1\\), switched to.*signal arrived while stepping over breakpoint.*switching back to stepped thread.*stepped to a different line.*callme.*" \
+ "step"
+
+set cnt_after [get_value "args\[$my_number\]" "get count after step"]
+
+# Test that GDB doesn't inadvertently resume the stepped thread when a
+# signal arrives while stepping over a breakpoint in another thread.
+
+set test "stepped thread under control"
+if { $cnt_before + 1 == $cnt_after } {
+ pass $test
+} else {
+ fail $test
+}