This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: RFC: skip_inline_frames failed assertion resuming from breakpoint on LynxOS


[Fixing ENOPATCH... sigh.]
> I was wondering what you guys would think of a patch like this.
> I am a bit uncertain, because I don't understand everything
> that is happening - and the problem is that this is happening
> with a fairly massive and complex program that I don't have access
> to, on a system that is also fairly opaque. When I'm lucky, getting
> answers is only very hard.
> 
> I am still trying to reproduce the problem locally in order to
> find out more, but I couldn't understand why, in principle,
> one thread couldn't receive multiple notifications during
> the same single-step if the system decides to queue up signals?
> If that were the case, wouldn't the attached patch make sense?
> (currently untested against the program that triggered the issue,
> as I think I understand how inline-frame works, and what it does,
> but I am not sure I get it all).

Thanks again!
-- 
Joel
From f7ad35aa92a7007194582b1e23a110fc06b50cd1 Mon Sep 17 00:00:00 2001
From: Joel Brobecker <brobecker@adacore.com>
Date: Thu, 20 Nov 2014 08:38:08 +0400
Subject: [PATCH] skip_inline_frames failed assertion resuming from breakpoint
 on LynxOS

A user reported a failed assertion while debugging their program
on a LynxOS system (thus via GDBserver), when trying to resume
the program's execution after having reached a breakpoint:

    (gdb) continue
    [...]
    ../../src/gdb/inline-frame.c:339: internal-error: skip_inline_frames: Assertion `find_inline_frame_state (ptid) == NULL' failed.

Turning on infrun debug traces helps to understand a little better
what happens:

    (gdb) continue
    Continuing.
    infrun: clear_proceed_status_thread (Thread 126)
    [...]
    infrun: clear_proceed_status_thread (Thread 142)
    [...]
    infrun: clear_proceed_status_thread (Thread 146)
    infrun: clear_proceed_status_thread (Thread 125)
    infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT, step=0)
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 142] at 0x10684838
    infrun: wait_for_inferior ()
    infrun: target_wait (-1, status) =
    infrun:   42000 [Thread 146],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_REALTIME_34
    infrun: infwait_normal_state
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x10a187f4
    infrun: context switch
    infrun: Switching context from Thread 142 to Thread 146
    infrun: random signal (GDB_SIGNAL_REALTIME_34)
    infrun: switching back to stepped thread
    infrun: Switching context from Thread 146 to Thread 142
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 142] at 0x10684838
    infrun: prepare_to_wait
    [...handling of similar events for threads 145, 144 and 143 snipped...]
    infrun: prepare_to_wait
    infrun: target_wait (-1, status) =
    infrun:   42000 [Thread 146],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_REALTIME_34
    infrun: infwait_normal_state
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x10a187f4
    infrun: context switch
    infrun: Switching context from Thread 142 to Thread 146
    ../../src/gdb/inline-frame.c:339: internal-error: skip_inline_frames: Assertion `find_inline_frame_state (ptid) == NULL' failed.

It all happens while we're trying to single-step out of the breakpoint.
We keep resuming the inferior to single-step the thread that hit the
breakpoint (Thread 142), but each time we get a notification that
another thread received a signal (GDB_SIGNAL_REALTIME_34). This is
fine until the same thread (Thread 146 here) reports a signal a second
time without having run any further (same stop_pc). That's when we
hit the assertion in skip_inline_frames.

This patch avoids the assertion by recognizing that a thread can
indeed receive multiple events without changing PC, and therefore
changes skip_inline_frames to return immediately if we have already
computed the inline_state for this thread.

gdb/ChangeLog:

        * inline-frame.c (skip_inline_frames): Do not raise a failed
        assertion if find_inline_frame_state finds an inlined frame
        state for PTID.  Return early instead.

Tested on x86_64-linux.
---
 gdb/inline-frame.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/gdb/inline-frame.c b/gdb/inline-frame.c
index cecb2af..c60820c 100644
--- a/gdb/inline-frame.c
+++ b/gdb/inline-frame.c
@@ -307,6 +307,24 @@ skip_inline_frames (ptid_t ptid)
   int skip_count = 0;
   struct inline_state *state;
 
+  if (find_inline_frame_state (ptid) != NULL)
+    {
+      /* This thread is receiving multiple notifications without
+	 making progress in its execution (same PC).
+
+	 This was seen happening on LynxOS where a program appears
+	 to have a number of signals being queued then delivered
+	 while trying to single-step a thread out of a breakpoint.
+	 The single-step operation makes no progress until all signals
+	 get delivered first, which can result in the same thread
+	 receiving multiple signals during the same single-step
+	 attempt.
+
+	 We have already computed the inline_state for that thread,
+	 so there is no need to redo it again.  */
+      return;
+    }
+
   /* This function is called right after reinitializing the frame
      cache.  We try not to do more unwinding than absolutely
      necessary, for performance.  */
@@ -335,7 +353,6 @@ skip_inline_frames (ptid_t ptid)
 	}
     }
 
-  gdb_assert (find_inline_frame_state (ptid) == NULL);
   state = allocate_inline_frame_state (ptid);
   state->skipped_frames = skip_count;
   state->saved_pc = this_pc;
-- 
1.9.1

