This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



Notes on a frame_unwind_address_in_block problem


I'm just going to write up this problem, which I've been working on for the
last couple of hours, in hopes that someone else has a bright idea - or so
I can at least find my notes next time it bugs me.  Mark, if you have a
moment, there's a question for you all the way down at the bottom, in
the 32-bit section :-)


I. 64-bit

On my AMD64 GNU/Linux system, sigaltstack.exp is currently failing both
32-bit and 64-bit with similar symptoms.  Saying "finish" from a signal
handler fails to stop in the trampoline.  We correctly insert the
breakpoint, but the frame ID doesn't match: get_frame_id (get_prev_frame
(get_current_frame ())) when in the signal handler does not equal
get_frame_id (get_current_frame ()) once we've returned to the trampoline.

The causes are quite different despite the similar symptoms.  On 64-bit,
it appears to be partly related to bogus call frame information in glibc's
__restore_rt - there's dwarf2 info but (A) it doesn't start until the first
instruction of the trampoline, when for signal trampolines it ought to start
one byte earlier, and (B) it doesn't describe the signal frame at all.

Another problem is that sometimes we get the amd64-specific frame unwinder
for this code and sometimes we get the dwarf2 unwinder.  When $rip is in
__restore_rt, the dwarf2 frame sniffer successfully attaches to it.  At that
point, because of the bogus CFI, we can't backtrace.  But when the trampoline
is further up the stack, the dwarf2 sniffer doesn't claim it, because the
unwind information doesn't start one byte early (as we seem to have concluded
that it ought to).  So instead the amd64 fallback sniffer gets control,
identifies it as a sigtramp, and sets up for real backtraces.

I think that correct CFI in glibc for the signal restore trampolines will
sort out the 64-bit case.  Or no CFI, but glibc would probably want correct
CFI for other reasons.


II. 32-bit

For 32-bit, though, it gets even more interesting.  Here's where $SUBJECT
comes into play.  I have a loaded vDSO (virtual dynamic shared object), which
exports __kernel_sigreturn.  This points at the first instruction of the
trampoline, i.e. one byte after the start of the dwarf2 FDE.  When I am
stopped in the signal handler, frame_unwind_address_in_block decides to
subtract one.  That points before the symbol, so the frame ID's function
ends up being NULL; there's no symbol covering that address.  Then when we
arrive at the signal trampoline during "finish", we no longer subtract one
- since we're at an executable instruction in the topmost frame - and thus
we do find the symbol.  The two frame IDs don't compare equal.
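
To make the mismatch concrete, here is a toy model of the two lookups.  It
is only a sketch: the addresses are made up, the helper names
(toy_lookup_function_start, toy_address_in_block, toy_frame_id_code_addr)
are hypothetical and are not GDB functions, and the "subtract one for
caller frames" rule is reduced to a single flag.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t addr_t;

/* Made-up layout: the symbol (a stand-in for __kernel_sigreturn) starts
   at 0x1001, one byte after the start of its FDE at 0x1000.  */
static const addr_t toy_sym_start = 0x1001;
static const addr_t toy_sym_end = 0x1009;	/* exclusive */

/* Return the start of the symbol covering PC, or 0 if no symbol does.  */
static addr_t
toy_lookup_function_start (addr_t pc)
{
  return (pc >= toy_sym_start && pc < toy_sym_end) ? toy_sym_start : 0;
}

/* Simplified stand-in for the adjustment described above: the innermost
   frame uses PC as-is, a caller frame looks at PC - 1.  */
static addr_t
toy_address_in_block (addr_t pc, int innermost)
{
  return innermost ? pc : pc - 1;
}

/* The code address that would end up in the frame ID.  */
static addr_t
toy_frame_id_code_addr (addr_t pc, int innermost)
{
  return toy_lookup_function_start (toy_address_in_block (pc, innermost));
}

/* Same resume address, two different answers.  */
void
toy_show_mismatch (void)
{
  /* Seen from the signal handler: PC - 1 is before the symbol.  */
  assert (toy_frame_id_code_addr (0x1001, 0) == 0);
  /* Stopped at the trampoline itself: the symbol is found.  */
  assert (toy_frame_id_code_addr (0x1001, 1) == 0x1001);
}
```

The resume address is 0x1001 in both cases; only the frame's position on
the stack differs, and that alone changes the code half of the ID.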

One possible solution is the nasty patch I have in my working directory,
which boils down to this:

   if (next_frame->level >= 0
-      && get_frame_type (next_frame) == NORMAL_FRAME)
+      && get_frame_type (next_frame) == NORMAL_FRAME
+      && (next_frame->prev == NULL
+         || next_frame->prev->unwind == NULL
+         || get_frame_type (next_frame->prev) == NORMAL_FRAME))
     --pc;

But this makes frame_unwind_address_in_block change its behavior over time
for the same frame, which is awful.

Another solution would be to use the FDE start address as the code address
for dwarf2 signal frame IDs, instead of the function's start address.  This
would work on
the assumption that a single FDE would generally cover the entire trampoline
- a reasonable assumption, I think, and the consequences for the frame ID
changing while single-stepping are less disruptive here than the
alternative.
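
A toy model of why the FDE start is stable, under the assumption stated
above that one FDE covers the whole trampoline including the byte before
the symbol.  The addresses and the names (struct toy_fde, toy_find_fde,
toy_signal_frame_code_addr) are hypothetical, not GDB's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t code_addr;

/* Made-up FDE: starts at 0x1000, one byte before the trampoline's
   first instruction at 0x1001, and covers 9 bytes.  */
struct toy_fde
{
  code_addr initial_location;
  code_addr length;
};

static const struct toy_fde toy_tramp_fde = { 0x1000, 9 };

/* Return the FDE covering PC, or NULL if there is none.  */
static const struct toy_fde *
toy_find_fde (code_addr pc)
{
  if (pc >= toy_tramp_fde.initial_location
      && pc < toy_tramp_fde.initial_location + toy_tramp_fde.length)
    return &toy_tramp_fde;
  return NULL;
}

/* Code address for a signal frame ID: the FDE start, whether or not
   the lookup address was adjusted by one.  */
static code_addr
toy_signal_frame_code_addr (code_addr pc, int innermost)
{
  const struct toy_fde *fde = toy_find_fde (innermost ? pc : pc - 1);

  return fde != NULL ? fde->initial_location : 0;
}

void
toy_show_stable_id (void)
{
  /* Caller frame (PC - 1) and innermost frame (PC) now agree.  */
  assert (toy_signal_frame_code_addr (0x1001, 0) == 0x1000);
  assert (toy_signal_frame_code_addr (0x1001, 1) == 0x1000);
}
```

Both lookups land inside the same FDE, so the code address is 0x1000 either
way.  If the trampoline were split across several FDEs this would stop
holding, which is the assumption called out above.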

Mark, what do you think of that idea?  It seems to work.  It looks like the
patch at the end of this message.

-- 
Daniel Jacobowitz
CodeSourcery

2006-07-06  Daniel Jacobowitz  <dan@codesourcery.com>

	* dwarf2-frame.c (struct dwarf2_frame_cache): Add fde_start.
	(dwarf2_frame_cache): Set it.
	(dwarf2_signal_frame_this_id): New function.
	(dwarf2_signal_frame_unwind): Use it.

Index: dwarf2-frame.c
===================================================================
RCS file: /cvs/src/src/gdb/dwarf2-frame.c,v
retrieving revision 1.63
diff -u -p -r1.63 dwarf2-frame.c
--- dwarf2-frame.c	28 May 2006 05:56:50 -0000	1.63
+++ dwarf2-frame.c	6 Jul 2006 22:20:10 -0000
@@ -725,6 +725,9 @@ struct dwarf2_frame_cache
   /* Set if the return address column was marked as undefined.  */
   int undefined_retaddr;
 
+  /* The FDE start address.  */
+  CORE_ADDR fde_start;
+
   /* Saved registers, indexed by GDB register number, not by DWARF
      register number.  */
   struct dwarf2_frame_state_reg *reg;
@@ -775,6 +778,7 @@ dwarf2_frame_cache (struct frame_info *n
   /* Find the correct FDE.  */
   fde = dwarf2_frame_find_fde (&fs->pc);
   gdb_assert (fde != NULL);
+  cache->fde_start = fde->initial_location;
 
   /* Extract any interesting information from the CIE.  */
   fs->data_align = fde->cie->data_alignment_factor;
@@ -931,6 +935,19 @@ dwarf2_frame_this_id (struct frame_info 
 }
 
 static void
+dwarf2_signal_frame_this_id (struct frame_info *next_frame, void **this_cache,
+			     struct frame_id *this_id)
+{
+  struct dwarf2_frame_cache *cache =
+    dwarf2_frame_cache (next_frame, this_cache);
+
+  if (cache->undefined_retaddr)
+    return;
+
+  (*this_id) = frame_id_build (cache->cfa, cache->fde_start);
+}
+
+static void
 dwarf2_frame_prev_register (struct frame_info *next_frame, void **this_cache,
 			    int regnum, int *optimizedp,
 			    enum lval_type *lvalp, CORE_ADDR *addrp,
@@ -1095,7 +1112,7 @@ static const struct frame_unwind dwarf2_
 static const struct frame_unwind dwarf2_signal_frame_unwind =
 {
   SIGTRAMP_FRAME,
-  dwarf2_frame_this_id,
+  dwarf2_signal_frame_this_id,
   dwarf2_frame_prev_register
 };
 

