This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: How to debug broken unwinding?


On Freitag, 2. Juni 2017 15:26:10 CEST Milian Wolff wrote:
> On Donnerstag, 1. Juni 2017 22:57:12 CEST Milian Wolff wrote:
> > Hey all,
> 
> <snip>
> 
> >  heaptrack_gui  2228 135073.400474:     613969 cycles:
> >                   108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
> > 
> > @@ -533,8 +529,6 @@
> > 
> >                   2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
> >                   297c53 QCoreApplicationPrivate::init (/usr/lib/
> > 
> > libQt5Core.so.5.8.0)
> > 
> >                    f7cde QGuiApplicationPrivate::init
> > 
> > (/usr/lib/libQt5Gui.so. 5.8.0)
> > -                 1589e8 QApplicationPrivate::init
> > (/usr/lib/libQt5Widgets.so. 5.8.0)
> > -                  78622 main (/home/milian/projects/compiled/other/bin/
> > heaptrack_gui)
> > 
> >                    20439 __libc_start_main (/usr/lib/libc-2.25.so)
> >                    78299 _start (/home/milian/projects/compiled/other/bin/
> > 
> > heaptrack_gui)
> > 
> > NOTE: this is super odd, it simply misses two frames in the middle?!
> 
> This is really quite odd - looking at the debug output, the frames in the
> middle are really just skipped for some reason:
> 
> unwind: access_mem addr 0x7ffca0a88330, val 4edc50, offset 2808
> unwind: access_mem addr 0x7ffca0a88338, val 7f69bfce443a, offset 2816
> unwind: pc: = 0x7f69c10fecde
> found map: 7f69c1007000 7f69c1766000
> dso found: libQt5Gui.so.5.8.0 /usr/lib/libQt5Gui.so.5.8.0
> reported: libQt5Gui.so.5.8.0 /usr/lib/libQt5Gui.so.5.8.0, 1
> unwind: QGuiApplicationPrivate::init():ip = 0x7f69c10fecde (0xf7cde)
> 
> -> so far so good, this frame is properly found inside libQt5Gui, but then:
> 
> unwind: pc: = 0x7f69bfce4439
> found map: 7f69bfcc4000 7f69c0069000
> dso found: libc-2.25.so /usr/lib/libc-2.25.so
> reported: libc-2.25.so /usr/lib/libc-2.25.so, 1
> unwind: __libc_start_main:ip = 0x7f69bfce4439 (0x20439)
> 
> -> the next frame is is supposedly the one in libc, but what happened to the
> two frames in QApplicationPrivate::init and main? I also note that no calls
> to access_mem are occuring - is this maybe some (wrong) caching in libdw or
> so that interfers here?
> 
> Any insight would be appreciated, thanks!

Some more debugging and going after my gut feeling brings me to the following 
conclusion: The real issue seems to be the on-demand reporting of the elf 
file. We used to do:

       Dwarf_Addr pc;
       bool isactivation;
 
       if (!dwfl_frame_pc(state, &pc, &isactivation)) {
               pr_err("%s", dwfl_errmsg(-1));
               return DWARF_CB_ABORT;
       }

       // report the module before we query for isactivation
       report_module(pc, ui);

This looks safe and fine and actually works most of the time. But passing a 
non-null isactivation flag to dwfl_frame_pc potentially leads to a second 
unwind step, before we got the change to report the module! I can workaround 
this by instead doing

       Dwarf_Addr pc;
       bool isactivation;
 
       if (!dwfl_frame_pc(state, &pc, NULL)) {
               pr_err("%s", dwfl_errmsg(-1));
               return DWARF_CB_ABORT;
       }

       // report the module before we query for isactivation
       report_module(pc, ui);

       if (!dwfl_frame_pc(state, &pc, &isactivation)) {
               pr_err("%s", dwfl_errmsg(-1));
               return DWARF_CB_ABORT;
       }

This fixes all the issues in my original email. So sorry for the noise - it 
doesn't see as if the unwinding code in elfutils is broken - quite the 
contrary! It's just our misuse of the API that is to blame.

May I suggest though to move the isactivation code into a separate function to 
prevent this issue from arising in the future? I.e. it would be nice if the 
code above could read:


       Dwarf_Addr pc;
       bool isactivation;
 
       if (!dwfl_frame_pc(state, &pc)) {
               pr_err("%s", dwfl_errmsg(-1));
               return DWARF_CB_ABORT;
       }

       // report the module before we query for isactivation
       report_module(pc, ui);

       if (!dwfl_frame_is_activation(state)) {
           --pc; 
       }


Thanks
-- 
Milian Wolff
mail@milianw.de
http://milianw.de

Attachment: signature.asc
Description: This is a digitally signed message part.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]