This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.



Re: Using dwfl to enumerate frames of current thread


Mark Wielaard <mjw@redhat.com> writes:

> On Sat, Aug 22, 2015 at 12:18:46PM +0200, Ben Gamari wrote:
>> 
>> I actually ask because while we don't produce or use the .debug_ranges
>> section, they sneak into our executables from the C runtime system
>> objects. However, whatever is emitting it seems to be doing a
>> questionable job,
>> 
>>     $ objdump --dwarf-check -e inplace/lib/bin/ghc-stage2 -x > /dev/null
>>     objdump: Warning: There is an overlap [0x780 - 0x750] in .debug_ranges section.
>
> I wonder why binutils objdump is really warning about that.
> In this case it looks like the pc ranges might indeed overlap, but that is
> somewhat expected. I don't think it is actually wrong. e.g. this one seems
> to be two nested lexical blocks, the outer block [750] obviously will
> overlap with the pc ranges of the inner block [780]:
>
>  [c1bdf9]            lexical_block
>                      ranges               (sec_offset) range list [   750]
>  [c1bdfe]              variable
>                        abstract_origin      (ref4) [c1bad1]
>                        location             (sec_offset) location list [  7269]
>                        [...]
>  [c1be46]              lexical_block
>                        ranges               (sec_offset) range list [   780]
>  [c1be4b]                variable
>                          abstract_origin      (ref4) [c1bb2e]
>                          location             (exprloc) 
>                           [   0] fbreg -68

Ahh, interesting. Right, I've been ignoring this for now but perhaps
I'll ask the binutils folks for some clarification just to make sure.

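
To illustrate why an overlap between the two range lists is expected for nested lexical blocks, here is a minimal sketch in C. It models a range list as an array of half-open [low, high) pairs; the struct, the function names and the addresses are all made up for illustration and are not binutils or elfutils code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One half-open pc range [low, high), in the spirit of a .debug_ranges entry. */
struct pc_range { uint64_t low, high; };

/* True if the two lists share at least one address. */
bool lists_overlap(const struct pc_range *a, size_t na,
                   const struct pc_range *b, size_t nb)
{
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (a[i].low < b[j].high && b[j].low < a[i].high)
                return true;
    return false;
}

/* True if every range of inner lies inside a single range of outer. */
bool list_nested_in(const struct pc_range *inner, size_t ni,
                    const struct pc_range *outer, size_t no)
{
    for (size_t i = 0; i < ni; i++) {
        bool covered = false;
        for (size_t j = 0; j < no; j++)
            if (inner[i].low >= outer[j].low && inner[i].high <= outer[j].high)
                covered = true;
        if (!covered)
            return false;
    }
    return true;
}

/* Hypothetical addresses, not taken from ghc-stage2. */
const struct pc_range outer_block[] = { {0x1000, 0x1040}, {0x1080, 0x10c0} };
const struct pc_range inner_block[] = { {0x1010, 0x1020}, {0x1090, 0x10a0} };
```

With these values lists_overlap() is true simply because list_nested_in(inner_block, ..., outer_block, ...) is true, which is the harmless case Mark describes. An overlap between sibling blocks would be a different matter.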
>
>> You'll be pleased to know that libdw has worked out quite well. For
>> instance, with my patch [1] GHC can produce a backtrace like,
>> 
...
>> 
>> There's still a fair amount of work left to integrate this fully, but at
>> least the tricky DWARF work is done.
>
> Very nice!
> Any idea why the libdw.so/dwfl_ functions don't have any info?
> Were they simply built without debuginfo?
>
Right, I believe in this case I was running against my distribution's
libdw, for which I have no debuginfo.

>> [1] https://phabricator.haskell.org/D1156
>
> Some quick answers to some of the questions there (I didn't read the full
> bug report or the patch, so please let me know if you have any specific
> questions):
>
Great! Thank you very much for taking the time to do this.

> - portability of elfutils.
>   It is ported across a lot of arches on GNU/Linux.
>   i386, x86_64, ppc, ppc64, ppc64le, s390x, arm and aarch64 are at least
>   regularly tested (should be zero fail at release time) and there are
>   other ports in the backends both in tree [alpha, ia64, sparc, tilegx]
>   and some not yet merged out of tree [mips, m68k, hppa].
>   In theory it should also work on other ELF/DWARF based systems like
>   *BSD, Debian has some limited success with kfreebsd, and Solaris. But
>   there are some tricky dependencies of some of the dwfl functions on the
>   /proc file system and ptrace, not all of them have clean backend/ebl
>   functions. Darwin/MacOS is a bit harder since it doesn't use ELF and
>   libdw currently depends on the DWARF container being ELF (and I have
>   no idea what the ptrace/proc story is on Darwin). Windows is probably
>   pretty hard given that it doesn't natively support ELF, DWARF or
>   ptrace/proc.
>
Right, we have a number of Darwin users and I still haven't figured out
how we might support them. That being said I personally have little
interest in putting time into proprietary platforms so I'm not terribly
concerned.

On this note, would you be willing to accept a patch adding
dwfl_attach_local() functionality? My x86-64 implementation appears to
work, although it could probably use a second set of eyes. I may also be
able to provide ARM and i386 implementations.

> - As I said before between libunwind/elfutils and libbacktrace I actually
>   would have expected libbacktrace to be the easiest for you to use
>   since it is actually designed for in-process unwinding. libunwind
>   tries to do both in- and out-of-process unwinding, which I think is a
>   little confusing, and has much less other functionality than elfutils
>   with respect to model process memory, libraries, ELF, DWARF and symbol
>   inspection. And elfutils really only tries to support out-of-process
>   unwinding (but you happily managed to make it do in-process anyway, so
>   maybe our design isn't so bad). Now that you got your DWARF/CFI correct
>   I would give libbacktrace another go.
>
This would be interesting to try. I've been trying to keep the design
open to supporting multiple unwinding backends, so it should be easy to
dust off my libbacktrace code and try it out.

> - To mark an end of stack you should set the CFI rule for the return
>   register to undefined. See section 6.4.4, Call Frame Calling Address,
>   of the DWARF specification.
>   On x86_64 the return register is often just equal to rip and so
>   using .cfi_undefined rip (in gas assembler) would do the trick.
>   In general you can find fun and wonderful CFI describing interesting
>   register unwinding tricks in glibc internals (try start.S, clone.S and
>   __longjmp.S).
>
Right, this is definitely a useful hint. That being said, Haskell
code is typically called from C code. Ideally we'd be able to resume
unwinding the C stack, not simply terminate unwinding. I think this
should be possible but first I need to work out the RTS entry/exit
convention.
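
For reference, the effect of an undefined return address rule on an unwinder can be sketched with a toy model in C. This is not how libdw is implemented; the enum, the struct and the table below are hypothetical and only show the control flow: the walk stops cleanly at the frame whose rule is undefined, and otherwise continues to the caller.

```c
/* Toy subset of CFI register rules for the return address column. */
enum ra_rule { RA_UNDEFINED, RA_SAVED };

/* Hypothetical per-function unwind info. The caller index stands in for
   the function that the recovered return address would point into. */
struct toy_fde {
    const char *name;
    enum ra_rule ra;
    int caller;   /* index of the calling function; ignored if RA_UNDEFINED */
};

/* Walk from start towards the outermost frame. Returns the number of
   frames visited, or -1 if max_frames was reached without an end marker. */
int toy_unwind(const struct toy_fde *table, int start, int max_frames)
{
    int cur = start;
    for (int depth = 1; depth <= max_frames; depth++) {
        if (table[cur].ra == RA_UNDEFINED)
            return depth;          /* outermost frame: unwinding is complete */
        cur = table[cur].caller;   /* otherwise step to the caller */
    }
    return -1;
}

const struct toy_fde toy_table[] = {
    { "rts_entry",    RA_UNDEFINED, 0 },  /* as if marked with .cfi_undefined rip */
    { "haskell_main", RA_SAVED,     0 },
    { "worker",       RA_SAVED,     1 },
};
```

Resuming into the C stack instead of terminating would correspond to giving rts_entry a real RA_SAVED rule that recovers the C caller's return address, which is the part that depends on the RTS entry/exit convention.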

> - Yes, perf can use elfutils to do unwinding. It does this "after the fact".
>   It has an initial registers handler and memory read handler like you
>   probably made for the in-process Dwfl_Thread_Callbacks. But they use
>   the dumped register and partial stack dump they made during runtime
>   to do the actual unwinding. So this only works if the CFI in your
>   binary is complete and it is (mostly) expressed through the contents
>   of the initial register dump and the stack values (which is almost always
>   the case).
>
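
A minimal sketch of the "memory read from a partial stack dump" idea, in C. It is only similar in spirit to a memory read handler; the struct and function here are hypothetical and do not use the perf or libdw types. The point is that a read succeeds only when the requested word lies entirely inside the captured region, so unwinding fails as soon as the CFI needs a value that was not dumped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical snapshot of part of a thread's stack, captured at sample time. */
struct stack_dump {
    uint64_t base;              /* address of data[0] in the sampled process */
    size_t size;                /* number of valid bytes in data */
    const unsigned char *data;
};

/* Read one 64-bit word at addr from the snapshot. Returns false when the
   word is not fully inside the captured region. */
bool dump_read_word(const struct stack_dump *d, uint64_t addr, uint64_t *out)
{
    if (addr < d->base)
        return false;
    uint64_t off = addr - d->base;
    if (off > d->size || d->size - off < sizeof *out)
        return false;
    memcpy(out, d->data + off, sizeof *out);
    return true;
}
```
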
> - We could in theory try to cleanup my hack to not need .debug_aranges
>   if we really want to. But I hope we don't now that you have it :)
>
This shouldn't be necessary. Thanks for the offer though!

> - Why have pc ranges both in the CU and in .debug_aranges (pointing to
>   the CUs)? Because they technically describe different things. The
>   ranges given in the CU are the covered program scope entries (code).
>   While .debug_aranges give the ranges of code and data object addresses
>   described by the CU (although I believe in practice it really is the
>   same and even .debug_aranges only has the code ranges). Secondly it
>   is really mildly more efficient since .debug_aranges is small and
>   compact and doesn't refer to other data sections (the CUs are all
>   spread out in the .debug_info section and can potentially point
>   into the .debug_ranges section when the CU uses DW_AT_low_pc plus
>   DW_AT_ranges to describe more complex pc ranges).
>
Ahhh, I see. Thanks!
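
To make the efficiency point concrete, here is a small C sketch of an address-to-CU lookup over a compact table. It assumes the tuples have already been sorted by start address and do not overlap, which is an assumption of this sketch and not something the section itself guarantees; the struct, function and values are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for one address range tuple plus the CU it points at. */
struct toy_arange {
    uint64_t start;      /* first address covered */
    uint64_t length;     /* number of bytes covered */
    uint64_t cu_offset;  /* offset of the owning CU in .debug_info */
};

/* Binary search over a table sorted by start with non-overlapping entries.
   Returns the matching entry, or NULL if no CU claims addr. */
const struct toy_arange *toy_arange_find(const struct toy_arange *t,
                                         size_t n, uint64_t addr)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (addr < t[mid].start)
            hi = mid;
        else if (addr - t[mid].start >= t[mid].length)
            lo = mid + 1;
        else
            return &t[mid];
    }
    return NULL;
}

/* Made-up ranges and CU offsets. */
const struct toy_arange toy_aranges[] = {
    { 0x1000, 0x200, 0x0b  },
    { 0x1200, 0x080, 0x9c  },
    { 0x2000, 0x400, 0x1f4 },
};
```

The lookup touches only this one small table. Doing the same from the CUs would mean visiting each CU DIE in .debug_info and possibly following DW_AT_ranges into .debug_ranges.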

Cheers,

- Ben


