This is the mail archive of the frysk@sourceware.org mailing list for the frysk project.

Roundtable, breakpoints and lots of unwinding (Was: meeting 2007-08-15 9:30 us east coast time)

From: Mark Wielaard <mark at klomp dot org>
To: Andrew Cagney <cagney at redhat dot com>
Cc: frysk at sourceware dot org
Date: Tue, 21 Aug 2007 11:21:14 +0200
Subject: Roundtable, breakpoints and lots of unwinding (Was: meeting 2007-08-15 9:30 us east coast time)
References: <46C257B2.5030903@redhat.com> <46C2FDB9.2090800@redhat.com>

Hi Andrew,

That was a nice summary and overview of the things people are working
on. Thanks. Maybe we can add a wiki somewhere to keep these overviews up
to date. Here are some additions and details on my current work.

> mjw: bug fixes for stepping;

Low level stepping of breakpoints in particular. This all started with
the demo of TestUpdatingDisplayValue, which in the end produced a
reproducer for http://sourceware.org/bugzilla/show_bug.cgi?id=4747
That bug was just one of the ways in which stepping a breakpoint could
break when a signal becomes pending at the same time, something that is
especially nasty when doing out of line stepping.

There is now a low-level instruction stepping test framework:
http://sourceware.org/bugzilla/show_bug.cgi?id=4763
http://sourceware.org/ml/frysk/2007-q3/msg00118.html
There is also a low-level test framework for testing signals being
raised while stepping a breakpoint (user installed signal handlers are
not covered for now):
http://sourceware.org/ml/frysk/2007-q3/msg00301.html

The work by Petr on fltrace exposed some extra issues, and Petr also
helped by adding some extra tests. It did take more time than I had
originally thought, but I believe that with the above the code is a lot
more robust and better supports what Petr is doing. In particular the
following bugs have now been closed:
http://sourceware.org/bugzilla/show_bug.cgi?id=4889
http://sourceware.org/bugzilla/show_bug.cgi?id=4894

There are still 3 issues I know of with low-level stepping of
breakpoints (none of which I am currently actively working on, but I
expect fltrace to hit them, so when that work is progressing we might
want to go back to these issues):

http://sourceware.org/bugzilla/show_bug.cgi?id=4847
Stepping Trap instructions (in particular inside a trap signal handler)
is broken (again) on newer kernels. Both stepping of a trap instruction
and stepping of the trap handler need to be fully special cased, and
the kernel doesn't help because it seems to change behavior every other
release. I am not working on this until someone finds a real use case
for it.

http://sourceware.org/bugzilla/show_bug.cgi?id=4762
We don't have a real Instruction parser (x86/x86_64) for the out of
line single stepping framework (see IA32InstructionParser), which means
that for almost all instructions we are actually using reset breakpoint
stepping (the breakpoint is temporarily removed while the instruction
is stepped), which misses breakpoints when used with multiple threads.
I am not currently working on this one either, but I suspect that the
new fltrace work that Petr is doing will soon hit this limitation, at
which time we should either finish the instruction parser or introduce
stop-the-world stepping to make things more robust.

http://sourceware.org/bugzilla/show_bug.cgi?id=4895
Low level breakpoints are visible to all Tasks, not just the Task that
requested them. This seemed to be a good idea back when they were added,
since low-level breakpoints are essentially Proc based, but it confuses
some users because they have to check the Task argument in their
updateHit() handler. The workaround is easy, so this is not very
important, but it would be better if this was cleaned up.
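
The workaround for bug 4895 could look roughly like the sketch below.
Task, Action and CodeObserver are simplified stand-ins written for this
example, not the real frysk classes. Only the updateHit() name and its
Task argument come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; the real frysk Task carries far more state.
final class Task {
    final int tid;
    Task(int tid) { this.tid = tid; }
}

// Hypothetical stand-in for what the observer tells the core to do next.
enum Action { CONTINUE, BLOCK }

// Simplified model of a low-level breakpoint (code) observer callback.
interface CodeObserver {
    Action updateHit(Task task, long address);
}

// Only reacts when the Task that requested the breakpoint hits it.
final class OwnTaskOnlyObserver implements CodeObserver {
    private final Task requester;
    final List<Long> hits = new ArrayList<>();

    OwnTaskOnlyObserver(Task requester) { this.requester = requester; }

    @Override
    public Action updateHit(Task task, long address) {
        if (task != requester) {
            // Another Task of the same Proc ran into the breakpoint: ignore it.
            return Action.CONTINUE;
        }
        hits.add(address);
        return Action.BLOCK;
    }
}
```

Every observer needs this same check, which is why handling it once in
the core would be cleaner.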

> support for .debug_frame in libunwind
> for instance, lesson the need for asynchronous-unwind-tables in
> .eh_frame by using .debug_frame when available

The breakpoints took some more time than I had anticipated, so I am
still working on this bit. Since there are a lot of layers to unwinding
and the documentation is scattered all over the place, here is a summary
of unwinding as I found it, both to document my own research and to
structure the work a bit. If any of the below is wrong, please let me
know.

Unwinding the call stack used to be something only a debugger would do.
It relied on the executable keeping a frame pointer in a dedicated
register, pointing to the bottom of the stack frame of the current
function, which also contains the return address [1]. Having a frame
pointer allows you to quickly walk the call stack and collect all the
return addresses. If you can map those to the names of the functions
they are in, you have a nice backtrace for the user.
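
As a small illustration of such a frame pointer walk, here is a sketch
over simulated memory. It assumes an x86_64-like layout where the word
at the frame pointer holds the caller's frame pointer and the next word
holds the return address. The class and method names are made up for
this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class FramePointerWalk {
    static final int WORD = 8; // assumed word size in bytes

    // Collects the current pc followed by the return address of each frame.
    // memory maps an address to the word stored there (simulated target memory).
    static List<Long> backtrace(Map<Long, Long> memory, long pc, long fp) {
        List<Long> pcs = new ArrayList<>();
        pcs.add(pc);
        while (fp != 0) {
            Long savedFp = memory.get(fp);               // caller's frame pointer
            Long returnAddress = memory.get(fp + WORD);  // where the caller resumes
            if (savedFp == null || returnAddress == null) {
                break; // unreadable memory, stop the walk
            }
            pcs.add(returnAddress);
            if (savedFp != 0 && savedFp <= fp) {
                break; // stack grows down, so an outer frame must sit higher
            }
            fp = savedFp;
        }
        return pcs;
    }
}
```

The walk ends at a zero frame pointer, which is assumed here to mark the
outermost frame.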

If you want to get more of the state then you could rely on each
function having a prologue and epilogue that saved and restored the
registers [2] of the caller. Given a calling convention for a particular
architecture you could use these to reliably find the original registers
on the stack, which in turn with some debug info would give you the
values of variables and arguments of the functions on the call stack.

Unfortunately compilers got smart: optimized code might not keep a
frame pointer (which frees up one more register) and might schedule the
function prologue and epilogue instructions between the other
instructions in the function. All of this makes it pretty hard for an
unwinder to reconstruct the previous call frames on the stack. In particular
x86_64 does away with a standard frame pointer. You can still get some
information back by conservatively approximating the instructions in the
function and guessing at the actual way the various registers are stored
[3] but this becomes pretty messy pretty quickly.

To help debuggers still get all the information needed to unwind a stack
and restore all needed registers, the debugging information (DWARF)
generated by compilers was extended to include Call Frame Information
(CFI) [4], which allows a debugger to reconstruct the calling pc and
registers of a function. This information is stored in the .debug_frame
section of an ELF file. It uses a simplified version of the DWARF
instructions (not all operands are relevant for reconstructing the
registers). This section is not guaranteed to be available, it is not
necessarily loaded into memory, and it can even be split off into its
own debug info file in some distributions.
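
As a rough illustration of what CFI gives a debugger: for a given pc it
yields a rule for the Canonical Frame Address (CFA) as register plus
offset, and rules saying where the caller's registers were saved
relative to the CFA. The sketch below applies one such row to simulated
registers and memory. Row, the register names and unwindOneFrame are
made up for this example. Real CFI is a byte-coded program that has to
be executed to get the row, and copying registers that have no rule is
a simplification.

```java
import java.util.HashMap;
import java.util.Map;

final class CfiRowDemo {
    // One row of the unwind table, valid for some pc range (hypothetical model).
    static final class Row {
        final String cfaRegister; // register the CFA is computed from
        final long cfaOffset;     // CFA = value of cfaRegister + cfaOffset
        // register name -> offset from the CFA where the caller's value was saved
        final Map<String, Long> savedAt = new HashMap<>();

        Row(String cfaRegister, long cfaOffset) {
            this.cfaRegister = cfaRegister;
            this.cfaOffset = cfaOffset;
        }
    }

    // Returns the caller's registers; "ra" holds the return address (caller's pc).
    // Assumes every saved slot is present in the simulated memory.
    static Map<String, Long> unwindOneFrame(Row row, Map<String, Long> regs,
                                            Map<Long, Long> memory) {
        long cfa = regs.get(row.cfaRegister) + row.cfaOffset;
        // Simplification: registers without a rule keep their current value.
        Map<String, Long> caller = new HashMap<>(regs);
        for (Map.Entry<String, Long> rule : row.savedAt.entrySet()) {
            caller.put(rule.getKey(), memory.get(cfa + rule.getValue()));
        }
        // The CFA is typically the stack pointer value at the call site.
        caller.put("sp", cfa);
        return caller;
    }
}
```

For an x86_64 function that has already pushed and set up its frame
pointer, the row would typically be CFA = fp + 16 with the caller's fp
saved at CFA - 16 and the return address at CFA - 8.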

At the same time different languages got constructs (exceptions,
continuations, global gotos, asynchronous garbage collectors, etc) which
required some sort of reliable unwinding (and in some cases rewinding)
of the call stack. Since some optimizations and some newer architectures
also did away with a standard frame pointer another way to reliably
unwind the stack was needed. This became the exception handler framework
(eh_frame) which is based on the DWARF CFI work but which is slightly
different. Unfortunately nobody seems to have documented the precise
differences between the formats. So you will have to carefully read both
the DWARF standard and the LSB core specification Exception Frames [5]
side-by-side.
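
Two differences that do fall out of such a side-by-side reading are in
the entry headers. In .debug_frame a CIE is marked by an id of
0xffffffff and an FDE's CIE pointer is an offset from the start of the
section. In .eh_frame a CIE has id 0 and the FDE's CIE pointer is
subtracted from the offset of the pointer field itself. The sketch
below only handles the 32-bit length form and assumes the caller has
set the buffer's byte order to match the target. The class and method
names are made up for this example.

```java
import java.nio.ByteBuffer;

final class FrameEntryKind {
    enum Format { EH_FRAME, DEBUG_FRAME }

    // entryOffset points at the 4-byte length field that starts a CIE or FDE.
    // The 4 bytes after the length hold the CIE id (CIE) or CIE pointer (FDE).
    static boolean isCie(ByteBuffer section, int entryOffset, Format format) {
        long id = section.getInt(entryOffset + 4) & 0xffffffffL;
        return format == Format.EH_FRAME ? id == 0L : id == 0xffffffffL;
    }

    // Returns the section offset of the CIE that the FDE at fdeOffset refers to.
    static int cieOffsetOfFde(ByteBuffer section, int fdeOffset, Format format) {
        int fieldOffset = fdeOffset + 4;
        int pointer = section.getInt(fieldOffset);
        return format == Format.EH_FRAME
                ? fieldOffset - pointer // relative to the pointer field itself
                : pointer;              // absolute offset into .debug_frame
    }
}
```

There are further differences (augmentation data and pointer encodings
among them) that this sketch does not try to cover.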

Note that a debugger that wants to walk a stack and recover all
registers might need more information than some of these language
constructs which might only need unwind information for specific call
sites. Depending on optimizations, architecture and language compiled
(and sometimes specific distribution default choices) no, full or
partial exception handler unwind information and/or frame pointers are
generated (see the GCC options -funwind-tables and
-fasynchronous-unwind-tables [6]).

Both the DWARF and the exception handler specs are architecture neutral.
But since you do need a mapping between the actual registers and the
specs, you also need to consult the relevant architecture ABI, which
defines the actual mapping. Sometimes these architecture ABI specs also
define some DWARF/EH extensions. See for example the x86_64 ABI spec
[7].

Note that in practise what gcc generates overrides any of the above
specs, and if a discrepancy is found the spec usually gets updated [8].
Also, one should be careful about bugs in the old DWARF2 spec [9] and
about the extensions to DWARF specified by the LSB [10] (which mostly
augment DWARF2 to be like DWARF3, at least for the exception handler
sections).

If an .eh_frame section is available in an ELF file it is guaranteed to
be loaded in memory. But depending on the architecture and the language
being compiled, it might not be available at all (and neither might the
frame pointer or the .debug_frame section).

So with that background, the work to be done consists of the following:

- libunwind officially only supports the .eh_frame format so it will
have to be extended to also support the .debug_frame format. Luckily the
differences, although very poorly documented, don't seem to be that
large.

- libunwind has its own CFI EH/DWARF parser but doesn't come with an
interface to feed/read the CFI information directly. The
Gget_unwind_table.c (unw_get_unwind_table) support that Nurdin added
provides this, but it isn't upstream yet. Pushing this upstream would be
very beneficial since then we could use pure upstream in frysk, but see
the next point: maybe the proposed interface should be changed a little.

- Currently we hook into this new unw_get_unwind_table through
UnwindH.hxx (createProcInfoFromElfImage). This is called indirectly
through the libunwind find_proc_info callback, which wants to see the
unwind_info filled in. The ElfImage used is created in
UnwindAddressSpace.findProcInfo() through the private method
getElfImage(long address), which gets the MemoryMap of the address from
the Task and then either maps the corresponding part of the ELF file
into memory or, if the section is the VDSO, creates an anonymous mmap
and fills it by reading the address map. The result is passed to the
libunwind dwarf reader. Directly mmapping these sections seems wrong
here, since the sections should already be available through the memory
buffers of the proc we are inspecting (which might already have mapped
in those sections). So it would be better to also use the libunwind
addressspace accessors that go through the ByteBuffers for this. This
might mean another change in the libunwind interface so that all remote
memory accesses go through the same hooks (although
unw_get_unwind_table already provides an unw_address_space as argument,
so I might be missing something).

- For the .debug_frame section we cannot rely on it being available in
the target address space (unlike .eh_frame, which always gets loaded),
and it might not be in the ELF file directly but in a separate debuginfo
file. So there we need to locate the section first through libdwfl, load
it (also through libdwfl?) and feed that to libunwind.

Cheers,

Mark

[1] http://en.wikipedia.org/wiki/Frame_pointer
[2] http://en.wikipedia.org/wiki/Function_prologue
[3] http://sourceware.org/gdb/current/onlinedocs/gdbint_3.html#SEC9
[4] http://wiki.dwarfstd.org/ (Dwarf 3 - section 6.4)
[5] http://refspecs.freestandards.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html
[6] http://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html
[7] http://www.x86-64.org/documentation/abi.pdf (Section 3.6 and 3.7)
[8] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32982
[9] http://wiki.dwarfstd.org/index.php?title=DWARF_FAQ#How_big_is_a_DW_FORM_ref_addr.3F
[10] http://refspecs.freestandards.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/dwarfext.html

Follow-Ups:
- Re: Roundtable, breakpoints and lots of unwinding (Was: meeting 2007-08-15 9:30 us east coast time)
  - From: Phil Muldoon

References:
- meeting 2007-08-15 9:30 us east coast time
  - From: Andrew Cagney
- Re: meeting 2007-08-15 9:30 us east coast time
  - From: Andrew Cagney