This is the mail archive of the frysk@sources.redhat.com mailing list for the frysk project.



Re: From breakpoint addresses to source line stepping


Hi,

Here are some comments made by various people on irc and during a
discussion on the phone.

On Thu, 2006-09-28 at 16:26 +0200, Mark Wielaard wrote:
> = Bugs and Extensions (low level work to do)
> 
>   - exec call should clear all breakpoints
>     We forget to clear and delete the Code observer in this case.
>     http://sourceware.org/bugzilla/show_bug.cgi?id=3255

This bug has been solved now.

>   - Multiple tasks
>     The current setup is not multi-task safe. When a breakpoint
>     address is hit and we want to continue or step over it, the
>     original instruction stream is put back, a step is taken in the
>     Task and the breakpoint instructions are put back. When other
>     Tasks are running this means those Tasks might miss the
>     breakpoint since they are seeing the original instruction
>     stream. Or worse, they might see an invalid instruction stream
>     with partial breakpoint and partial original instructions in
>     place. This is a problem on architectures that have multibyte
>     breakpoint instruction sequences, like on ppc.
>     There are basically 2 ways to solve this issue:
>     - Stop the world, step, resume world.
>       Whenever a breakpoint address is updated all Tasks of the Proc
>       are suspended first. The original instruction stream is
>       restored. The Task that hit the breakpoint is stepped. The
>       breakpoint instruction is put back. And all Tasks are restarted.
>       This is mostly architecture independent.

A simple first implementation could just be based on the BlockObserver
mentioned below. Then we let the user just iterate over all the Tasks of
a Proc till they are all suspended.
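
To make the stop-the-world sequence quoted above concrete, here is a
minimal sketch of the five steps as a toy model. It is not frysk code:
the Task and Proc classes, the `blocked` flag and the one-byte
instruction size are all assumptions made for illustration.

```python
# Toy model only; every name here is hypothetical, not frysk's API.
BREAKPOINT = 0xCC  # x86 int3, chosen purely as an example value


class Task:
    def __init__(self, tid, pc):
        self.tid = tid
        self.pc = pc
        self.blocked = False


class Proc:
    def __init__(self, code, tasks):
        self.memory = bytearray(code)  # instruction stream shared by all Tasks
        self.tasks = tasks
        self.saved = {}                # breakpoint address -> original byte

    def insert_breakpoint(self, addr):
        self.saved[addr] = self.memory[addr]
        self.memory[addr] = BREAKPOINT

    def step_over_breakpoint(self, task, addr):
        # 1. Stop the world: block every Task of the Proc.
        for t in self.tasks:
            t.blocked = True
        # 2. Restore the original instruction.
        self.memory[addr] = self.saved[addr]
        # 3. Step only the Task that hit the breakpoint
        #    (assumption: every instruction is one byte long).
        task.pc = addr + 1
        # 4. Put the breakpoint instruction back.
        self.memory[addr] = BREAKPOINT
        # 5. Resume the world.
        for t in self.tasks:
            t.blocked = False
```

Because no other Task runs between steps 2 and 4, none of them can see
the original instruction stream at the breakpoint address.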

>     - Out of instruction stream stepping.
>       To keep the other Tasks running (suspending/resuming has a lot
>       of overhead) we can try to use 'out of instruction stream'
>       stepping. A per Task local memory location is found to put the
>       original instruction(s) on. We set the PC to this location, a
>       step is performed and the PC is set back. On architectures that
>       support different length instructions we need to parse the
>       original instruction stream. And for (jump, load or branch)
>       instructions that are relative to the PC after the step we need
>       to 'fixup' some of the registers before or after the step. The
>       kprobe code in the linux kernel is an example of this approach.
>       This is highly architecture dependent.

djprobes and friends were also mentioned.
http://lwn.net/Articles/157751/
There was recently some discussion of this on the lkml and systemtap
lists. http://sourceware.org/ml/systemtap/2006-q3/msg00511.html
But it does not yet seem ready for an smp environment.
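
The 'fixup' needed for out of instruction stream stepping, as quoted
above, can be sketched with a toy instruction set. This is only an
illustration under assumptions: the three opcodes, the one-unit
instruction length and the function names are invented here and do not
correspond to any real architecture or to the kprobe code.

```python
# Toy instruction set, hypothetical: ("add", n), ("jmp_rel", off), ("jmp_abs", addr)

def execute(instr, pc, regs):
    """Execute one toy instruction located at pc and return the next pc."""
    op, arg = instr
    if op == "add":
        regs["acc"] += arg
        return pc + 1
    if op == "jmp_rel":   # target is relative to the instruction's own address
        return pc + arg
    if op == "jmp_abs":   # target does not depend on where the instruction sits
        return arg
    raise ValueError("unknown opcode: %r" % (op,))


def step_out_of_line(orig_instr, orig_addr, scratch_addr, regs):
    """Step a copy of orig_instr at scratch_addr, then fix up the PC."""
    # The breakpoint stays in place at orig_addr; only the copy is executed.
    next_pc = execute(orig_instr, scratch_addr, regs)
    op, _ = orig_instr
    if op != "jmp_abs":
        # Fall-through and PC-relative results were computed relative to the
        # scratch location, so translate them back to the original stream.
        next_pc = next_pc - scratch_addr + orig_addr
    return next_pc
```

An absolute jump needs no translation, which is why the original
instruction has to be parsed before deciding on the fixup.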

>   - Hardware breakpoints
> 
>     When available (often there are not many hardware breakpoint
>     registers) we should use a hardware breakpoint to speed things up
>     and simplify things (no code patching needed!). As an extension an
>     analysis of which breakpoints are hit the most can be done so we
>     use hardware breakpoints for those and software breakpoints for
>     the rest.

Marking pages as non-executable to force traps and stop threads was also
suggested. This would also be useful for watchpoints. But at the moment
there is no way to manipulate another process's page tables (except
by code injection...). We would need support from something like utrace
before we consider this.
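
The hit-count analysis suggested in the quoted text could look roughly
like the sketch below. The function name, the dictionary of hit counts
and the register count are assumptions for illustration; nothing like
this exists in frysk yet.

```python
def assign_breakpoints(hit_counts, num_hw_registers):
    """Split breakpoint addresses into hardware and software sets.

    hit_counts maps a breakpoint address to the number of times it was
    hit (hypothetical bookkeeping). The most frequently hit addresses
    get the scarce hardware registers, the rest stay software
    breakpoints.
    """
    # Sort by descending hit count, breaking ties by address so the
    # result is deterministic.
    ranked = sorted(hit_counts, key=lambda addr: (-hit_counts[addr], addr))
    hardware = set(ranked[:num_hw_registers])
    software = set(ranked[num_hw_registers:])
    return hardware, software
```

Re-running this periodically would move a breakpoint between the two
sets as its hit count changes.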

>   - Mapping addresses to source lines
>     Done through lib.dw (Dwfl, DwflLine). This uses dwarf information,
>     so can only be done when debug info is available. In theory an
>     address could belong to different source lines (when different
>     contexts are optimized into common code). But in practice this
>     seems to be ignored (unavailable?).

All info should really be available (and generated by gcc).
Documentation is a bit spartan. But there should be enough info in the
libdw header files and the (gigantic) Dwarf spec:
http://dwarf.freestandards.org/Home.php

For now we will assume the availability of .debug dwarf info. If that
isn't the case we can extract some info like function addresses from the
elf sections.
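
As an illustration of an address belonging to more than one source
line, here is a small lookup over a sorted line table. The table rows,
the end address and the function name are made up for this sketch; it
does not use libdw and only mimics the shape of the information a
dwarf line table provides.

```python
import bisect

# Hypothetical line table rows: (start address, file, line), sorted by address.
LINE_TABLE = [
    (0x400500, "main.c", 10),
    (0x400508, "main.c", 11),
    (0x400508, "util.h", 42),  # same address, second source line
    (0x400520, "main.c", 13),
]
END_ADDRESS = 0x400530  # first address past the last row (assumed)


def lines_for_address(table, end, addr):
    """Return every (file, line) whose row covers addr, or [] if none."""
    if not table or addr < table[0][0] or addr >= end:
        return []
    addrs = [row[0] for row in table]
    # Find the last row that starts at or before addr.
    i = bisect.bisect_right(addrs, addr) - 1
    start = addrs[i]
    # Several rows may share that start address; report all of them.
    return [(f, line) for a, f, line in table if a == start]
```

A breakpoint model built on this would have to accept a list of source
lines per address rather than a single one.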

>   - TagSets
>     Maintained in frysk.gui.srcwin.tags are the set of source line
>     tags that the gui is interested in. Currently there isn't a way to
>     define them (except loading them from the preferences). The
>     concept seems useful outside the gui/srcwin package.

This is really throw away code and should most likely just be rewritten.

>   - Mapping TagSets to Task addresses
>     Given a TagSet we need a mechanism for mapping them to breakpoint
>     addresses for each Proc we are interested in. Given the whole
>     system approach that frysk takes, we need a way to map these
>     whenever a new Proc is being observed. Map any core code mapped
>     in to sources which can be mapped against the TagSets. We also
>     need a way to monitor the loading (and unloading) of dynamic
>     libraries.

The HPD spec has a good description of "high-level breakpoints" which
are not tied to a process:
http://sourceware.org/frysk/documentation/hpd.html

Monitoring shared libraries is currently not possible through kernel
events. gdb inserts a low level breakpoint in the shared library loader
to catch library loading. Another approach would be to have a syscall
observer for mmap and munmap (which might become practical when through
utrace we can select interest in individual system calls). We can also
do it "lazily" and just reexamine the process maps whenever we get a
chance to see if any new executable maps have been added. This is of
course not exact and might mean early breakpoints in a new library might
be missed.
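
The "lazy" approach can be sketched as a diff over the text of
/proc/<pid>/maps. The function names are invented for this example and
the code only looks at file-backed executable mappings; how and when
the maps text is read is left open.

```python
def executable_maps(maps_text):
    """Return the set of (start, end, path) for file-backed executable maps."""
    found = set()
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a pathname
        addr_range, perms, _offset, _dev, _inode, path = fields
        path = path.strip()
        if "x" not in perms or not path.startswith("/"):
            continue  # not executable, or a pseudo entry such as [stack]
        start, end = (int(part, 16) for part in addr_range.split("-"))
        found.add((start, end, path))
    return found


def new_executable_maps(previous, maps_text):
    """Return (newly added maps, current maps) compared to a previous scan."""
    current = executable_maps(maps_text)
    return current - previous, current
```

Each call only reports what changed since the previous scan, so any
code that ran in a new library between two scans is not covered, which
is the inexactness mentioned above.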

> = source line stepping, step into, step out of...
> 
>   Given all of the above we can finally implement the functions a user
>   would be interested in given a language model view of the sources.

Part of the high level runtime defs should be merged with the work from
Stan already in the hpd package. This code also gives some examples of
how to handle dwarf info. Any model of high-level breakpoints should
integrate easily with both the gui and the hpd.

Random note: we might want a 'thread object' in the runtime (rt) space,
for example to define thread-local variables (errno).

Cheers,

Mark

