This is the mail archive of the archer@sourceware.org mailing list for the Archer project.

Re: Syscall tracing

From: Roland McGrath <roland at redhat dot com>
To: Thiago Jung Bauermann <bauerman at br dot ibm dot com>
Cc: Project Archer <archer at sourceware dot org>
Date: Wed, 6 Aug 2008 20:24:51 -0700 (PDT)
Subject: Re: Syscall tracing
References: <m363qtg0ey.fsf@fleche.redhat.com><1217241929.20562.12.camel@dijkstra.wildebeest.org><m3myk079tt.fsf@fleche.redhat.com><1217534254.5915.2.camel@localhost.localdomain><e394668d0807311728r270754b9w2d640053bc73b12c@mail.gmail.com><20080801013834.GA7180@caradoc.them.org><20080801022821.14F1D15427E@magilla.localdomain><1217961799.4534.8.camel@localhost.localdomain>

> Ok, let me see if I understood this correctly...

Sorry, I didn't explain it very fully.  I'll try again below.

> So with your approach there's no need to augment DWARF with special
> attributes like Daniel suggested, right? The semantic intelligence would
> be in the pretty-printer...

I don't think I said anything for or against DWARF extensions.  The only
things I recall Daniel suggesting had to do with interpreting values.  My
main point is that all of that is in the realm of what should be done in
just the same ways as how we interpret values from other (non-kernel) ABIs.

> This means that we would try to convince the kernel folks to provide
> Python pretty-printers to GDB, right?

That is a separate question.  It's not really a technical issue.
Certainly for the foreseeable future, we're talking about developing
some data and the tools that use it at the same time.  We will have to
be at the point of being quite sure how everything is done and having
worked with good data for a while before we get to the subject of
maintenance responsibility.  Let's not get into it now.  We know 100%
it's going to be us doing it until we have a toolset that works.


Getting back to the technical picture, here's how I look at the whole thing.

When we're inspecting a stopped thread, there are two kinds of state it
can be in.

The first kind is that it's somewhere in user code.  We call this its
"innermost frame".  The register state from ptrace tells us the PC.
We look that PC up to get these things (mostly from DWARF):
1. unwind info from that state back to its caller
2. name of the function
3. return type of the function
4. location of each parameter (and local)
5. name and type of each parameter (and local)

The second kind is that it's in a syscall stop.  Here the register
state tells us the system call number.  We need to get from there to
knowing those same five things:
1. unwind info back to its caller
   -> no-op: the registers from ptrace are the "caller's" registers
2. name of the system call
3. return type of the system call
4. location of each parameter (there are no locals)
   -> fixed ABI per machine, list of registers used for positional parameters
5. name and type of each parameter
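
As a rough illustration of what items 2-5 amount to for a syscall stop,
here is a minimal C sketch.  Everything in it is hypothetical: the
struct names, the field names, and the use of plain strings for types
are assumptions made for illustration, not anything GDB defines.  A
real implementation would refer to DWARF type DIEs instead of strings.

```c
#include <stddef.h>

/* Hypothetical record for one parameter: item 5 from the list above.  */
struct param_desc {
    const char *name;   /* parameter name */
    const char *type;   /* type as text; DWARF would use a type DIE */
};

/* Hypothetical record for one system call: items 2-5 from the list.
   Item 1 (unwinding) needs no data, because the registers from ptrace
   already are the "caller's" registers.  */
struct syscall_desc {
    long number;                  /* lookup key from the register state */
    const char *name;             /* item 2 */
    const char *return_type;      /* item 3 */
    size_t nparams;               /* item 4: position implies the ABI register */
    struct param_desc params[6];  /* item 5 */
};

/* Example entry, using 37 as the number for alarm.  */
static const struct syscall_desc alarm_desc = {
    37, "alarm", "unsigned int", 1, { { "seconds", "unsigned int" } }
};
```

The point of the sketch is only that nothing here is syscall-specific
apart from the lookup key: the rest is what any function's debug info
already provides.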

What I feel most strongly about is that the representation of "type"
as mentioned for #3 and #5, and everything having to do with that,
must be unified with what's used for programming language data types.
Supplying the information about the types used in the syscall ABI is
no different than doing it for some normal library's ABI.

What I find the natural way to look at a syscall stop is as a special
kind of "innermost frame".  Its "caller frame" is the normal innermost
frame derived from the registers.  At the before-syscall stop, it's as
if you're at the entry point of a one-instruction leaf function that
just returns.  You can see the name and all the arguments with names,
types, and locations.  You can change the argument values at those
locations just like the arguments of a real function frame.  The only
other thing you can do is "finish", which means the same as "step"
here.  

The after-syscall stop is like an implicit "finish".  You can see the
results in the return value register as the caller will see it.  You
are stopped right after the return but before the caller's next insn.

What I think wise is that we look at the problem as acquiring items
1-5 from a syscall "virtual innermost frame".  Everything else after
that should use mechanisms that are not specific to syscalls.

So, for a real frame, we look up 1-5 based on the PC.  When we have
type info, we've found a DWARF DIE (usually DW_TAG_subprogram) from
that PC that gives us name, type, and all parameter info (2-5).

For a syscall virtual frame, #1 is implicit.  DWARF readily encodes
all of #2-#5 as a DIE; we just need a way to find the DIE associated
with a syscall number rather than with a PC.

#4 is also implicit: the location of each positional parameter is a
fixed part of the ABI.  But it's also easy to have a DIE that
represents a syscall give an explicit location attribute for each
parameter like a normal function would.  This might be preferable for
some funny cases e.g. where two ABI syscall arguments (32-bit registers)
actually form one 64-bit parameter to a syscall.  A location expression
can describe that case already, without any syscall-related special case.
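
For the split 64-bit case, the value the debugger has to reconstruct
can be sketched in C as below.  This is only an illustration of the
arithmetic: the function name is hypothetical, and which register holds
the low half is ABI-specific (it is simply assumed here to be the first
argument).  In DWARF the same thing would be expressed as a composite
location built from pieces (DW_OP_piece), not as code.

```c
#include <stdint.h>

/* Join two 32-bit argument registers into the single 64-bit parameter
   they carry.  Assumption: `lo` is the register holding the low half.  */
static uint64_t join_halves(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}
```

For instance, join_halves(0x89abcdef, 0x01234567) gives
0x0123456789abcdef.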

For looking up a DIE by syscall number, one could add some DWARF
extension for this.  Or one can just abuse DWARF in some way with
nothing new in the format per se.  The latter is certainly the easiest
way to get started on a prototype.  The debugger is going to be loading
this file of DWARF specially anyway, and know that it's not associated
with any addresses, symbols, or sources in the actual program.  So it
can treat various sorts of numbers and names in this DWARF file as a
special kludge instead of their normal meaning.

For example, emit a normal DW_TAG_subprogram DIE for a syscall,
but give it a DW_AT_entry_pc containing its syscall number.
(It's as if this DWARF object describes a separate address space of
syscall numbers rather than memory addresses.)  

Something like that seems most natural in the long run, when we are
using tailored tools to generate the DWARF.  For immediate prototyping
the easy thing is to use plain old gcc to generate DWARF that has most
of the right info, and just kludge it however is easy.  For example:

	/* Bodies are dummies; only the DWARF for each signature matters.  */
	#line 37 "syscall"
	unsigned int alarm (unsigned int seconds) { return 0; }
	#line 39 "syscall"
	int getpid (void) { return 0; }

Compile that with gcc -g -c, and then look at that DWARF object.
To consider the register state at the before-syscall stop, take
the syscall number--say it's 37.  You can look up the function at 
source file "syscall", line number 37 in the special DWARF symfile.
The call name, return type, and argument names are right there as
if it were a real function.  

You could of course also just use a table of numbers to names
and look up the DIEs by name in the DWARF, ignoring the bogus
PC and source line info entirely.
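
A number-to-name table of that sort could be as simple as the following
C sketch.  The struct and function names are hypothetical, and the
table holds only the two calls from the example, with 37 and 39 as
their numbers; a real table would be per-architecture and complete.
The returned name would then be the key for finding the
DW_TAG_subprogram DIE in the special DWARF file.

```c
#include <stddef.h>

struct syscall_name {
    long number;
    const char *name;
};

/* Only the two example calls; a real table is per machine.  */
static const struct syscall_name syscall_names[] = {
    { 37, "alarm" },
    { 39, "getpid" },
};

/* Return the name for a syscall number, or NULL if it is not listed.  */
static const char *syscall_name_for(long number)
{
    size_t n = sizeof syscall_names / sizeof syscall_names[0];
    for (size_t i = 0; i < n; i++)
        if (syscall_names[i].number == number)
            return syscall_names[i].name;
    return NULL;
}
```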

Using this gcc hack to generate the DWARF, you don't get location
expressions you can use for the parameters.  So you'd have to just do
the fixed machine ABI mapping for each positional parameter to its
location in a register.  But you could fake that in at a fairly low
level and then replace it later with using the DWARF expressions
when we have a proper clean way to generate the DWARF.
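
The fixed mapping is small enough to fake in a table.  Here is a
minimal C sketch for one machine, Linux on x86-64, where the syscall
ABI passes up to six arguments in rdi, rsi, rdx, r10, r8, and r9.  The
array and function names are hypothetical, and registers are given as
names only; a debugger would use its own register numbering.

```c
#include <stddef.h>

/* Linux x86-64 syscall ABI: positional arguments 0..5.  */
static const char *const x86_64_syscall_arg_regs[] = {
    "rdi", "rsi", "rdx", "r10", "r8", "r9"
};

/* Return the register name holding argument `index`, or NULL if the
   ABI has no such positional argument.  */
static const char *syscall_arg_register(unsigned int index)
{
    size_t n = sizeof x86_64_syscall_arg_regs
               / sizeof x86_64_syscall_arg_regs[0];
    return index < n ? x86_64_syscall_arg_regs[index] : NULL;
}
```

Once real location expressions exist in the DWARF, a lookup like this
is what they would replace.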


Everything else is just like handling a normal function entry
breakpoint and its arguments' data types.  I do have a variety of
thoughts about that.  DWARF extensions having to do with clever
display are certainly not out of the question in my mind.  But
whatever solutions are done in that area I consider to be part of the
whole "pretty printing" (et al) issue and not particular to syscalls.

For the consideration of syscalls, I'd like to take it just up to the
point of establishing the #1-#5 info items in DWARF or DWARF-esque form,
and then draw a line under it.  Beyond that, there is just the general
discussion on sexy information display/inspection.  When everything new
and pretty exists, syscall handling has just some simple choices on UI
tie-in left.


Thanks,
Roland
