This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Per-process tracing user-space probes approach
- From: "S. P. Prasanna" <prasanna at in dot ibm dot com>
- To: systemtap at sources dot redhat dot com
- Date: Fri, 9 Jun 2006 11:23:26 +0530
- Subject: Per-process tracing user-space probes approach
- Reply-to: prasanna at in dot ibm dot com
Hi,
Below is a brief description of the user-space probes approach that
I am planning to implement.
Please review it and send me your comments.
Thanks
Prasanna
Requirements:
- per-process tracing using copy-on-write ("COW")
- ability to trace applications that have not yet been started
- smallest possible kernel patch (no aggregate support in this release)
- correlation of kernel and user probe output
- lower performance overhead than ptrace
- a clean, syscall-like user interface with a pre-defined set of
handlers that can log data, registers, stack traces, etc., plus
support for adding new handlers at runtime
- no hooks into readpage(s)
- the handler runs in the kernel and can sleep in order to collect
data from non-memory-resident pages
- single-stepping out of line
Usage:
1. Specifying a new kernel handler for a probe point.
- usage example.
1. Trace the library routine malloc() in an already running
application whose pid is 123. A kernel module containing
uprobe_khandler() is first inserted into the kernel.
To get the address of malloc() use
#objdump -D appln |grep malloc
0x08048320
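#include <stdio.h>
#include <sys/types.h>
int main(void) {
    pid_t child = 123;
    printf("insert probes on child pid %ld\n", (long)child);
    /* proposed call; the handler is passed by name (see Interfaces) */
    utrace(child, 0x08048320, UTRACE_KHANDLER, "uprobe_khandler");
    return 0;
}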
2. Trace the routine foo() in an application that has not yet been
started, specifying the kernel handler utrace_foo_khandler().
First, a kernel module containing utrace_foo_khandler() is
inserted into the kernel.
To get the address of foo() use
#objdump -D appln |grep foo
0x08048fa0
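#include <sys/types.h>
#include <unistd.h>
int main(void) {
    char *argv[] = { "appln", NULL };
    char *envp[] = { NULL };
    pid_t child;
    if ((child = fork()) == 0) {
        /* child: request the probe (pid 0), then exec the application */
        utrace(0, 0x08048fa0, UTRACE_KHANDLER, "utrace_foo_khandler");
        execve("/home/prasanna/appln", argv, envp);
    }
    return 0;
}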
2. Specifying an already existing kernel handler for a probe point.
- usage example.
1. Trace the library routine malloc() in an already running
application whose pid is 123, and request a register dump.
To get the address of malloc() use
#objdump -D appln |grep malloc
0x08048320
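#include <stdio.h>
#include <sys/types.h>
int main(void) {
    pid_t child = 123;
    printf("insert probes on child pid %ld\n", (long)child);
    /* pre-defined handler, so no handler name is needed */
    utrace(child, 0x08048320, UTRACE_GETREGS, NULL);
    return 0;
}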
2. Trace the routine foo() in an application that has not yet been
started, and request a register dump.
To get the address of foo() use
#objdump -D appln |grep foo
0x08048fa0
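#include <sys/types.h>
#include <unistd.h>
int main(void) {
    char *argv[] = { "appln", NULL };
    char *envp[] = { NULL };
    pid_t child;
    if ((child = fork()) == 0) {
        /* child: request the probe (pid 0), then exec the application */
        utrace(0, 0x08048fa0, UTRACE_GETREGS, NULL);
        execve("/home/prasanna/appln", argv, envp);
    }
    return 0;
}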
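The usage examples hard-code the address that objdump prints. As a rough sketch only, a small user-space helper could convert that text into the unsigned long vaddr argument. The name parse_vaddr() is hypothetical and is not part of the proposed interface.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the proposal): parse a hexadecimal
 * address such as "0x08048320" or "08048320".
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
int parse_vaddr(const char *text, unsigned long *vaddr)
{
    char *end;
    unsigned long v;

    assert(text != NULL && vaddr != NULL);

    /* Reject empty strings, leading whitespace and signs up front;
     * strtoul() alone would silently accept "-1". */
    if (!isxdigit((unsigned char)text[0]))
        return -1;

    errno = 0;
    v = strtoul(text, &end, 16);   /* base 16 accepts an optional "0x" */
    if (errno != 0 || end == text || *end != '\0')
        return -1;

    *vaddr = v;
    return 0;
}
```

A caller would pass the parsed value straight through as the second argument of the proposed utrace() call instead of a literal such as 0x08048320.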
Issues:
1. Is it acceptable to let the user specify, through a syscall, a
kernel routine that will be executed when the probe point is hit?
2. Is it acceptable to run the instrumentation code in the kernel
address space?
3. Are there any security concerns?
Interfaces:
int sys_utrace(pid_t pid, unsigned long vaddr,
unsigned long request, char *name);
pid - id of the process to be probed.
vaddr - virtual address at which the probe is to be inserted.
request - UTRACE_GETDATA, UTRACE_GETREGS, UTRACE_STACKTRACE
(the usage examples also pass UTRACE_KHANDLER).
name - name of the kernel handler.
Maybe add a length field as well, so that the user can specify the
length of the data to be logged.
void sys_utrace_rm(pid_t pid, unsigned long vaddr);
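Since struct uprobe describes request as a bitmap, the request values could be encoded as individual bits. The following is only a sketch under that assumption: the numeric values, the UTRACE_ALL mask and the two helper functions are hypothetical, because the proposal names the requests but assigns no numbers to them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed bit values; the proposal does not define them. */
#define UTRACE_GETDATA    (1UL << 0)
#define UTRACE_GETREGS    (1UL << 1)
#define UTRACE_STACKTRACE (1UL << 2)
#define UTRACE_KHANDLER   (1UL << 3)
#define UTRACE_ALL (UTRACE_GETDATA | UTRACE_GETREGS | \
                    UTRACE_STACKTRACE | UTRACE_KHANDLER)

/* Returns 1 if request is non-empty and contains only known bits. */
int utrace_request_valid(unsigned long request)
{
    return request != 0 && (request & ~UTRACE_ALL) == 0;
}

/* Writes the names of the set bits into buf, separated by '|'.
 * Output is truncated if buf is too small. Returns buf. */
char *utrace_request_describe(unsigned long request, char *buf, size_t len)
{
    static const struct { unsigned long bit; const char *name; } names[] = {
        { UTRACE_GETDATA,    "GETDATA" },
        { UTRACE_GETREGS,    "GETREGS" },
        { UTRACE_STACKTRACE, "STACKTRACE" },
        { UTRACE_KHANDLER,   "KHANDLER" },
    };
    size_t used = 0;
    size_t i;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        int n;

        if (!(request & names[i].bit))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "|" : "", names[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;          /* out of space: stop, buf stays terminated */
        used += (size_t)n;
    }
    return buf;
}
```

With such an encoding, one call could ask for several pre-defined handlers at once, for example UTRACE_GETREGS | UTRACE_STACKTRACE, and the kernel side could reject unknown bits before registering the probe.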
Data structures:
Allocated for each probe.
struct uprobe {
/*per process and per probe hlist_node */
struct hlist_node plist;
unsigned long request; /* bitmap of the request */
unsigned long status; /* status as active/inactive */
struct kprobe kp; /* kprobe structure */
};
Allocated for each process.
struct uprobe_module {
/* head of the list of all probes for this process */
struct hlist_head phead;
/* node in the list of all probed processes */
struct hlist_node mlist;
pid_t pid; /* pid of the probed process */
};
uprobe_table[]; /* individual probes hashed on vaddr * pid */
struct hlist_head uprobe_module_head[];
/* list of all uprobe_module entries, hashed on pid */
uprobe_mutex /* protects uprobe_table and uprobe_module_head */
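To make the lookup path concrete, here is a simplified user-space sketch of a probe table hashed on vaddr * pid. It is an illustration under several assumptions, not the planned kernel code: a plain next pointer stands in for hlist_node, the embedded struct kprobe is left out, the table size and hash constant are arbitrary, and the locking that uprobe_mutex would provide is omitted. The names uprobe_sketch, uprobe_hash(), uprobe_find(), uprobe_insert() and uprobe_remove() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

#define UPROBE_HASH_BITS  6                    /* assumed table size: 64 */
#define UPROBE_TABLE_SIZE (1U << UPROBE_HASH_BITS)

/* Simplified stand-in for struct uprobe. */
struct uprobe_sketch {
    struct uprobe_sketch *next;   /* replaces hlist_node */
    pid_t pid;
    unsigned long vaddr;
    unsigned long request;
};

static struct uprobe_sketch *uprobe_table[UPROBE_TABLE_SIZE];

/* Bucket index derived from vaddr * pid; the multiplier is an assumption. */
uint32_t uprobe_hash(pid_t pid, unsigned long vaddr)
{
    uint32_t key = (uint32_t)(vaddr * (unsigned long)pid);

    return (uint32_t)(key * UINT32_C(2654435761)) >> (32 - UPROBE_HASH_BITS);
}

/* Returns the probe registered for (pid, vaddr), or NULL. */
struct uprobe_sketch *uprobe_find(pid_t pid, unsigned long vaddr)
{
    struct uprobe_sketch *p = uprobe_table[uprobe_hash(pid, vaddr)];

    while (p && !(p->pid == pid && p->vaddr == vaddr))
        p = p->next;
    return p;
}

/* Returns 0 on success, -1 if a probe already exists at (pid, vaddr). */
int uprobe_insert(struct uprobe_sketch *u)
{
    uint32_t b;

    assert(u != NULL);
    if (uprobe_find(u->pid, u->vaddr))
        return -1;
    b = uprobe_hash(u->pid, u->vaddr);
    u->next = uprobe_table[b];
    uprobe_table[b] = u;
    return 0;
}

/* Unlinks the probe at (pid, vaddr). Returns 0 on success, -1 if absent. */
int uprobe_remove(pid_t pid, unsigned long vaddr)
{
    struct uprobe_sketch **pp = &uprobe_table[uprobe_hash(pid, vaddr)];

    for (; *pp; pp = &(*pp)->next) {
        if ((*pp)->pid == pid && (*pp)->vaddr == vaddr) {
            *pp = (*pp)->next;
            return 0;
        }
    }
    return -1;
}
```

Each chain compares both pid and vaddr, so two processes probing the same address, as in the malloc() examples, get separate entries even when they share a bucket.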
--
S.P. Prasanna
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-41776329