This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Per-process tracing user-space probes approach


S. P. Prasanna wrote:

Hi,

I have listed a brief description of the user-space probes approach,
which I am planning to implement.
Please review and provide your comments.

Thanks
Prasanna

Requirements:

- per process tracing using "COW"
- able to trace yet to be started applications


I would think this is lower in priority.

- smallest kernel patch (no aggregate support for this release)
- correlation of kernel and user probes output
- least performance overhead compared to ptrace
- provide clean user interface like syscall with pre-defined
 set of handlers that can log data, registers, stack trace etc and
 also support adding new handlers at runtime.
- no hooks to readpage(s)
- handler runs in kernel and handler can sleep to collect data from
 non-memory resident pages.
- single step out-of-line



If I summarize your proposal, there are going to be two steps: one to define the handler and a second to associate a handler with the break point. For user-space probes there are going to be two types of handlers: predefined handlers (similar to ptrace) that are pre-built into the kernel, and additional handlers that users can add to the kernel. New handlers are registered with the kernel much like kprobes handlers, but they are not associated with a break point at registration time. Association with the break point is done with a new system call.
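
To make the two steps concrete, here is a minimal user-space model in C. It is only a sketch of the flow as I understand it, not kernel code: every name in it (register_handler, add_breakpoint, hit_breakpoint, count_hits_handler, the fixed-size tables) is hypothetical and does not come from the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model of the two-step flow; not kernel code. */
#define MAX_HANDLERS 8
#define MAX_PROBES   8

typedef int (*uprobe_handler_t)(int pid, unsigned long vaddr);

struct handler_entry { const char *name; uprobe_handler_t fn; };
struct probe_entry   { int pid; unsigned long vaddr; uprobe_handler_t fn; };

static struct handler_entry handlers[MAX_HANDLERS];
static struct probe_entry probes[MAX_PROBES];
static size_t nr_handlers, nr_probes;
static int hit_count;

/* Example handler: just counts how often it ran. */
int count_hits_handler(int pid, unsigned long vaddr)
{
	(void)pid;
	(void)vaddr;
	hit_count++;
	return 0;
}

/* Step 1: register a handler by name; no break point is involved yet. */
int register_handler(const char *name, uprobe_handler_t fn)
{
	assert(name != NULL && fn != NULL);
	if (nr_handlers == MAX_HANDLERS)
		return -1;
	handlers[nr_handlers].name = name;
	handlers[nr_handlers].fn = fn;
	nr_handlers++;
	return 0;
}

/* Step 2: associate an already registered handler with (pid, vaddr). */
int add_breakpoint(int pid, unsigned long vaddr, const char *name)
{
	size_t i;

	assert(name != NULL);
	if (nr_probes == MAX_PROBES)
		return -1;
	for (i = 0; i < nr_handlers; i++) {
		if (strcmp(handlers[i].name, name) == 0) {
			probes[nr_probes].pid = pid;
			probes[nr_probes].vaddr = vaddr;
			probes[nr_probes].fn = handlers[i].fn;
			nr_probes++;
			return 0;
		}
	}
	return -1;	/* unknown name: only registered handlers are accepted */
}

/* Model of a break point hit: run the handler bound to (pid, vaddr). */
int hit_breakpoint(int pid, unsigned long vaddr)
{
	size_t i;

	for (i = 0; i < nr_probes; i++)
		if (probes[i].pid == pid && probes[i].vaddr == vaddr)
			return probes[i].fn(pid, vaddr);
	return -1;	/* no probe at this address for this process */
}
```

In this model, add_breakpoint() fails for a name that was never registered, which mirrors the point that only registered handlers, and not arbitrary kernel functions, can be attached.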

Usage:

1. Specifying a new kernel handler for a probe point.
	- usage example.
	   1. Trace a library routine malloc() in an application already
	      started pid is 123. A kernel module with uprobe_khandler() is
	      inserted into the kernel.

	    To get the address of malloc() use
	    #objdump -D appln |grep malloc
	    0x08048320
	    main(){
		pid_t child = 123;

		printf("insert probes on child pid %d\n", (int)child);
		/* handler is passed by name, matching the char *name argument */
	        utrace(child, 0x8048320, UTRACE_KHANDLER, "uprobe_khandler");
		}

	   2. Trace a routine foo() in an application yet to be started and
	      specify a kernel handler utrace_foo_khandler().
	      First of all a kernel module with utrace_foo_khandler() is
	      inserted into the kernel.


You mean utrace_foo_khandler() needs to be pre-installed through a module load before running the following program, right?

	       To get the address of foo() use
	       #objdump -D appln |grep foo
	       0x08048fa0

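	       main(){
			pid_t child;
			char *argv[] = { "appln", NULL };
			char *envp[] = { NULL };

			if ((child = fork()) == 0) {
				/* handler is passed by name, matching the char *name argument */
				utrace(0, 0x08048fa0, UTRACE_KHANDLER, "utrace_foo_khandler");
				execve("/home/prasanna/appln", argv, envp);
			}
		}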

2. Specifying an already existing kernel handler for a probe point.
	- usage example.
	   1. Trace a library routine malloc() in an application already
	      started pid is 123 and specify to dump registers.

	    To get the address of malloc() use
	    #objdump -D appln |grep malloc
	    0x08048320
	    main(){
		pid_t child = 123;

		printf("insert probes on child pid %d\n", (int)child);
	        utrace(child, 0x8048320, UTRACE_GETREGS, NULL);
		}

	    2. Trace a routine foo() in an application yet to be started and
	       specify to dump registers.

	       To get the address of foo() use
	       #objdump -D appln |grep foo
	       0x08048fa0

	       main() {
			pid_t child;
			char *argv[] = { "appln", NULL };
			char *envp[] = { NULL };

			if ((child = fork()) == 0) {
				utrace(0, 0x08048fa0, UTRACE_GETREGS, NULL);
				execve("/home/prasanna/appln", argv, envp);
			}
		}

Issues:

1. Is it acceptable to allow the user to specify a kernel
routine through a syscall that will be executed, when the probe
point gets hit ?


I don't think we are going to allow any arbitrary existing kernel function here, right? We are only going to allow associating a registered uprobes handler with the break point. Granted, a uprobe handler can be complex and call an existing kernel function.
The second point is that we already allow this for kernel probes anyway, so what are we allowing additionally by doing this in response to a break point in user space?


2. Is it acceptable to run the instrumentation code as part of kernel
address space ?


We are already running instrumentation code in the kernel for kprobes. We are exploring the possibility of running it in user land, but the performance penalty seems to be too high. Roland, who is familiar with this area, may have some additional comments.

3. Are there any security concerns ?


I am not a security expert, but Roland, who is on the cc, may have some insights.

Interfaces:

int sys_utrace(pid_t pid, unsigned long vaddr,
unsigned long request, char *name);


Maybe a different name, more like sys_utrace_addbp.

pid - process id that needs to be probed.
vaddr - virtual address where probe is to be inserted.
request - UTRACE_GETDATA, UTRACE_GETREGS, UTRACE_STACKTRACE.


What does GETDATA do?
Stack trace here gives the stack of the user process until it hit the break point, right?


name - name of the kernel handler.

maybe _add_ a length field as well, so that the user can specify the length
of data to be logged.


I am not sure I see the value of a length field when the user is not specifying the buffer where the data is being logged.

void sys_utrace_rm(pid_t pid, unsigned long vaddr);


Similarly, sys_utrace_rmbp.

Data structures:

Allocated for each probe.
struct uprobe {
	/*per process and per probe hlist_node */
	struct hlist_node plist;
	unsigned long request;		/* bitmap of the request */
	unsigned long status;		/* status as active/inactive */
	struct kprobe kp;		/* kprobe structure */
};
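
Since request is described as a bitmap, one way to read it is that each UTRACE_* request occupies one bit, so several actions can be combined on a single probe. The sketch below assumes exactly that. The bit values, UTRACE_ALL, and the two helper functions are hypothetical; the proposal names the requests but not their encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit assignments; the proposal does not define them. */
#define UTRACE_GETDATA    (1UL << 0)
#define UTRACE_GETREGS    (1UL << 1)
#define UTRACE_STACKTRACE (1UL << 2)
#define UTRACE_KHANDLER   (1UL << 3)
#define UTRACE_ALL (UTRACE_GETDATA | UTRACE_GETREGS | \
		    UTRACE_STACKTRACE | UTRACE_KHANDLER)

/* A request is valid if at least one bit is set and no unknown bit is set. */
int utrace_request_valid(unsigned long request)
{
	return request != 0 && (request & ~UTRACE_ALL) == 0;
}

/* Write the requested actions into buf as space-terminated words. */
int utrace_describe(unsigned long request, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	if (!utrace_request_valid(request))
		return -1;
	if (request & UTRACE_GETDATA)
		strncat(buf, "data ", len - strlen(buf) - 1);
	if (request & UTRACE_GETREGS)
		strncat(buf, "regs ", len - strlen(buf) - 1);
	if (request & UTRACE_STACKTRACE)
		strncat(buf, "stack ", len - strlen(buf) - 1);
	if (request & UTRACE_KHANDLER)
		strncat(buf, "khandler ", len - strlen(buf) - 1);
	return 0;
}
```

With this encoding a caller could pass UTRACE_GETREGS | UTRACE_STACKTRACE to ask for both in one probe, and the kernel side could reject unknown bits up front.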

Allocated for each process.
struct uprobe_module {
	/* list of all probes for individual process */
	struct hlist_head phead;
	/* node in the list of all probed processes */
	struct hlist_node mlist;
	pid_t pid;			/* pid of the probed process */
};

uprobe_table[]; /* individual probes hashed on vaddr * pid */
struct hlist_head uprobe_module_head[];
/* list of all uprobe_module hashed on pid*/
uprobe_mutex /* protects uprobe_table and uprobe_module_head */
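
The lookup by pid and vaddr can be modelled in user space roughly as follows. This is a sketch under assumptions, not the proposed kernel code: struct uprobe_model, the table size, the hash folding, and the function names are all hypothetical, a plain next pointer stands in for hlist_node, and locking (uprobe_mutex) is left out.

```c
#include <assert.h>
#include <stddef.h>

#define UPROBE_HASH_BITS  6
#define UPROBE_TABLE_SIZE (1u << UPROBE_HASH_BITS)

/* Hypothetical stand-in for struct uprobe. */
struct uprobe_model {
	struct uprobe_model *next;	/* bucket chain, in place of hlist_node */
	int pid;
	unsigned long vaddr;
	unsigned long request;
};

static struct uprobe_model *uprobe_table_model[UPROBE_TABLE_SIZE];

/* Hash on vaddr * pid as in the proposal, then fold into the table size. */
static unsigned int uprobe_hash(int pid, unsigned long vaddr)
{
	unsigned long key = vaddr * (unsigned long)pid;

	return (unsigned int)((key ^ (key >> UPROBE_HASH_BITS)) &
			      (UPROBE_TABLE_SIZE - 1));
}

/* Add a probe at the head of its bucket. */
void uprobe_insert(struct uprobe_model *p)
{
	unsigned int b;

	assert(p != NULL);
	b = uprobe_hash(p->pid, p->vaddr);
	p->next = uprobe_table_model[b];
	uprobe_table_model[b] = p;
}

/* Find the probe for (pid, vaddr), or NULL if there is none. */
struct uprobe_model *uprobe_lookup(int pid, unsigned long vaddr)
{
	struct uprobe_model *p = uprobe_table_model[uprobe_hash(pid, vaddr)];

	while (p != NULL && (p->pid != pid || p->vaddr != vaddr))
		p = p->next;
	return p;
}

/* Unlink and return the probe for (pid, vaddr), or NULL if not found. */
struct uprobe_model *uprobe_remove(int pid, unsigned long vaddr)
{
	struct uprobe_model **pp =
		&uprobe_table_model[uprobe_hash(pid, vaddr)];

	for (; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->pid == pid && (*pp)->vaddr == vaddr) {
			struct uprobe_model *found = *pp;

			*pp = found->next;
			found->next = NULL;
			return found;
		}
	}
	return NULL;
}
```

Note that the same vaddr in two different processes gives two distinct entries, which is what per-process probes need. A plain vaddr * pid key sends every probe registered with pid 0 to the same bucket, so a real implementation may want a different mix.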


I think you need to describe the interface through which one registers new handlers for user-space probes.


