This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [KPROBE][RFC] Tweak to the function return probe design


Hi Rusty,

This looks good. See my comments below

Rusty Lynch wrote:

From my experiences with adding return probes to x86_64 and ia64, and the
feedback on LKML to those patches, I think we can simplify the design
for return probes.

The following patch tweaks the original design such that:

* Instead of storing the stack address in the return probe instance, the
task pointer is stored. This gives us all we need in order to:
- find the correct return probe instance when we enter the trampoline
(even if we are recursing)


What about the case user probing different functions in the same task? As I and Jim discussed about this, it probably works since you put the RI the hlist and get it out in the same order. But in the case user probing several function-returns but the one in the middle never return, your code would fire the wrong handler.

I thought I would be safer for us to have the "statck_addr" field in kretprobe_instance structure for sanity check. It would not be much code to add, just one line to populate the stack_addr in the arch_prepare_kretprobe() and may be a couple line to do the check in the handler.

- find all left-over return probe instances when the task is going away

 This has the side effect of simplifying the implementation since more
 work can be done in kernel/kprobes.c since architecture specific knowlege
 of the stack layout is no longer required.  Specifically, we no longer have:
	- arch_get_kprobe_task()
	- arch_kprobe_flush_task()
	- get_rp_inst_tsk()
	- get_rp_inst()
	- trampoline_post_handler() <see next bullet>

* Instead of splitting the return probe handling and cleanup logic across
the pre and post trampoline handlers, all the work is pushed into the pre function (trampoline_probe_handler), and then we skip single stepping
the original function. In this case the original instruction to be single
stepped was just a NOP, and we can do without the extra interruption.


The new flow of events to having a return probe handler execute when a target
function exits is:

* At system initialization time, a kprobe is inserted at the beginning of
 kretprobe_trampoline (this has not changed)

* register_kretprobe() will insert a kprobe at the beginning of the targeted
 function with the kprobe pre_handler set to arch_prepare_kretprobe
 (still no change)

* When the target function is entered, the kprobe is fired, calling
 arch_prepare_kretprobe (still no change)

* In arch_prepare_kretprobe() we try to get a free instance and if one is
 available then we fill out the instance with a pointer to the return probe,
 the original return address, and a pointer to the task structure (instead
 of the stack address.)  Just like before we change the return address
 to the trampoline function and mark the instance as used.

If multiple return probes are registered for a given target function,
then arch_prepare_kretprobe() will get called multiple times for the same task (since our kprobe implementation is able to handle multiple kprobes at the same address.) Past the first call to arch_prepare_kretprobe, we end up with the original address stored in the return probe instance
pointing to our trampoline function. (This is a significant difference
from the original arch_prepare_kretprobe design.)


* Target function executes like normal and then returns to kretprobe_trampoline.

* kprobe inserted on the first instruction of kretprobe_trampoline is fired
 and calls trampoline_probe_handler() (no change here)

* trampoline_probe_handler() looks up the newest return probe instance associated with the current task and then:
- calls the registered handler function
- sets the pt_regs instruction pointer back to the original return address
- marks the instance as free
- returns in a way that the single stepping of the original
instruction is skipped


If there were multiple kretprobes registered for the target function, then the original return address would be trampoline_probe_handler, causing
the next instance to be handled until finally the real return address
is restored.


(Huge change)
* If the task is killed with some left-over return probe instances (meaning
that a target function was entered, but never returned), then we just
free any instances associated with the task. (Not much different other
then we can handle this without calling architecture specific functions.)


BUT... I just mark each of the instances as unused without bothering to restore their original return address. The original implementation would restore the return address for each instance (I assume because it was thought that the target function could still be running and could still return.)

On i386 we do this cleanup from process.c:exit_thread() and process.c:flush_thread(). Let me know if I am wrong, but I do not think
at this point in the task lifecycle that it is possible for the the
target functions to continue after this point and get into trouble
because of a wrong return address.


(Significant change)

This patch applies to the 2.6.12-rc6-mm1 kernel, but only for the i386
architecture.  I know this approach will work on x86_64 and ia64, and I
haven't the slightest clue if this is ok for ppc64.

--rusty


@@ -155,8 +148,8 @@ struct kretprobe_instance {
struct hlist_node uflist; /* either on free list or used list */
struct hlist_node hlist;
struct kretprobe *rp;
- void *ret_addr;
- void *stack_addr;
+ unsigned long *ret_addr;


As Ananth suggested to me before, you could use

kprobe_opcode_t * ret_addr;

+	struct task_struct *task;
};

#ifdef CONFIG_KPROBES
@@ -194,8 +187,6 @@ int register_kretprobe(struct kretprobe void unregister_kretprobe(struct kretprobe *rp);






Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]