This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: Accessing target variables in return probes (was Re: src ChangeLog tapsets.cxx)


On Tuesday, November 21, 2006 11:56 AM, David Smith wrote:
> Stone, Joshua I wrote:
>> * The tvar_ctr needs to be per-thread, else you get a race condition.
>> Consider the simple case where two threads run the same function,
>> interleaving counter access.  (i.e., T1 enters, T2 enters, T1
>> returns, T2 returns).  The threads will end up reading each other's
>> data! 
> 
> Sigh.  I can't believe I missed this since I sat down and wrote out a
> similar situation.  Actually the threads will end up getting zeros
> (the default value).

Ah, yes, didn't quite map that out right in my head -- it will keep the
threads' data separate.  Zero values and mangled counters are still a
problem, of course.

> To do per-thread counters, I think I'll need another map to store the
> counter value for that thread.  So, accessing cached parameter would
> look something like:
> 
> tvar_cached_data[tid(), tvar_ctr[tid()]]
> 
> Does that look like it would work?  Got any better ideas?

That should work just fine.

>> * As a later optimization, consider merging duplicate target-var
>> references into a single temp call.  For example, 'probe
>> syscall.open.return { if ($flags) print($flags) }' right now
>> duplicates the effort for each $flags reference.
> 
> Currently the code to get the actual parameter value gets merged to a
> single function.  You are correct that the parameter caching map calls
> get duplicated.

The dwarf function only exists once, but the new entry probe, the
temporary maps, and the read from the maps are all duplicated.

> This actually might be considered a requirement, since if you have two
> different return probes for the same function that access the same
> cached parameter, we want them to be separate (so the counters are
> handled correctly).

For distinct probes, sure, it's much easier to keep them separate.  We
don't want the headache of separate probes trying to coordinate what
order they access it in, or which one should delete and decrement.  But
when the reference is duplicated within a single probe, it seems like it
would be easy to read the value from the same place, especially if
you're going to read it into a local at the top of the probe anyway.

This sort of issue may crop up more often if scripts get more elaborate
with probe aliasing -- like this intentionally convoluted example:

/* a custom tapset that defines an interesting probe point */
probe sysopen_with_flags = syscall.open.return {
	if (!$flags) next
}

/* a script file that uses it. */
probe sysopen_with_flags {
	printf("sysopen flags:%d = %d\n", $flags, $return)
}

> I'm thinking that it might be possible to inject the code at the
> *beginning* of the function, with the help of a temporary variable.
> Something like this:
> 
> probe kernel.function("foo").return
> {
> { /* start of injected code */
>     tvar_cache_tmp = tvar_cached_data[tid(), tvar_ctr[tid()]]
>     delete tvar_cached_data[tid(), tvar_ctr[tid()]--]
>    /* end of injected code */
> }
> ... rest of original probe ...
> }
> 
> then use 'tvar_cache_tmp' everywhere the target variable was accessed.

That looks good.  You can also cache tid() into a local instead of
paying for the function call four times.

You can probably use some internal bookkeeping to track which variables
you've injected code for, and just reuse the tvar_cached_temp if it's
already there.  This solves the multiple-reference problem above.

> Josh, do you (or anyone else) know if it would be a good idea to
> delete the counter map value when it reaches 0?

No need -- setting a value to zero removes it from the map.

>> * A potential bug: due to the way string targets are read, only the
>> string pointer is being saved for the return probe.  The
>> 'kernel/user_string' call happens in the return probe.  In most cases
>> this is probably ok, but there might be some functions where that
>> string is manipulated during the course of execution.  The cleanest
>> way I can think to solve this is to extend the language to
>> automatically treat a target like a string -- probably using
>> '@mystring' like we do with positional parameters.  Behind the
>> scenes we can look at the pointer range to figure whether to use
>> kernel_string or user_string. 
> 
> Hmm, I wonder if we could use a function for that and force users to
> use it everywhere.  It would be safer and easier.

A function for automatically choosing user_string or kernel_string?
Yeah, that would be a good thing to have.

I still like the idea of a '@var' syntax for target strings too...


Josh


(David will get this twice, as I forgot to CC the list the first time.
Sorry!)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]