This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.



Re: Review of LKST probe points


Frank Ch. Eigler wrote:
Hi, Will -



I reviewed where LKST puts the various instrumentation points it
provides.  [...]


Thanks!

I looked through the LTT probe points and they are very similar.


[...]
The "*" and "**" mean there are some issues that need to be addressed to get the probe point, e.g. the instrumentation is in an inlined function or in the middle of a function.


Inlined functions are not a big problem, assuming dwarf decoding.
Each supplied probe would just get "forked" for each actual inlining
point.  Function interior probe points are likewise made possible via
dwarf, but the task of conveniently specifying the insertion point
remains.  An explicit marker in the object code is of course one
option, but would require kernel changes with all the controversy that
usually entails.

I will take a closer look at the inline functions, particularly the spinlocks. Spinlocks are something that people are really interested in. We will probably want finer control than instrumenting either all of the spinlocks or none of them, since instrumenting all of them could be pretty expensive. It would make sense to be able to place instrumentation based on the lock variable, e.g. instrument all lock operations for variable "blah". The debug registers that monitor a memory location might help with that.
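To make the per-variable idea concrete, here is a rough user-space sketch in C of filtering lock events by the address of the lock variable. Everything in it (watch_lock, is_watched, on_lock_event, the fixed-size table) is a hypothetical name made up for illustration; it is not LKST or systemtap code, and it does not use the debug registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WATCHED 8                 /* arbitrary limit for this sketch */

static const void *watched[MAX_WATCHED];
static size_t n_watched;
static unsigned long traced_events;   /* events that passed the filter */

/* Register one lock address to trace; returns false when the table is full. */
static bool watch_lock(const void *lock)
{
    assert(lock != NULL);
    if (n_watched == MAX_WATCHED)
        return false;
    watched[n_watched++] = lock;
    return true;
}

/* Linear scan; acceptable only because the table is tiny. */
static bool is_watched(const void *lock)
{
    for (size_t i = 0; i < n_watched; i++)
        if (watched[i] == lock)
            return true;
    return false;
}

/* Hypothetical handler called on every lock operation. */
static void on_lock_event(const void *lock)
{
    if (!is_watched(lock))
        return;                       /* cheap early exit for other locks */
    traced_events++;                  /* a real probe would log data here */
}
```

The handler still runs on every lock operation, so this only cuts the cost of recording, not the cost of being called. Avoiding the call altogether is where something like the debug registers would come in.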



LKST Event
PROCESS_CONTEXTSWITCH	**inline function context_switch, if it wasn't
			** probe kernel.function("context_switch")


In case it's not obvious, it would be reasonable to expose these
standardized tracing points to systemtap as an additional "coordinate
system"

probe lkst(PROCESS_CONTEXTSWITCH) { ... }

so a user does not have to map them explicitly.

Something like the above makes sense. In general there are going to be things like this. Having people know which function to instrument is a level of detail that we would like to abstract away. One of the value-adds of SystemTap should be that the instrumentation returns data that maps to things that people have control over and/or understand.





[...]
TIMER_ADD		** in middle timer_on(), between spinlock/unlock


Is the precise placement of the probe point within a locking sequence
like this important?  What effect would putting the probe past that
last unlock have?

In some cases the placement of the probe point is going to be important, for example when reading data out of a data structure protected by the spinlock/unlock. Another example is the instrumentation for the workqueues.



[...]
/* LKST management events */
LKST_INIT
LKST_KERNEL_DUMP


Can you outline what these are for?

Some of the LKST_ operations are for management of LKST. These events are listed in lkst-events-2.1.pdf, which can be downloaded from


http://sourceforge.net/project/showfiles.php?group_id=41854&package_id=34057

LKST_INIT		mark beginning of trace.
LKST_KERNEL_DUMP	mark when a dump is forced
LKST_MSET_XCHG		mark when lkst mask configuration changed
LKST_BUFF_SHIFT		mark end of this lkst buffer and start of next
LKST_BUFF_OVFLOW	not used
LKST_SYNC_UID		mark change in UID
LKST_SYNC_GID		mark change in GID
LKST_SYNC_PGID		mark change in PGID
LKST_SYNC_TID		mark change in TID
LKST_EXTEND		not used
LKST_EXTENDE		not used
LKST_BUFF_OVWRTN	mark when buffer
LKST_ETYPE_MAX		last event

[...]
Implemented in LKST in inline functions:

LK_SPINLOCK
LK_SPINTRYLOCK
LK_SPINUNLOCK
[...]


I believe the systemtap scripts will need access to such primitives
too, though restricted in such a way that they can't screw up the
system.  For example, a spinlock-protected structure may need a lock
on an SMP host, which in systemtap may get expressed declaratively,
something like this new control structure:

probe foo {
  spinlock ($var)
    { /* ... */
    }
}


and expanded to a nonblocking construct such as this:

      {
        if (! spin_trylock_irqsave (& VAR, flags))
          { ++ probe_locking_abort; return; }

        /* ... */

        spin_unlock_irqrestore (& VAR, flags);
      }

Given debuginfo data, systemtap may be able to infer which of the many
kinds of lock $var is (checking whether its type is spinlock_t, etc.)
and implicitly use the correct underlying functions.  That would allow
use of a generalized "lock" control structure instead of multiple
explicit "spinlock", "semaphorelock", etc. ones.

It would be nice if the probe could be placed inside the spinlock region of the instrumented code. This would ensure that the data is gathered and would minimize perturbation, since the lock would not have to be acquired twice.


-Will

