This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.



Re: architecture paper draft


OK, in my late-night mail yesterday I thought I had sent the PDF attachment, but it
looks like I forgot. I am attaching it this time.


Vara Prasad wrote:

Let me try to write my initial thoughts on tapsets.

I am thinking our runtime environment will look like the following

Well, I tried drawing the picture using text lines but soon realized that by the time you get it you won't be able to read it, so I am attaching a PDF file of the above idea. We can't do everything in text easily, I guess ):-

The main idea of a tapset, as I see it, is that an expert in a given subsystem knows what is important for understanding the inner workings of that subsystem. That expert exports such vital data in the form of a tapset, using one or more functions that can be called from a probe. In other words, the expert exports the data and also announces the API for getting it, so that not everyone has to become an expert in every area of the kernel, yet everyone can get the vital data of the system.

Tapsets, just like probes, can be either PC based or asynchronous timer based. In the case of PC based tapsets, the provided tapset functions get executed when a particular kernel function gets executed; in other words, each one is a predefined probe at a given point of execution in the kernel. When a systemtap script uses a particular tapset function, all we need to do is activate that probe by registering it with kprobes, so that the kernel will execute the tapset exported function as the probe handler. For example, a scheduler expert can export in the scheduler tapset a function called procstatechange(). This function gets executed when the process state changes. In procstatechange() the expert might decide to export data such as the process ID, the original state, and the new state through an appropriate data structure.
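
To make the activation idea concrete, here is a minimal user-space sketch in Python. It is only a model of the flow described above, not kernel code: ProbeRegistry stands in for kprobes, and the Tapset class, the activate() method, and the probe point name "sched_state_change" are all hypothetical names invented for illustration. Only procstatechange() and its three exported fields come from the text.

```python
class ProbeRegistry:
    """Toy stand-in for kprobes: maps a probe point to active handlers."""

    def __init__(self):
        self.handlers = {}

    def register(self, probe_point, handler):
        self.handlers.setdefault(probe_point, []).append(handler)

    def fire(self, probe_point, **context):
        # Run every handler registered at this probe point.
        return [h(**context) for h in self.handlers.get(probe_point, [])]


class Tapset:
    """Functions a subsystem expert exports, each tied to a probe point."""

    def __init__(self, name):
        self.name = name
        self.exports = {}

    def export(self, func_name, probe_point, handler):
        self.exports[func_name] = (probe_point, handler)

    def activate(self, func_name, registry):
        # Called only when a script actually uses the tapset function.
        probe_point, handler = self.exports[func_name]
        registry.register(probe_point, handler)


def procstatechange(pid, old_state, new_state):
    # The data the scheduler expert chose to export.
    return {"pid": pid, "old_state": old_state, "new_state": new_state}


scheduler = Tapset("scheduler")
# "sched_state_change" is a hypothetical probe point name.
scheduler.export("procstatechange", "sched_state_change", procstatechange)

registry = ProbeRegistry()
# Nothing is registered yet, so firing the probe point does nothing.
before = registry.fire("sched_state_change", pid=42, old_state="R", new_state="S")
# A script uses procstatechange(), so the probe gets activated.
scheduler.activate("procstatechange", registry)
after = registry.fire("sched_state_change", pid=42, old_state="R", new_state="S")
```

The point of the sketch is that an exported function costs nothing until a script uses it; only then is its probe registered.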


As Will described below, we will have one default tapset called systemtapset. It will have three functions that can be called in probes: init, finish (or final), and error. The main idea of this tapset's init() function is to initialize the systemtap-specific data structures that are needed. final() dumps out all the collected data and does any processing and cleanup. error() is called when we find a runtime error; this can lead to not finishing the script and to unloading the module. For this tapset, these functions get executed during module initialization, error handling, and cleanup.
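
The lifecycle of this default tapset could be sketched roughly as below, again as a plain Python model and not as module code. The class name SystemTapset, the run_script() driver, and the two sample handlers are hypothetical. The sketch also assumes that final() still runs after error() so that collected data gets dumped; the text above does not settle that ordering.

```python
class SystemTapset:
    """Model of the default tapset with init(), error() and final()."""

    def __init__(self):
        self.data = None
        self.calls = []
        self.report = None

    def init(self):
        # Set up the data structures the script will fill in.
        self.calls.append("init")
        self.data = {}

    def error(self, exc):
        # Record the runtime error; the script will not run to completion.
        self.calls.append("error")
        self.data["error"] = str(exc)

    def final(self):
        # Dump collected data, then clean up.
        self.calls.append("final")
        self.report = dict(self.data)
        self.data = None


def run_script(tapset, probe_handlers):
    """Hypothetical driver standing in for module load and unload."""
    tapset.init()
    try:
        for handler in probe_handlers:
            handler(tapset.data)
    except Exception as exc:
        tapset.error(exc)
    finally:
        # Assumption: cleanup runs on both the normal and the error path.
        tapset.final()
    return tapset.report


def count_event(data):
    data["events"] = data.get("events", 0) + 1


def failing_handler(data):
    raise RuntimeError("bad pointer")


ok = SystemTapset()
ok_report = run_script(ok, [count_event, count_event])

bad = SystemTapset()
bad_report = run_script(bad, [count_event, failing_handler, count_event])
```

In the failing run the third handler never executes, which mirrors the "not finishing the script" behaviour described for error().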

OK, it is way past 2 AM. I had better go to bed; otherwise I will be sleeping in tomorrow's call. I will continue my thoughts on tapsets, hopefully tomorrow.

William Cohen wrote:

Here is a strawman for specifying locations for instrumentation.

-Will

------------------------------------------------------------------------

Instrumentation Schemes

One of the needs of the instrumentation is to bridge the gap between
the basic mechanisms used to implement the instrumentation and the
collection of specific types of data useful for diagnosing
problems. Having to specify hardcoded addresses to place probes is
inconvenient and error prone. Related types of instrumentation
techniques and operations are grouped together into instrumentation
schemes in SystemTAP.

A probe body has a probe specifier that describes which instrumentation
scheme to use. The probe specifier also contains additional
information to indicate how and where precisely the probe should be
placed.

The syntax of the probe specifier is fairly simple. There may be
additional restrictions due to the details of the instrumentation
scheme, but the following grammar describes the probe specifier
syntax:

probe_specifier : "probe" p_spec_list ;

p_spec_list : ( p_spec_element )+ /* one or more */ ;

p_spec_element : "." name opt_arguments ;

opt_arguments    :
        /* empty */
        | "(" string_list ")"
        ;

string_list    : STRING ( "," STRING )*
        ;

instrumentation_scheme : SYMBOL

/* FIXME work ability to separate declaration of probe specifier and use */
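
A small recursive-descent parser shows how the grammar above might be read in practice. This is a sketch under several assumptions that the draft does not state: the specifier is taken to start with the instrumentation scheme SYMBOL right after "probe" (as the examples further down suggest), SYMBOL is assumed to be a C-style identifier, and STRING is assumed to be a double-quoted string without escapes. The function names tokenize() and parse_probe_specifier() are made up for this example.

```python
import re

# Lexical rules are assumptions; the draft does not define SYMBOL or STRING.
TOKEN_RE = re.compile(r'''
    \s*(?:
        (?P<STRING>"[^"]*")
      | (?P<SYMBOL>[A-Za-z_][A-Za-z0-9_]*)
      | (?P<PUNCT>[.(),])
    )''', re.VERBOSE)


def tokenize(text):
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input near offset {pos}")
        kind = m.lastgroup
        tokens.append((kind, m.group(kind)))
        pos = m.end()
    return tokens


def parse_probe_specifier(text):
    """Return (scheme, [(element_name, [arguments]), ...])."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def expect(kind, value=None):
        nonlocal pos
        tok_kind, tok_value = peek()
        if tok_kind != kind or (value is not None and tok_value != value):
            raise ValueError(f"expected {value or kind}, got {tok_value!r}")
        pos += 1
        return tok_value

    expect("SYMBOL", "probe")
    scheme = expect("SYMBOL")              # instrumentation_scheme
    elements = []
    while peek() == ("PUNCT", "."):        # p_spec_list: one or more elements
        expect("PUNCT", ".")
        name = expect("SYMBOL")
        args = []
        if peek() == ("PUNCT", "("):       # opt_arguments
            expect("PUNCT", "(")
            args.append(expect("STRING").strip('"'))
            while peek() == ("PUNCT", ","):
                expect("PUNCT", ",")
                args.append(expect("STRING").strip('"'))
            expect("PUNCT", ")")
        elements.append((name, args))
    if not elements:
        raise ValueError("at least one specifier element is required")
    if pos != len(tokens):
        raise ValueError(f"trailing input: {peek()[1]!r}")
    return scheme, elements


parsed = parse_probe_specifier('probe kernel.function("sys_read", "sys_write").return')
```

Here parsed holds the scheme name "kernel" plus two elements: "function" with its two string arguments, and "return" with none.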

Some possible instrumentation schemes:


probe systemtap.(init | fini)


The "systemtap" instrumentation scheme is the most basic. There are
two possible specifier elements: "init" and "fini". The probe specifier
"systemtap.init" specifies a probe point that executes before any
other probe in the instrumentation script fires. The probe specifier
"systemtap.fini" instruments a point after the last firing of a probe
in the instrumentation script.

"init" and "fini" could be implemented as part of the module
initialization and finalization code.



probe kernel.function(name_list)(/* implicit entry */| .entry | .return)

The probe specifier above can be used to instrument the entry and
return of functions. If neither ".entry" nor ".return" is included, it
is assumed that the function entry will be instrumented.  The function
probe will have the argument list of the function available to it.

The ".return" element instruments the code just before the return to
the function that called the instrumented function. Local variables are
not available in this case because the frame for the function has
already been removed. The return value of the instrumented function
will be available to the ".return" probe.
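
As a user-space analogy for the entry and return semantics just described, the following Python sketch wraps a function so that an entry handler sees the argument list and a return handler sees only the return value. The decorator probe_function(), the handlers, and the sample function do_read() are hypothetical; real probes would be placed through kprobes, not by wrapping.

```python
import functools

events = []


def on_entry(name, args, kwargs):
    # Entry probe: the function's arguments are available.
    events.append(("entry", name, args))


def on_return(name, result):
    # Return probe: only the return value is available, no locals.
    events.append(("return", name, result))


def probe_function(entry=None, ret=None):
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if entry is not None:
                entry(func.__name__, args, kwargs)
            result = func(*args, **kwargs)
            if ret is not None:
                ret(func.__name__, result)
            return result
        return wrapper
    return decorate


@probe_function(entry=on_entry, ret=on_return)
def do_read(fd, count):
    buffered = min(count, 512)  # a local the return handler never sees
    return buffered


value = do_read(3, 4096)
```

The return handler receives the result but has no way to reach the local variable buffered, which matches the restriction stated above.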

/* FIXME details on how arguments and return value accessed */



Use of Kernel Data structures for Probe points

Due to the use of modules and device drivers, there are a number of
common data structures that are used to pass lists of methods to the
kernel. These data structures have well-defined methods to implement
actions.  The instrumentation could walk these data structures and
extract the locations of the functions used to implement various
operations.

Below is a proposed probe specifier for the virtual file system:

probe vfs.filesystem(name_list).(file|inode|sb).operation(name_list)(/* implicit entry */| .entry | .return)


The underlying kprobe mechanism is still instrumenting functions much
like the "kernel" instrumentation scheme for functions. However, the
addresses of the functions are obtained by walking the data structures.
The ".operation" element indicates which operation or method should be
instrumented.

Like the function boundary instrumentation of the kernel
instrumentation scheme, arguments will be available on function entry
and return values on function return.
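
The table-walking step could look roughly like the sketch below. It models the kernel's per-filesystem operation tables as nested Python dictionaries; the table contents, the stub function names such as ext3_read, and the helper resolve_probe_targets() are all invented stand-ins, not real kernel symbols or a real lookup API.

```python
def make_stub(name):
    """Create a named placeholder for a kernel function."""
    def stub():
        return name
    stub.__name__ = name
    return stub


# Hypothetical model: filesystem -> table (file, inode, sb) -> operation.
OPERATION_TABLES = {
    "ext3": {
        "file": {"read": make_stub("ext3_read"), "write": make_stub("ext3_write")},
        "inode": {"lookup": make_stub("ext3_lookup")},
    },
    "nfs": {
        "file": {"read": make_stub("nfs_read")},
    },
}


def resolve_probe_targets(filesystems, table, operations):
    """Walk the tables and collect the functions to instrument."""
    targets = []
    for fs in filesystems:
        ops = OPERATION_TABLES.get(fs, {}).get(table, {})
        for op in operations:
            func = ops.get(op)
            if func is not None:   # skip operations a filesystem lacks
                targets.append(func)
    return targets


# Corresponds to something like vfs.filesystem(ext3, nfs).file.operation(read)
targets = resolve_probe_targets(["ext3", "nfs"], "file", ["read"])
names = [f.__name__ for f in targets]
```

Once the functions are resolved this way, each one would be handed to the same function-boundary mechanism that the "kernel" scheme uses.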

/* FIXME flesh out for other data structures in the kernel */

/* FIXME instrumentation schemes for user space */

probe syscall.operation(name_list)(/* implicit entry */| .entry | .return)

This may be built on top of the audit infrastructure.  Arguments may be
an issue here because they are going to be in user space. The return
value should be available on return.


Other variables available at probe locations









Attachment: block diagram.pdf
Description: Adobe PDF document

