This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.
Let me try to write my initial thoughts on tapsets.
I am thinking our runtime environment will look like the following.
Well, I tried drawing the picture using text lines, but soon realized that by the time it reached you it would be unreadable, so I am attaching a PDF file of the idea instead. We can't do everything easily in text, I guess ):-
The main idea of a tapset, as I see it, is that an expert in a given subsystem knows what is important for understanding the inner workings of that subsystem. That expert exports such vital data in the form of a tapset, using one or more functions that can be called from a probe. In other words, the expert exports the data and also announces the API for getting it, so that not everyone has to become an expert in every area of the kernel, yet everyone can get the vital data of the system.
Tapsets, just like probes, can be either PC based or asynchronous timer based. In the case of PC based tapsets, the provided tapset functions get executed when a particular kernel function gets executed; in other words, each one is a predefined probe at a given point of execution in the kernel. When a systemtap script uses a particular tapset function, all we need to do is activate that probe by registering it with kprobes, so that the kernel will execute the tapset exported function as the probe handler. For example, a scheduler expert can export in the scheduler tapset a function called procstatechange(). This function will get executed when the process state changes. In procstatechange() the expert might decide to export data such as the process ID, the original state, and the new state through an appropriate data structure.
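To make the procstatechange() example a bit more concrete, here is a rough user-space sketch in Python. It is only a model of the idea, not kernel code: FakeKprobes stands in for the kprobes registration layer, and the probe point name "scheduler.state_change", the ProcStateChange record, and the activate() method are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProcStateChange:
    """Data the scheduler expert chooses to export (fields from the example)."""
    pid: int
    old_state: str
    new_state: str


class FakeKprobes:
    """Stand-in for the kprobes registration layer; purely illustrative."""

    def __init__(self) -> None:
        self.handlers: Dict[str, List[Callable]] = {}

    def register(self, probe_point: str, handler: Callable) -> None:
        self.handlers.setdefault(probe_point, []).append(handler)

    def fire(self, probe_point: str, *args) -> None:
        # Simulates the kernel reaching the probed point of execution.
        for handler in self.handlers.get(probe_point, []):
            handler(*args)


class SchedulerTapset:
    # Hypothetical name for the predefined probe point.
    PROBE_POINT = "scheduler.state_change"

    def __init__(self) -> None:
        self.events: List[ProcStateChange] = []

    def procstatechange(self, pid: int, old_state: str, new_state: str) -> None:
        # The tapset exported function, run as the probe handler.
        self.events.append(ProcStateChange(pid, old_state, new_state))

    def activate(self, kprobes: FakeKprobes) -> None:
        # Done only when a script actually uses procstatechange().
        kprobes.register(self.PROBE_POINT, self.procstatechange)


kprobes = FakeKprobes()
tapset = SchedulerTapset()
tapset.activate(kprobes)
kprobes.fire("scheduler.state_change", 1234, "RUNNING", "SLEEPING")
print(tapset.events[0])
```

The point of the sketch is that the script author only sees procstatechange() and the exported record; where the probe sits in the scheduler is the expert's business.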
As Will described below, we will have one default tapset called systemtapset. It will have three functions that can be called in the probes: init, finish or final, and error. The main idea is that this tapset's init() function initializes the systemtap specific data structures that are needed. final() dumps out all the collected data and does any processing and cleanup. error() is called when we find any runtime errors; this can lead to the script not finishing and the module being unloaded. For this tapset, these functions get executed during module initialization, error handling, and cleanup.
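The lifecycle described here could be modelled roughly as follows, again in user-space Python rather than module code. The class name SystemTapset, the run_script() driver, and the two sample probe bodies are hypothetical. Whether final() still runs after error() is not settled above; this sketch assumes it does, so that cleanup always happens.

```python
class SystemTapset:
    def __init__(self):
        self.data = None
        self.calls = []  # records the order in which the hooks ran

    def init(self):
        # Set up the data structures the script needs.
        self.calls.append("init")
        self.data = {}

    def final(self):
        # Dump the collected data and clean up.
        self.calls.append("final")
        report = dict(self.data)
        self.data = None
        return report

    def error(self, exc):
        # Invoked on a runtime error; the script does not continue.
        self.calls.append(f"error: {exc}")


def run_script(tapset, probe_bodies):
    tapset.init()                  # module initialization
    try:
        for body in probe_bodies:  # stands in for probes firing
            body(tapset.data)
    except Exception as exc:
        tapset.error(exc)          # remaining probes are skipped
    finally:
        report = tapset.final()    # assumption: cleanup runs in both cases
    return report


def count_event(data):
    data["events"] = data.get("events", 0) + 1


def failing_probe(data):
    raise RuntimeError("bad pointer")


print(run_script(SystemTapset(), [count_event, count_event]))

bad = SystemTapset()
run_script(bad, [count_event, failing_probe, count_event])
print(bad.calls)
```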
O.K., it is way past 2 AM. I had better go to bed, otherwise I will be sleeping through tomorrow's call. I will continue my thoughts on tapsets, hopefully tomorrow.
William Cohen wrote:
Here is a strawman for specifying locations for instrumentation.
-Will
------------------------------------------------------------------------
Instrumentation Schemes
One of the needs of the instrumentation is to bridge the gap between the basic mechanisms used to implement the instrumentation and the collection of specific types of data useful for diagnosing problems. Having to specify hardcoded addresses to place probes is inconvenient and error prone. Related types of instrumentation techniques and operations are grouped together into instrumentation schemes in SystemTAP.
A probe body has a probe specifier that describes which instrumentation scheme to use. The probe specifier also contains additional information to indicate how and where precisely the probe should be placed.
The syntax of the probe specifier is fairly simple. There may be additional restrictions due to the details of the instrumentation scheme, but the following grammar describes the probe specifier syntax:
probe_specifier : "probe" p_spec_list ;
p_spec_list : ( p_spec_element )+ /* one or more */ ;
p_spec_element : "." name opt_arguments ;
opt_arguments : /* empty */ | "(" string_list ")" ;
string_list : STRING ( "," STRING )* ;
instrumentation_scheme : SYMBOL ;
/* FIXME work ability to separate declaration of probe specifier and use */
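As a rough illustration of how such a specifier might be parsed, here is a small recursive-descent sketch in Python. It makes several assumptions that the grammar above does not settle: the specifier starts with a bare instrumentation scheme SYMBOL (as in the examples, e.g. "kernel"), followed by zero or more "." name opt_arguments elements; STRING is a double-quoted string; and name is an identifier. The names Element, tokenize, and parse_probe_specifier are hypothetical.

```python
import re
from typing import List, NamedTuple, Tuple


class Element(NamedTuple):
    name: str
    args: Tuple[str, ...]


# One token: an identifier, a double-quoted string, or punctuation.
TOKEN = re.compile(r'\s*(?:(?P<sym>[A-Za-z_]\w*)|(?P<str>"[^"]*")|(?P<punct>[.(),]))')


def tokenize(text: str) -> List[Tuple[str, str]]:
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def parse_probe_specifier(text: str) -> List[Element]:
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def expect(kind, value=None):
        nonlocal pos
        tok_kind, tok_value = peek()
        if tok_kind != kind or (value is not None and tok_value != value):
            raise ValueError(f"expected {value or kind}, got {tok_value!r}")
        pos += 1
        return tok_value

    def opt_arguments() -> Tuple[str, ...]:
        # opt_arguments : /* empty */ | "(" string_list ")"
        if peek() != ("punct", "("):
            return ()
        expect("punct", "(")
        args = [expect("str").strip('"')]
        while peek() == ("punct", ","):
            expect("punct", ",")
            args.append(expect("str").strip('"'))
        expect("punct", ")")
        return tuple(args)

    expect("sym", "probe")
    elements = [Element(expect("sym"), ())]  # assumed leading scheme symbol
    while peek() == ("punct", "."):
        expect("punct", ".")
        elements.append(Element(expect("sym"), opt_arguments()))
    if pos != len(tokens):
        raise ValueError(f"trailing input starting at {peek()[1]!r}")
    return elements


for element in parse_probe_specifier('probe kernel.function("sys_read", "sys_write").return'):
    print(element)
```

Scheme-specific restrictions (for example, that only ".entry" or ".return" may follow "function") would be checked in a later pass over the returned list, not in the parser itself.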
Some possible instrumentation schemes:
probe systemtap.(init | fini)
The "systemtap" instrumentation scheme is the most basic. There are two possible specifier elements: "init" and "fini". The probe specifier "systemtap.init" specifies a probe point that executes before any other probe in the instrumentation script fires. The probe specifier "systemtap.fini" instruments a point after the last firing of a probe in the instrumentation script.
"init" and "fini" could be implemented as part of the module initialization and finalization code.
probe kernel.function(name_list)(/* implicit entry */| .entry | .return)
The probe specifier above can be used to instrument the entry and return of functions. If neither ".entry" nor ".return" is included, it is assumed that the function entry will be instrumented. The function probe will have the argument list of the function available to it.
The ".return" element instruments the code just before the return to the function that called the instrumented function. Local variables are not available in this case because the frame for the function has already been removed. The return value of the instrumented function will be available.
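A loose user-space analogy for the entry/return split, written in Python: a wrapper calls an entry handler with the arguments and a return handler with the return value. This is not how kprobes works internally, and the instrument() helper and handler signatures are hypothetical; it only illustrates what each kind of probe gets to see.

```python
import functools


def instrument(func, on_entry=None, on_return=None):
    """Wrap func so handlers run at entry and just before return to the caller."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if on_entry is not None:
            on_entry(func.__name__, args, kwargs)  # arguments are available
        result = func(*args, **kwargs)
        if on_return is not None:
            on_return(func.__name__, result)       # only the return value is
        return result
    return wrapper


def add(a, b):
    total = a + b  # a local: gone by the time the return handler runs
    return total


trace = []
probed_add = instrument(
    add,
    on_entry=lambda name, args, kwargs: trace.append(("entry", name, args)),
    on_return=lambda name, result: trace.append(("return", name, result)),
)
probed_add(2, 3)
print(trace)
```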
/* FIXME details on how arguments and return value accessed */
Use of Kernel Data structures for Probe points
Due to the use of modules and device drivers, there are a number of common data structures that are used to pass lists of methods to the kernel. These data structures have well-defined methods to implement actions. The instrumentation could walk these data structures and extract the location of the functions used to implement various operations.
Below is a proposed probe specifier for the virtual file system.
probe vfs.filesystem(name_list).(file|inode|sb).operation(name_list)(/* implicit entry */| .entry | .return)
The underlying kprobe mechanism is still instrumenting functions much like the "kernel" instrumentation scheme for functions. However, the addresses of the function are obtained by walking the data structures. The ".operation" indicates which operation or method should be instrumented.
Like the function boundary instrumentation of the kernel instrumentation scheme, arguments will be available on function entry and return values on function return.
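The data structure walk could look something like the following Python model. Everything here is illustrative: REGISTRY is a made-up stand-in for the kernel's per-filesystem operation tables, the ext3_* functions are dummies, and resolve_targets() is a hypothetical helper. In the kernel, the walk would yield function addresses to hand to kprobes.

```python
from typing import Callable, Dict, List, Tuple


def ext3_file_read():
    return "ext3 read"


def ext3_file_write():
    return "ext3 write"


def ext3_inode_lookup():
    return "ext3 lookup"


# filesystem -> table kind ("file", "inode", "sb") -> operation -> function
OperationTables = Dict[str, Dict[str, Dict[str, Callable]]]

REGISTRY: OperationTables = {
    "ext3": {
        "file": {"read": ext3_file_read, "write": ext3_file_write},
        "inode": {"lookup": ext3_inode_lookup},
        "sb": {},
    },
}


def resolve_targets(
    registry: OperationTables,
    filesystems: List[str],
    table: str,
    operations: List[str],
) -> List[Tuple[str, Callable]]:
    """Walk the tables and collect the functions implementing each operation."""
    targets = []
    for fs in filesystems:
        ops = registry.get(fs, {}).get(table, {})
        for op in operations:
            func = ops.get(op)
            if func is not None:  # a table may leave an operation unset
                targets.append((f"{fs}.{table}.{op}", func))
    return targets


# Roughly: probe vfs.filesystem("ext3").file.operation("read", "mmap")
for label, func in resolve_targets(REGISTRY, ["ext3"], "file", ["read", "mmap"]):
    print(label, func.__name__)
```

Operations that a filesystem does not implement are silently skipped in this sketch; a real implementation might prefer to warn the script author.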
/* FIXME flesh out for other data structures in the kernel */
/* FIXME instrumentation schemes for user space */
probe syscall.operation(name_list)(/* implicit entry */| .entry | .return)
This may be built on top of the audit infrastructure. Arguments may be an issue here because they are going to be in user space. The return value should be available on return.
Other variables available at probe locations
Attachment:
block diagram.pdf
Description: Adobe PDF document