This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.



RE: architecture paper draft


Vara has identified two basic kinds of taps:
- program-counter based
- interrupt based

These make sense from the perspective of the Systemtap
implementor. Script writers, however, will likely be
thinking about source-code constructs rather than machine
code:
- procedures (before, after, etc.). This is the "systemtapset", right?
- source-code features (call sites, line numbers, labels)
- software events (synchronous)
- hardware events (asynchronous)
I think the first three would be program-counter based and
the fourth would be interrupt based. So part of the design 
ought to be considering both the implementation perspective 
and the constructs that script writers would think about.

A third type I'd add to Vara's list is "compound taps".
For example, a subset of scheduler-related taps from the 
systemtapset might be used as a part of the scheduler
tap set. You might also have a compound tap that was an
existing tap plus a condition, such as a procstatechange()
event when NewState==PROC_READY.
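
To make the "existing tap plus a condition" idea concrete, here is a minimal
sketch in Python. It is only an illustration, not a proposed implementation:
the Tap class, compound_tap(), and the event field names are all hypothetical,
and a real tap would be driven by kprobes rather than by direct calls.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class Tap:
    """A named probe point that calls its handlers whenever it fires."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.handlers: List[Handler] = []

    def subscribe(self, handler: Handler) -> None:
        self.handlers.append(handler)

    def fire(self, event: Event) -> None:
        for handler in self.handlers:
            handler(event)


def compound_tap(name: str, base: Tap, condition: Callable[[Event], bool]) -> Tap:
    """Build a tap that fires only when `base` fires and `condition` holds."""
    derived = Tap(name)

    def forward(event: Event) -> None:
        if condition(event):
            derived.fire(event)

    base.subscribe(forward)
    return derived


# Hypothetical scheduler tap and event fields, for illustration only.
procstatechange = Tap("procstatechange")
proc_ready = compound_tap(
    "proc_ready", procstatechange, lambda e: e["NewState"] == "PROC_READY"
)

seen: List[Event] = []
proc_ready.subscribe(seen.append)

# Simulate two state changes; only the first satisfies the condition.
procstatechange.fire({"pid": "42", "OldState": "PROC_BLOCKED", "NewState": "PROC_READY"})
procstatechange.fire({"pid": "42", "OldState": "PROC_READY", "NewState": "PROC_RUNNING"})
```

The script writer subscribes to proc_ready exactly as to any other tap, so
the condition stays hidden inside the compound tap.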

Brad

-----Original Message-----
From: systemtap-owner@sources.redhat.com
[mailto:systemtap-owner@sources.redhat.com] On Behalf Of Vara Prasad
Sent: Thursday, March 31, 2005 7:31 AM
To: Vara Prasad
Cc: William Cohen; systemtap@sources.redhat.com; Frank Ch. Eigler
Subject: Re: architecture paper draft

OK, in my late-night mail yesterday I thought I sent the PDF attachment,
but it looks like I forgot. I am attaching it this time.

Vara Prasad wrote:

> Let me try to write my initial thoughts on tapsets.
>
> I am thinking our runtime environment will look like the following
>
>  
> Well, I tried drawing the picture using text lines but soon realized that
> by the time you get it you won't be able to read it, hence I am
> attaching a PDF file of the above idea. We can't do everything in text
> easily, I guess ):-
>
> The main idea of a tapset, as I see it, is that an expert in a given
> subsystem knows what is important for understanding the inner workings
> of that subsystem. That expert will export such vital data in the form
> of a tapset, using one or more functions that can be called from a
> probe. In other words, the expert will export the data and also announce
> the API for getting it, so that not everyone has to become an expert in
> all areas of the kernel, yet everyone can get the vital data of the
> system.
>
> Tapsets, just like probes, can be either PC based or asynchronous timer
> based. In the case of PC-based tapsets, the provided tapset functions
> get executed when a particular kernel function gets executed; in other
> words, each is a predefined probe at a given point of execution in the
> kernel. When a systemtap script uses a particular tapset function, all
> we need to do is activate that probe by registering it with kprobes so
> that the kernel will execute the tapset-exported function as the probe
> handler. For example, a scheduler expert can export in the scheduler
> tapset a function called procstatechange(). This function will get
> executed when the process state changes. In procstatechange() the
> expert might decide to export data such as the process ID, the original
> state, and the new state through an appropriate data structure.
>
> As Will described below, we will have one default tapset called
> systemtapset. It will have three functions that can be called in
> probes: init, final (or finish), and error. The main idea of this
> tapset's init() function is to initialize the systemtap-specific
> data structures that are needed. final() dumps out all the collected
> data and does any processing and cleanup. error() is called when we
> find any runtime errors; this can lead to the script not finishing
> and the module being unloaded. For this tapset, these functions get
> executed during module initialization, error handling, and cleanup.
>
> OK, it is way past 2 AM; I had better go to bed, otherwise I will be
> sleeping in tomorrow's call. I will continue my thoughts on
> tapsets, hopefully tomorrow.
>
> William Cohen wrote:
>
>> Here is a strawman for specifying locations for instrumentation.
>>
>> -Will
>>
>>
>> ------------------------------------------------------------------------
>>
>> Instrumentation Schemes
>>
>> One of the needs of the instrumentation is to bridge the gap between
>> the basic mechanisms used to implement the instrumentation and
>> collecting specific types of data useful for diagnosing
>> problems. Having to specify hard-coded addresses to place probes is
>> inconvenient and error-prone. Related types of instrumentation
>> techniques and operations are grouped together into instrumentation
>> schemes in SystemTAP.
>>
>> A probe body has a probe specifier that describes which instrumentation
>> scheme to use. The probe specifier also contains additional
>> information to indicate how and where precisely the probe should be
>> placed.
>>
>> The syntax of the probe specifier is fairly simple. There may be
>> additional restrictions due to the details of the instrumentation
>> scheme, but the following grammar describes the probe specifier
>> syntax:
>>
>> probe_specifier : "probe" p_spec_list ;
>>
>> p_spec_list    : ( p_spec_element )+ /* one or more */ ;
>>
>> p_spec_element    : "." name opt_arguments ;
>>
>> opt_arguments    :
>>         /* empty */
>>         | "(" string_list ")"
>>         ;
>>
>> string_list    : STRING ( "," STRING )*
>>         ;
>>
>> instrumentation_scheme : SYMBOL
>>
>> /* FIXME work out ability to separate declaration of probe specifier
>> and use */
>>
>> Some possible instrumentation schemes:
>>
>>
>> probe systemtap.(init | fini)
>>
>> The "systemtap" instrumentation scheme is the most basic. There are
>> two possible specifier elements: "init" and "fini". The probe specifier
>> "systemtap.init" specifies a probe point that executes before any
>> other probe in the instrumentation script fires. The probe specifier
>> "systemtap.fini" instruments a point after the last firing of a probe
>> in the instrumentation script.
>>
>> "init" and "fini" could be implemented as part of the module
>> initialization and finalization code.
>>
>>
>>
>> probe kernel.function(name_list)(/* implicit entry */| .entry | .return)
>>
>> The probe specifier above can be used to instrument the entry and
>> return of functions. If neither ".entry" nor ".return" is included, it
>> is assumed that the function entry will be instrumented.  The function
>> probe will have the argument list of the function available to it.
>>
>> The ".return" instruments the code just before the return to the
>> function that called the instrumented function. Local variables are
>> not available in this case because the frame for the function has
>> already been removed. The return value of the instrumented function
>> will be available.
>>
>> /* FIXME details on how arguments and return value accessed */
>>
>>
>>
>> Use of Kernel Data structures for Probe points
>>
>> Due to the use of modules and device drivers, there are a number of
>> common data structures that are used to pass lists of methods to the
>> kernel. These data structures have well-defined methods to implement
>> actions.  The instrumentation could walk these data structures and
>> extract the location of functions used to implement various
>> operations.
>>
>> Below is a proposed probe specifier for the virtual file system.
>> probe vfs.filesystem(name_list).(file|inode|sb).operation(name_list)
>>       (/* implicit entry */| .entry | .return)
>>
>> The underlying kprobe mechanism is still instrumenting functions much
>> like the "kernel" instrumentation scheme for functions. However, the
>> addresses of the functions are obtained by walking the data structures.
>> The ".operation" indicates which operation or method should be
>> instrumented.
>>
>> Like the function boundary instrumentation of the kernel
>> instrumentation scheme, arguments will be available on function entry
>> and return values on function return.
>>
>> /* FIXME flesh out for other data structures in the kernel */
>>
>> /* FIXME instrumentation schemes for user space */
>>
>> probe syscall.operation(name_list)(/* implicit entry */| .entry | .return)
>>
>> This may be built on top of the audit infrastructure.  Arguments may be
>> an issue here because they are going to be in user space. The return
>> value should be available on return.
>>
>>
>> Other variables available at probe locations
>>
>>
>>  
>>
>
>
>
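
As a rough illustration of the probe specifier grammar quoted above, the
following Python sketch parses a specifier into its scheme and elements. It
rests on assumptions the strawman does not state: that the specifier starts
with the instrumentation scheme SYMBOL (as in the "systemtap.init" and
"kernel.function" examples), that STRING is a double-quoted literal, and that
"name" is an ordinary identifier. The function names and the "sys_open"
argument in the usage note are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# One token per match: an identifier, a double-quoted string, or punctuation.
TOKEN = re.compile(
    r'\s*(?:(?P<name>[A-Za-z_]\w*)|(?P<string>"[^"]*")|(?P<punct>[.(),]))'
)

Token = Tuple[Optional[str], Optional[str]]
Element = Tuple[str, List[str]]


def tokenize(text: str) -> List[Token]:
    tokens: List[Token] = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


def parse_probe_specifier(text: str) -> Tuple[str, List[Element]]:
    """Parse: "probe" SCHEME ( "." name [ "(" STRING ("," STRING)* ")" ] )+"""
    tokens = tokenize(text)
    pos = 0

    def peek() -> Token:
        return tokens[pos] if pos < len(tokens) else (None, None)

    def expect(kind: str, value: Optional[str] = None) -> str:
        nonlocal pos
        tok_kind, tok_value = peek()
        if tok_kind != kind or (value is not None and tok_value != value):
            raise ValueError(f"expected {value or kind}, got {tok_value!r}")
        pos += 1
        return tok_value

    expect("name", "probe")
    scheme = expect("name")  # assumed: leading SYMBOL names the scheme
    elements: List[Element] = []
    while peek() == ("punct", "."):
        expect("punct", ".")
        name = expect("name")
        args: List[str] = []
        if peek() == ("punct", "("):  # opt_arguments
            expect("punct", "(")
            args.append(expect("string").strip('"'))
            while peek() == ("punct", ","):
                expect("punct", ",")
                args.append(expect("string").strip('"'))
            expect("punct", ")")
        elements.append((name, args))
    if not elements:
        raise ValueError("expected at least one specifier element")
    if pos != len(tokens):
        raise ValueError(f"trailing input: {peek()[1]!r}")
    return scheme, elements
```

For example, parse_probe_specifier('probe kernel.function("sys_open").return')
yields the scheme "kernel" with the elements function("sys_open") and return.
Any scheme-specific restrictions, such as which elements may follow which,
would have to be checked in a separate pass.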

