This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.



language choices for aggregation


Hi -

I can't seem to choose among several alternative language
constructs for referring to aggregations.

As background, recall that dtrace aggregations are special objects
(usually vectors) whose values track statistics of a given expression.
These statistics can be computed incrementally without locking, making
them more efficient than global integers even for simply counting
events.  The efficiency advantage grows when tracking averages,
histograms, etc.
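
As a rough illustration of why such statistics need no locking, here
is a minimal Python sketch.  It assumes one common approach: each CPU
updates only its own slot, and the slots are merged when the value is
read.  The names Stat and PerCpuAggregate are hypothetical, and this is
not a description of how dtrace or systemtap actually implement it.

```python
from dataclasses import dataclass


@dataclass
class Stat:
    """Running statistics for one CPU; only that CPU ever writes to it."""
    count: int = 0
    total: int = 0

    def add(self, value: int) -> None:
        self.count += 1
        self.total += value


class PerCpuAggregate:
    """Hypothetical aggregate: one private slot per CPU, merged on read."""

    def __init__(self, ncpus: int) -> None:
        self.slots = [Stat() for _ in range(ncpus)]

    def record(self, cpu: int, value: int) -> None:
        # Each CPU touches only its own slot, so no lock is taken here.
        self.slots[cpu].add(value)

    def merged(self) -> Stat:
        # Partial results are combined only when somebody reads them.
        out = Stat()
        for slot in self.slots:
            out.count += slot.count
            out.total += slot.total
        return out


agg = PerCpuAggregate(ncpus=2)
agg.record(0, 10)
agg.record(1, 30)
result = agg.merged()
```

A plain global counter would instead need a lock or an atomic operation
on every event, since all CPUs would write to the same location.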

Syntax issues come up for systemtap because of the script language's
implicit typing policy, where declarations are generally absent.
Without declarations, the translator has no way of knowing certain
data collection parameters, particularly with respect to histograms
(range, resolution, linearity).  These parameters need to be known at
translation time for proper variable allocation.  But our variables
are generally undeclared, and all we infer about them at translation
time is their type.  So we need something extra.
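
To make the allocation problem concrete, here is a small Python sketch
of how a linear histogram's storage size follows from its parameters.
The parameter names (low, high, step) and the extra underflow and
overflow buckets are assumptions for illustration only, not a proposal
for the actual layout.

```python
def linear_bucket_count(low: int, high: int, step: int) -> int:
    """Buckets needed for a linear histogram over [low, high).

    One bucket per step-wide interval, plus one underflow and one
    overflow bucket (an assumed convention).
    """
    if step <= 0 or high <= low:
        raise ValueError("need step > 0 and high > low")
    interior = -(-(high - low) // step)  # ceiling division
    return interior + 2


def bucket_index(value: int, low: int, high: int, step: int) -> int:
    """Index of the bucket that a sample falls into."""
    if value < low:
        return 0  # underflow bucket
    if value >= high:
        return linear_bucket_count(low, high, step) - 1  # overflow bucket
    return 1 + (value - low) // step


nbuckets = linear_bucket_count(low=0, high=1000, step=10)
```

Without the three constants, the size of the bucket array cannot be
computed, which is why type inference alone is not enough.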


The dtrace solution looks like this, as a statement within a probe
handler:

    { ...
    @var = AGGR(expr [,ARGS])
    ... }

where AGGR names an aggregating function (sum, histogram, ...), and
ARGS would be the constants parametrizing histograms.  In this
way, it's both an accumulation operation (the assignment), and a
declaration (exactly which statistics to store in @var).

The translator could look for these pseudo-declarations in any of
several places.  But which?

(1) We can use the dtrace style of combined aggregation/declaration
(with a distinct operator, to distinguish this accumulation operation
from ordinary assignment):

    probe foo { ... var <<< AGGR (expr [,ARGS])  ... }


(2) Or, adopting the dtrace restriction that such statistics objects
must be global (not probe- or function-local), we could put the
declaration portion into the "global" block:

    global AGGR (var [,ARGS])
    probe foo { ... var <<< expr ... }
    probe end { trace var }

This would allow us to track multiple aggregates (say count, average,
and histogram) of the same value by repeated declarations:

    global count(var), sum(var), histogram(var,10,0,1000)
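
One way to picture what repeated declarations could mean at run time
is sketched below in Python: every aggregate declared for a name
receives each value accumulated into that name.  The Count and Sum
classes, the "declared" table, and the accumulate() helper are all
hypothetical stand-ins, and the histogram case is omitted.

```python
class Count:
    """Counts how many values were accumulated."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, sample: int) -> None:
        self.value += 1


class Sum:
    """Adds up the accumulated values."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, sample: int) -> None:
        self.value += sample


# Stand-in for the effect of:  global count(var), sum(var)
declared = {"var": [Count(), Sum()]}


def accumulate(name: str, sample: int) -> None:
    # Stand-in for "name <<< sample": feed every declared aggregate.
    for aggregate in declared[name]:
        aggregate.add(sample)


for sample in (3, 4, 5):
    accumulate("var", sample)

count_of_var, sum_of_var = (a.value for a in declared["var"])
```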


(3) Or, the declaration portion could be plopped into
data-*extraction* expressions, something like this:

    global var
    probe foo { ... var <<< expr ... }
    probe end { trace (AGGR (var [,ARGS])) }

This would make such AGGR() calls compulsory, but could allow
script code to extract intermediate values of these statistics
during a probe run, more like an ordinary variable:

    global var
    probe foo { ... var <<< expr ... }
    probe bar { if (avg (var) > 50) { ... } }

But expressing histogram parameters is messier here.
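
For option (3), the translator would have to work out what to store
from the extraction calls it finds in the script.  A minimal Python
sketch of that idea follows; the FIELDS_NEEDED table and the
storage_plan() helper are hypothetical, and histograms are left out
precisely because their parameters have no obvious single home here.

```python
from collections import defaultdict

# Hypothetical table: which running fields each extraction function needs.
FIELDS_NEEDED = {
    "count": {"count"},
    "sum": {"sum"},
    "avg": {"count", "sum"},
}


def storage_plan(uses):
    """Union the fields required by every (function, variable) use.

    `uses` stands for the AGGR(var) calls found anywhere in the script.
    """
    plan = defaultdict(set)
    for func, var in uses:
        plan[var] |= FIELDS_NEEDED[func]
    return dict(plan)


# Stand-in for a script containing avg(var) and count(var).
plan = storage_plan([("avg", "var"), ("count", "var")])
```

A variable with no extraction call anywhere would get an empty plan,
which is one way to see why the AGGR() calls become compulsory.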


Any suggestions?


- FChE

