8 Statistics (aggregates)

Aggregate instances collect statistics on numerical values when new data must be accumulated quickly and in large volume. These instances operate without exclusive locks and store only aggregated stream statistics. Aggregates make sense only for global variables. They are stored individually or as elements of an associative array. For information about wrapping associative arrays with statistics elements, see section 7.4.

8.1 The aggregation (<<<) operator

The aggregation operator is “<<<”, and its effect is similar to an assignment or a C++ output streaming operation. The left operand specifies a scalar or array-index l-value, which must be declared global. The right operand is a numeric expression. The meaning is intuitive: add the given number as a sample to the set of numbers to compute their statistics. The specific list of statistics to gather is given separately by the extraction functions. The following is an example.

     a <<< delta_timestamp
     writes[execname()] <<< count
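
As a slightly fuller sketch, the two lines above could sit in a complete script like the following. The probe point syscall.write and its count variable are assumed to come from the standard tapset and are not defined in this section; the variable names a and writes are only illustrative.

     # Both aggregates must be declared global.
     global a, writes

     probe syscall.write {
         a <<< count                      # scalar aggregate: all writes
         writes[execname()] <<< count     # one aggregate per process name
     }

     probe end {
         printf("total samples: %d\n", @count(a))
         foreach (name in writes)
             printf("%s: %d samples\n", name, @count(writes[name]))
     }

The script runs until it is interrupted, at which point the end probe prints the sample counts.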

8.2 Extraction functions

For each instance of a distinct extraction function operating on a given identifier, the translator computes a set of statistics. With each execution of an extraction function, the aggregation is computed for that moment across all processors. The first argument of each function is the same style of l-value as used on the left side of the aggregation operation.

8.3 Integer extractors

The following functions extract integer statistics from an aggregate.

8.3.1 @count(s)

This function returns the number of samples accumulated in aggregate s.

8.3.2 @sum(s)

This function returns the sum of all samples in aggregate s.

8.3.3 @min(s)

This function returns the minimum of all samples in aggregate s.

8.3.4 @max(s)

This function returns the maximum of all samples in aggregate s.

8.3.5 @avg(s)

This function returns the average value of all samples in aggregate s.
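
The five extractors can be combined in one report, as in this minimal sketch. It reuses the netdev.receive probe and its length variable from the histogram examples in this chapter; the dev_name variable is an assumption about that tapset probe and is not described in this section.

     global reads

     probe netdev.receive {
         reads[dev_name] <<< length       # one aggregate per network device
     }

     probe timer.s(10) {
         exit()                           # stop after ten seconds
     }

     probe end {
         # Every key in the array has received at least one sample,
         # so no extractor here is applied to an empty aggregate.
         foreach (dev in reads)
             printf("%s: n=%d sum=%d min=%d max=%d avg=%d\n", dev,
                    @count(reads[dev]), @sum(reads[dev]),
                    @min(reads[dev]), @max(reads[dev]), @avg(reads[dev]))
     }

Each extractor takes the same array-index l-value that appears on the left side of the aggregation operator.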

8.4 Histogram extractors

The following functions provide methods to extract histogram information. Printing a histogram with the print family of functions renders a histogram object as a tabular "ASCII art" bar chart.

8.4.1 @hist_linear

The expression @hist_linear(v, L, H, W) represents a linear histogram of aggregate v, where L and H are the lower and upper ends of a range of values and W is the width (or size) of each bucket within the range. The low and high values can be negative, but the overall difference (high minus low) must be positive. The width parameter must also be positive.

In the output, a range of consecutive empty buckets may be replaced with a tilde (~) character. This can be controlled on the command line with -DHIST_ELISION=<num>, where <num> specifies how many empty buckets at the top and bottom of the range to print. The default is 2. A <num> of 0 removes all empty buckets. A negative <num> disables removal.

For example, if you specify -DHIST_ELISION=3 and the histogram has 10 consecutive empty buckets, the first 3 and last 3 empty buckets will be printed and the middle 4 empty buckets will be represented by a tilde (~).

The following is an example.

     global reads
     probe netdev.receive {
         reads <<< length
     }
     probe end {
         print(@hist_linear(reads, 0, 10240, 200))
     }

This generates the following output.

     value |-------------------------------------------------- count
         0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1650
       200 |                                                      8
       400 |                                                      0
       600 |                                                      0
           ~
      1000 |                                                      0
      1200 |                                                      0
      1400 |                                                      1
      1600 |                                                      0
      1800 |                                                      0

This shows that 1650 network reads were between 0 and 199 bytes, 8 reads were between 200 and 399 bytes, and 1 read was between 1400 and 1599 bytes. The tilde (~) character indicates that the bucket for 800 to 999 bytes was omitted because it was empty. The empty buckets for 2000 bytes and larger were omitted as well.

8.4.2 @hist_log

The expression @hist_log(v) represents a base-2 logarithmic histogram of aggregate v. Consecutive empty buckets are replaced with a tilde (~) character in the same way as for @hist_linear() (see above).

The following is an example.

     global reads
     probe netdev.receive {
         reads <<< length
     }
     probe end {
         print(@hist_log(reads))
     }

This generates the following output.

     value |-------------------------------------------------- count
         8 |                                                      0
        16 |                                                      0
        32 |                                                    254
        64 |                                                      3
       128 |                                                      2
       256 |                                                      2
       512 |                                                      4
      1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 16689
      2048 |                                                      0
      4096 |                                                      0
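
Because a histogram extractor accepts the same l-values as the aggregation operator, it can also be applied to one element of an associative array. The following is a minimal sketch of a per-device histogram; the dev_name variable is an assumption about the netdev.receive tapset probe and is not described in this section.

     global reads

     probe netdev.receive {
         reads[dev_name] <<< length
     }

     probe end {
         foreach (dev in reads) {
             printf("%s\n", dev)
             print(@hist_log(reads[dev]))   # one histogram per device
         }
     }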

8.5 Deletion

The delete statement (subsection 6.3) applied to an aggregate variable will reset it to the initial empty state.
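
A common use of this is interval reporting, where an aggregate is printed and then emptied on a timer. The following is a minimal sketch that assumes a 5-second interval; the @count check is a precaution so that @avg is never applied to an aggregate holding no samples.

     global reads

     probe netdev.receive {
         reads <<< length
     }

     probe timer.s(5) {
         if (@count(reads) > 0)
             printf("last 5s: %d packets, avg %d bytes\n",
                    @count(reads), @avg(reads))
         delete reads                     # reset to the empty state
     }

After each delete, the statistics start again from zero, so every report covers only the most recent interval.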