
health monitoring scripts


Hi -

I was asked to share some snippets of an old idea regarding a possibly
compelling application for systemtap.  Here goes, from a few months
back:

------------------------------------------------------------------------

The technical gist of the idea would have several parts: to create a
suite of systemtap script fragments (a new "health" tapset); and to
build one or more front-ends for monitoring the ongoing probes
graphically and via other tools.

Each tapset piece would represent a single subsystem or concern.  A
variety of probes could combine to present a view of its health
(tracking allocation counts, rates of activity, latency trends,
whatever makes sense).  The barest sketch ...

   $ cat tapset/health/resource-levels.stp
   # shared state for all health.* resources
   global _resource_levels
   global _resource_min
   global _resource_max

   $ cat tapset/health/resourceX-simple.stp
   probe begin { _resource_min["resourceX"]=100
                 _resource_max["resourceX"]=999999 }
   # track the level via kprobes on the (imaginary) resourceX routines
   probe kernel.function("resourceX_init")
         { _resource_levels["resourceX"] = $value }
   probe kernel.function("resourceX_alloc")
         { _resource_levels["resourceX"]-- }
   probe kernel.function("resourceX_free")
         { _resource_levels["resourceX"]++ }
   # "never" alias: a name front-ends can reference to pull this file in
   probe health.resourceX = never {}

   $ cat tapset/health/resourceY-markers.stp
   probe begin { _resource_min["resourceY"]=1 
                 _resource_max["resourceY"]=1000
                 _resource_levels["resourceY"]=_rY_currentValue() }
   # pull out starting value via embedded C
   function _rY_currentValue:long () %{
         THIS->__retvalue = some_kernel_function(); %}
   probe kernel.mark("resourceY_alloc")
         { _resource_levels["resourceY"] -= $arg1 }
   probe kernel.mark("resourceY_free")
         { _resource_levels["resourceY"] += $arg1 }
   probe health.resourceY = never {}

   $ cat tapset/health/resourceZ-rate.stp
   probe begin { _resource_min["resourceZ"]=0 }
   probe begin { _resource_max["resourceZ"]=10000 }
   global _resourceZ_occur
   probe kernel.mark("eventZ") { _resourceZ_occur <<< 1 }
   probe health.resourceZ = timer.s(1) /* per-second event count */ {
         _resource_levels["resourceZ"] = @count(_resourceZ_occur)
         delete _resourceZ_occur
   }

   $ ls tapset/health/*.stp  # figments of my imagination
   resourceX-simple.stp
   resourceY-markers.stp
   resourceZ-rate.stp
   network-buffers.stp
   page-pool.stp
   hw-interrupt-rate.stp
   process-count.stp
   ...

   $ cat resource-monitor.stp
   probe $1 {}   # activate the health.* alias named on the command line
   probe timer.s(10) {
         foreach(resource in _resource_levels)
                printf("%d,%s,%d\n",
                        gettimeofday_s(),
                        resource, _resource_levels[resource])
   }

   $ cat resource-alarm.stp
   probe $1 {}   # activate the health.* aliases named on the command line
   probe timer.s(10) {
         foreach(resource in _resource_levels) {
                if (_resource_levels[resource] < _resource_min[resource])
                        printf("%d,%s,%d,LOW\n",
                               gettimeofday_s(),
                               resource, _resource_levels[resource])
                else if (_resource_levels[resource] > _resource_max[resource])
                        printf("%d,%s,%d,HIGH\n",
                               gettimeofday_s(),
                               resource, _resource_levels[resource])
         }
   }


And here's how to use it:
   
   # stap resource-monitor.stp 'health.resourceX'
   1234000,resourceX,30
   1234010,resourceX,40
   1234020,resourceX,50

This could go straight into a graphical trace viewer such as the one
timoore is starting to work on.

   # stap resource-alarm.stp 'health.*'
   1237030,resourceX,40,HIGH
   1238050,resourceZ,50,LOW

This could go into syslog/etc. notifications.
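
For instance (just a sketch; the tag and priority are placeholders),
the alarm stream could be piped to syslog with the standard logger(1)
tool:

   # stap resource-alarm.stp 'health.*' | logger -t stap-health -p daemon.warning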


So what's neat here?

All this code can be running in the kernel in the background.
Different subsets of resources can be monitored / alarmed
concurrently.  Resources not selected for monitoring would cause no
space/time penalties.
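
For example (reusing the imaginary aliases above), separate sessions
could watch different subsets side by side; each one compiles in only
the tapset files its probe pattern pulls in:

   # stap resource-monitor.stp 'health.resourceX' &
   # stap resource-alarm.stp 'health.resource*' &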

A new resource can be defined by installing one extra script file (the
"tapset/health/resourceX-simple.stp" file above).  None of the
user-facing front-ends would need to be changed.  A resource can span
whatever systemtap can probe - including value/rate/trend statistics
derived from probing e.g. userspace ("number of threads started per
unit time").  A kernel or other subsystem maintainer could encode his
knowledge of a variety of thresholds/constraints into script form.
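
A minimal sketch of such a script, following the resourceZ-rate.stp
pattern (the thresholds and the do_fork probe point are placeholders,
not a worked-out choice):

   $ cat tapset/health/thread-rate.stp
   probe begin { _resource_min["thread-rate"]=0
                 _resource_max["thread-rate"]=500 }
   global _thread_starts
   # count process/thread creations per second
   probe kernel.function("do_fork") { _thread_starts <<< 1 }
   probe health.thread_rate = timer.s(1) {
         _resource_levels["thread-rate"] = @count(_thread_starts)
         delete _thread_starts
   }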

If the resource-tracking code is based on tracepoints/markers instead
of kprobes, then there should be no performance concerns about the
expense of ongoing resource tracking.  Systemtap would also work
without debuginfo for these.  "health-monitoring" could be a simple
and intuitive enough idea for kernel subsystem people to feel
motivated to provide tracepoints for.
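
For instance (purely hypothetical marker names), the kprobe-based
resourceX script above could be rewritten without debuginfo as:

   probe kernel.mark("resourceX_alloc") { _resource_levels["resourceX"]-- }
   probe kernel.mark("resourceX_free")  { _resource_levels["resourceX"]++ }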


What do you think?


- FChE

