This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
my notes from the tracing workshop
- From: Andrew Cagney <cagney at redhat dot com>
- To: systemtap at sourceware dot org
- Cc: frysk <frysk at sourceware dot org>
- Date: Fri, 01 Feb 2008 11:36:50 -0500
- Subject: my notes from the tracing workshop
[The slides get published next week]
Overview
The underlying goal of the workshop was to gather information on the
current state of tracing and monitoring technology, and identify areas
of potential research and development. The Canadian Government is
looking to significantly further research in this area; and is preparing
a report.
Broadly the talks had an embedded bent, which isn't surprising given its
organizational origins in the telco industry. There was a wide level of
representation though with both large system, and deeply embedded
viewpoints being presented.
The Technology
For most talks, the assumed approach was
<probe> -> <filtering> -> <recorder> -> $LOG
then on the host; or in user land:
$LOG -> <converter> -> "DB" -> <visualization>
so I'll talk to that.
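As a rough illustration of that shape (a minimal sketch only; the stage names, the JSON-lines log format and the event fields are my own assumptions, not anything a presenter showed), the target-side half and the start of the host-side half might look like:

```python
import io
import json


def probe(events):
    """Stand-in for a probe: yields raw event records as they 'fire'."""
    for event in events:
        yield event


def keep_slow(events, min_usec):
    """Filter stage: drop uninteresting events before they reach the recorder."""
    return (e for e in events if e["usec"] >= min_usec)


def record(events, log):
    """Recorder stage: append one JSON line per event to the log."""
    count = 0
    for e in events:
        log.write(json.dumps(e) + "\n")
        count += 1
    return count


def convert(log):
    """Converter stage (host side): parse the log back into records."""
    return [json.loads(line) for line in log if line.strip()]


# Hypothetical events; a real probe would produce these on the target.
raw = [
    {"name": "sys_read", "usec": 12},
    {"name": "sys_write", "usec": 340},
    {"name": "sys_open", "usec": 5},
]

log = io.StringIO()  # stands in for $LOG
written = record(keep_slow(probe(raw), min_usec=10), log)

log.seek(0)
records = convert(log)
```

The point of the sketch is only that filtering happens before the recorder, so the log never sees the dropped events.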
Probes
There were two technology camps (modified kernel, and dynamic
probes), with the majority in the former group. Interestingly, the
embedded players strongly indicated that deploying the modified kernel
was acceptable (even advantageous) - the systems were permanently
running in flight-recorder mode so they were in a better position to do
postmortem analysis.
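For anyone unfamiliar with the term, flight-recorder mode just means recording continuously into a fixed-size buffer and keeping only the most recent events. A minimal sketch, assuming nothing about how any of the presented systems implement it (the class and method names here are hypothetical):

```python
from collections import deque


class FlightRecorder:
    """Keep only the most recent `capacity` events, overwriting the oldest."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry when full.
        self._ring = deque(maxlen=capacity)

    def record(self, event):
        self._ring.append(event)

    def dump(self):
        """Snapshot the buffer, oldest first, e.g. after a crash."""
        return list(self._ring)


recorder = FlightRecorder(capacity=3)
for i in range(10):
    recorder.record(("tick", i))

snapshot = recorder.dump()
```

After ten events the snapshot holds only the last three, which is what makes the always-on mode affordable.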
The exceptions were SystemTAP and SensorPoint (Wind River) (and on the
edge, frysk). Both SystemTAP and SensorPoint had the same basic
approach. SensorPoint did have a djprobe-like mechanism working, and
nested(?) probes (where you could specify the call chain required to
trigger the probe - it worked by watching the functions and not by
looking at backtraces); finally the ability to replace code on live systems.
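My reading of "watching the functions" is that entry and return probes on each function in the chain keep a running match state, so no stack walk is needed when the innermost function is hit. The following is a sketch of that idea only; it is a guess at the technique, not SensorPoint's implementation, and every name in it is hypothetical:

```python
class ChainProbe:
    """Fire `handler` when the functions in `chain` are entered in order.

    Intermediate calls between chain members are allowed; the match state
    is updated on function entry and return rather than by unwinding.
    """

    def __init__(self, chain, handler):
        self.chain = list(chain)
        self.handler = handler
        self.depth = 0      # current call depth
        self.matched = 0    # number of leading chain entries currently active
        self.depths = []    # call depth at which each matched entry was seen

    def on_entry(self, func):
        self.depth += 1
        if self.matched < len(self.chain) and func == self.chain[self.matched]:
            self.matched += 1
            self.depths.append(self.depth)
            if self.matched == len(self.chain):
                self.handler(func)

    def on_return(self, func):
        # Unmatch a chain entry when the frame that matched it returns.
        if self.depths and self.depths[-1] == self.depth:
            self.depths.pop()
            self.matched -= 1
        self.depth -= 1


hits = []
chain_probe = ChainProbe(["sys_write", "vfs_write"], hits.append)

# vfs_write reached from somewhere else: the probe stays quiet.
chain_probe.on_entry("sys_read")
chain_probe.on_entry("vfs_write")
chain_probe.on_return("vfs_write")
chain_probe.on_return("sys_read")

# vfs_write reached via sys_write: the probe fires.
chain_probe.on_entry("sys_write")
chain_probe.on_entry("vfs_write")
chain_probe.on_return("vfs_write")
chain_probe.on_return("sys_write")
```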
Finally, the big and positive thing on probes was the kernel markers
being accepted. Oracle (Elena) identified that a lacking feature was
being able to query the list of possible probe points -> embedding
markers in the code (and hopefully having them documented in situ ????)
will address this. On the other hand, I picked up a few concerns
(outside of presentations): who gets to back port this (if at all); it's
an ABI, who gets to maintain it long term; and what happens when someone
refuses to accept markers in their code :-)
Filters
This is where SystemTAP and SensorPoint stood out (I think :-). Both
have the ability to filter events before pushing them to the recorder.
Using SystemTAP on the kernel markers should be a wicked combination.
[Can I assume that, when there's a marked up kernel, SystemTAP inserts
jumps instead of traps? If fche had been giving the talk, it would have
been my question :-)]
Recorders and logs
Zzzzz.
Converters
The consistent approach was to implement some sort of converter that
could read random external file formats and load them into an internal form.
While there seemed to be a push to standardize on log-file format, I got
the impression that it was solving the wrong problem (and others too).
Size really did matter.
"DB"
There was a strong consensus that the "internal" format of the log data
needed to be a fast lightweight database; two vendors were using sqlite
for instance (TPTP the eclipse tool didn't but I suspect will shortly).
Wind River presented a discussion illustrating its advantages.
There were suggestions, and it appears a strong degree of consensus, of
standardizing a database format, so that it could be shared amongst
visualization tools. I think this, and the conversion tools will gather
traction. Something SystemTAP should monitor.
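To make the "DB" idea concrete, here is a minimal sketch using sqlite from Python. The table layout, column names and sample events are assumptions of mine for illustration; no schema was actually proposed at the workshop:

```python
import sqlite3


def load_events(conn, events):
    """Converter output: load (ts_usec, cpu, name) tuples into the database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " ts_usec INTEGER NOT NULL,"
        " cpu INTEGER NOT NULL,"
        " name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO events (ts_usec, cpu, name) VALUES (?, ?, ?)", events
    )
    # An index on the timestamp keeps time-window queries cheap.
    conn.execute("CREATE INDEX IF NOT EXISTS events_by_ts ON events (ts_usec)")
    conn.commit()


def counts_by_name(conn, start_usec, end_usec):
    """The kind of query a visualization tool might run for one time window."""
    rows = conn.execute(
        "SELECT name, COUNT(*) FROM events"
        " WHERE ts_usec >= ? AND ts_usec < ?"
        " GROUP BY name ORDER BY name",
        (start_usec, end_usec),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
load_events(
    conn,
    [
        (100, 0, "sched_switch"),
        (150, 1, "irq_entry"),
        (220, 0, "sched_switch"),
        (900, 1, "irq_entry"),
    ],
)
summary = counts_by_name(conn, 0, 500)
```

With a shared schema along these lines, any visualization tool could issue the same queries regardless of which converter filled the table.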
Visualization
Many visualization tools were presented (if I see another useless
full-screen snap-shot in a slide I'll scream), most built on eclipse,
but a few were not. While this is a very crowded market, there seems,
in mnsho, to still be a need for clear simple visualization tools backed
by a database.
The quote of the day, in describing eclipse, has to be "icon diarrhea".
A few of the Talks
Me / Red Hat: SystemTAP / Frysk
(I got to do both talks).
What's the status of SystemTAP on the ARM? Ditto for Frysk.
Robert Winsiewski / IBM: Performance analysis and debugging at IBM
It was as much about IBM as a few other companies Robert had worked for;
it gave a general history of logging challenges in a number of
companies. Strongly in favor of the marker approach; and set that as a
theme. Two notable ideas were non-locked logging (the in-memory log
file format handled synchronization using atomic instructions); and
sharing memory logs between user and system.
Elena Zannoni / Oracle: Tracing at Oracle
Presented the challenges with using SystemTAP in a "binary only / clean
room" environment.
Beth Tibbits / IBM: Eclipse Parallel Tools Platform
Underneath they are using a consolidating process that then, in turn,
talks to a distributed collection of gdb processes (makes you cry :-);
this basic approach is described in Bevin Brett's paper on making
ladebug HPC. There's work to generalize this, see http://scalabletools.org/
Andrew McDermott / Wind River: Developing OS-agnostic visualization tools.
Discussed the "DB" approach for managing all that data.
Felix Burton / Wind River: Sensorpoint Technology
Wind River's rough equivalent to SystemTAP. Uses "C" for the probes.
--
I was asked if SystemTAP is supported on ARM (fche, I have the e-mail
address if you want to contact them).