This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.



RE: Interesting reading regarding dtrace, aggregation and buffers...


 

> -----Original Message-----
> From: systemtap-owner@sources.redhat.com 
> [mailto:systemtap-owner@sources.redhat.com] On Behalf Of Martin Hunt
> Sent: Thursday, June 16, 2005 2:11 PM
> To: Spirakis, Charles
> Cc: William Cohen; systemtap@sources.redhat.com
> Subject: Re: Interesting reading regarding dtrace, 
> aggregation and buffers...
> 
> On Thu, 2005-06-16 at 13:22 -0700, Spirakis, Charles wrote:
> 
> > Is this why they use the XXX = count() syntax (and other
> > aggregation-specific functions)? So they can store part of the
> > information in kernel space, flush when appropriate to user space,
> > then do the final aggregation in user space?
> 
> I don't think the internals have anything to do with the syntax.
> 
> > Note how they define aggregation at the top (which gives them this
> > ability: f(f(x1) U f(x2) U ...) == f(x1 U x2 U ...) ).
> 
> That basically defines an aggregation.  It is data that can 
> be collected per-cpu and later combined.
> 
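That merge property is what makes count() safe to compute per-CPU. A tiny
sketch of why counting has it (hypothetical Python, purely illustrative;
not SystemTap or DTrace code):

```python
# Illustrative sketch: an aggregation f is mergeable when
# f(f(x1) U f(x2) U ...) == f(x1 U x2 U ...).
# count() has this property, because per-CPU counts can simply be summed.

def count(events):
    return len(events)

cpu0_events = ["syscall"] * 3   # events seen on CPU 0
cpu1_events = ["syscall"] * 5   # events seen on CPU 1

# Aggregate per CPU, then combine the partial results (sum of counts).
per_cpu = count(cpu0_events) + count(cpu1_events)

# Aggregate everything in one place.
combined = count(cpu0_events + cpu1_events)

assert per_cpu == combined == 8
```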

When I was talking to Will about profiling, I used syntax like the
following to start the discussion:

profile_count[$process_name][$image_name][$ptregs->eip] += 1

Will mentioned this would be a problem since the amount of data is
large and the associative arrays weren't designed for that purpose. What's
interesting is that Sun's aggregation methodology allows for this.
Specifically, because of the way they are aggregating, they can
implement:

@profile_count[$process_name][$image_name][$ptregs->eip] = @count()

Where each cpu handles whatever it sees by itself (per-cpu buffering)
and when the association/aggregation space becomes full, it just flushes
what it has to user space. User space can then aggregate each of the
flushed buffers. Given that this is possible with their @count and not
with +=1 (which is also allowed, but not "an aggregator"), I suspect the
implementation did help define the syntax. In essence, aggregation
variables/functions are "write-only" as far as the probe is concerned
whereas regular variables are read/write.
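The flush-and-merge scheme described above can be sketched as follows
(hypothetical Python; the map layout, key tuples, and flush threshold are
illustrative stand-ins, not DTrace's actual buffer management):

```python
from collections import Counter

FLUSH_THRESHOLD = 2  # illustrative: flush once a per-CPU map holds this many keys

flushed_buffers = []  # stands in for the buffers handed up to user space

def record(cpu_map, key):
    """Per-CPU side: bump a counter, flushing the map when it fills up."""
    cpu_map[key] += 1
    if len(cpu_map) >= FLUSH_THRESHOLD:
        flushed_buffers.append(dict(cpu_map))
        cpu_map.clear()

# Two CPUs each see some (process, eip) samples independently.
cpu0, cpu1 = Counter(), Counter()
for key in [("bash", 0x1000), ("bash", 0x1000), ("cc1", 0x2000)]:
    record(cpu0, key)
for key in [("bash", 0x1000), ("cc1", 0x3000)]:
    record(cpu1, key)

# User-space side: the final aggregation is just a per-key sum of every
# flushed buffer, plus whatever is still resident per CPU.
total = Counter()
for buf in flushed_buffers + [cpu0, cpu1]:
    total.update(buf)

assert total[("bash", 0x1000)] == 3
```

Because count() merges by addition, the order in which buffers arrive in
user space doesn't matter, which is exactly why it works as a "write-only"
aggregator and += 1 on a shared variable doesn't.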

> > As for the buffering methodology:
> > http://docs.sun.com/app/docs/doc/817-6223/6mlkidlho?a=view
> > 
> > By default, they are per-cpu, double buffered. They do provide a lot
> > of flexibility in how the buffers are managed.
> 
> When I was talking about "tagged data" a while ago I was 
> doing this because I wanted to be able to send data in 
> specific formats so that it could be processed in user space. 
>  That is not going to make it into the initial release.
> 
> For basic aggregations (no keys), there is no need for 
> user-space storage.  For aggregated maps (which I've been 
> calling per-cpu maps) we could eventually use user-space 
> storage. It wouldn't be hard; when their internal storage 
> gets full, just dump the aggregated stats.

Do you have any more information about aggregated maps?


> 
> We also have associative arrays (maps) which are treated like 
> global variables.  They can be read and modified.  They don't 
> scale like per-cpu maps and they can't use user-space storage.
> 
> (If you are looking, I have aggregations done and am 
> currently documenting them. I will get those checked in then. 
> Per-cpu maps are not yet finished.)

Thanks, I'll take a look!

> 
> FYI, I have only defined two aggregations, Counter and Stat. 
> Counter is a per-cpu counter.  It counts events more 
> efficiently than an atomic, but is not appropriate when you 
> want to be reading it often.  Stat handles count, sum, 
> min, max, average, and histograms. 
> 
> Martin
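As a rough sketch of what a mergeable Stat like the one Martin describes
might keep per CPU (hypothetical Python; his actual implementation is not
shown in this thread), each instance carries count, sum, min, and max, and
two instances combine without revisiting the raw samples:

```python
class Stat:
    """Sketch of a mergeable statistics aggregation (count/sum/min/max)."""

    def __init__(self):
        self.count = 0
        self.sum = 0
        self.min = None
        self.max = None

    def add(self, value):
        """Per-CPU side: fold one sample into the running statistics."""
        self.count += 1
        self.sum += value
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)

    def merge(self, other):
        """Combine another per-CPU Stat into this one; no raw data needed."""
        self.count += other.count
        self.sum += other.sum
        if other.min is not None:
            self.min = other.min if self.min is None else min(self.min, other.min)
            self.max = other.max if self.max is None else max(self.max, other.max)

    @property
    def average(self):
        return self.sum / self.count if self.count else 0.0

cpu0, cpu1 = Stat(), Stat()
for v in [10, 30]:
    cpu0.add(v)
cpu1.add(20)

cpu0.merge(cpu1)
assert (cpu0.count, cpu0.sum, cpu0.min, cpu0.max) == (3, 60, 10, 30)
assert cpu0.average == 20.0
```

Histograms merge the same way (bucket-wise addition), which is what lets
all of these statistics stay per-CPU until read time.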

