This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.



RE: Sequence ordering and timestamps in systemtap...


All --

A new timestamp tapset has been checked in
(.../src/tapsets/timestamp/timestamp_tapset.txt) which discusses the
current thoughts and requirements for a reasonable ordering and
timestamp scheme (including a brief mention of how LTT and DTrace do
things).

Regarding the email below: 

> -----Original Message-----
> From: Martin Hunt [mailto:hunt@redhat.com] 
> Sent: Wednesday, June 15, 2005 10:13 AM
> To: Spirakis, Charles
> Cc: William Cohen
> Subject: Re: Sequence ordering and timestamps in systemtap...
> 
> Good summary.  Do you want to post to the mailing list?
> 
> On Tue, 2005-06-14 at 16:22 -0700, Spirakis, Charles wrote:
> > For sequence ordering, the initial implementation should use ?? the
> >    atomic_t form for the sequence ordering (since it is guaranteed
> >    to be platform and architecture neutral)?? and modify/change the
> >    implementation later if there is a problem.
> 
> After deciding last week that a sequence counter was probably 
> necessary, I spent this week looking into atomic variables 
> and SMP scalability as part of my performance testing of 
> aggregations and per-cpu maps. I have eliminated atomics.  On 
> large SMP systems and NUMA systems, it could be a real issue, 
> although I don't have a way to measure it.  But if we can 
> find a way to get something more efficient, that would be 
> great. I think timestamps would be good enough if they are 
> synchronized between cpus within a few hundred ns. Even with 
> sequence numbers, we don't really know for sure if event A 
> happened first but event B got written out first.
> 

For handling sequence numbers (or timestamps), how important is it to
have a unique value for each output vs. a unique value for each probe
execution? Specifically, if we are worried about the overhead, is it
reasonable to get that value once per probe execution and use it for
all stp_xxx()'s within the probe? The value could be fetched on probe
entry if you want ordering based on probe execution, or on first use of
stp_xxx() if you only want ordering for actual output.
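
To make the "fetch once per probe execution" idea concrete, here is a
rough user-space sketch of the lazy variant (value fetched on first
output). Everything in it is hypothetical and only for illustration:
struct probe_ctx, probe_enter(), probe_seq(), emit() and the global_seq
counter are made-up names, not existing runtime code, and a C11 atomic
stands in for whatever sequence source we end up choosing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical global sequence source (assumption: any monotonic
 * counter or timestamp read could be substituted here). */
static atomic_uint_fast64_t global_seq;

/* Hypothetical per-probe-execution context. */
struct probe_ctx {
    bool     have_seq;  /* has this execution fetched its value yet? */
    uint64_t seq;       /* value shared by all output of this execution */
};

/* Called on probe entry: no sequence number has been taken yet. */
static void probe_enter(struct probe_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->have_seq = false;
    ctx->seq = 0;
}

/* Returns the sequence number for this probe execution, touching the
 * shared counter only on the first call. */
static uint64_t probe_seq(struct probe_ctx *ctx)
{
    assert(ctx != NULL);
    if (!ctx->have_seq) {
        ctx->seq = atomic_fetch_add(&global_seq, 1);
        ctx->have_seq = true;
    }
    return ctx->seq;
}

/* Stand-in for an stp_xxx() output call: every record written by the
 * same probe execution carries the same sequence number. */
static int emit(struct probe_ctx *ctx, char *buf, size_t len,
                const char *msg)
{
    return snprintf(buf, len, "%llu %s",
                    (unsigned long long)probe_seq(ctx), msg);
}
```

With this shape, a probe that produces no output never touches the
shared counter at all. Calling probe_seq() from probe_enter() instead
would give the ordering-by-probe-execution variant.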

Are you still willing to keep sequence numbering conceptually separate
from timestamps? If we truly can get the timestamp down to "nanoseconds
since 1970" and that value has reasonable resolution, then the
separation may not be as important, but there are times in the
profiling world where you want to get a trace history, and having the
elements ordered (or mostly ordered) is very desirable.

In essence, just like there is a $timestamp, there could also be a
$sequence_number which is unique for any particular probe execution
environment. Assuming (for performance) the timestamp was also only
captured once per probe execution, then for some people: $timestamp ==
$sequence_number.
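
As a rough illustration of the "$timestamp == $sequence_number" case,
the sketch below reads the clock once per probe execution and reuses
that one reading for both values. It is user-space code under stated
assumptions: struct probe_stamp, now_ns() and stamp_probe() are
hypothetical names, and POSIX clock_gettime(CLOCK_REALTIME) stands in
for whatever "nanoseconds since 1970" source the runtime would use.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical pair of values captured once per probe execution. */
struct probe_stamp {
    uint64_t timestamp_ns;     /* nanoseconds since 1970 */
    uint64_t sequence_number;  /* here simply the same reading */
};

/* Read the wall clock as nanoseconds since the epoch. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_REALTIME, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* One clock read per probe execution serves as both values. */
static struct probe_stamp stamp_probe(void)
{
    struct probe_stamp s;
    s.timestamp_ns = now_ns();
    s.sequence_number = s.timestamp_ns;
    return s;
}
```

Note that two probe executions on different CPUs could still read the
same timestamp, so this only gives "mostly ordered" output unless a
tie-breaker (a CPU id, say) is added.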

