This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Implementing a generic binary trace interface.


Hi,

Jose R. Santos wrote:
> Hi folks,
>
> My team is currently implementing a trace tool using SystemTap that
> currently does logging by means of printf mechanism.  We want to move to
> a binary trace format but there is no such mechanism on SystemTap for
> doing this.  I've looked at what the folks at Hitachi have done with
> BTI, but this seems to force a specific trace format that is not
> suitable for what we need.  Ideally, the trace format should be left to
> the tapset using the interface not the BTI.  I propose to slightly alter
> the BTI from Hitachi to allow other trace implementations to use the
> trace format that's most convenient for the people implementing them.

We had already recognized the problems with the previous BTI implementation,
and we have developed a more generic method called gBTI (generic Binary
Transport Interface), which can be used alongside the current ATI.
gBTI itself imposes no formatted structure; it carries only an array of
64-bit integers (int64_t). It simply transports those binary values from the
tapset to the user daemon, working like an IP packet envelope.

To support gBTI, I think SystemTap needs to introduce only the three features below:
1. _stp_binary_write() function in runtime.
2. stpd enhancement to handle binary data correctly.
3. some wrapping functions for user scripts (e.g. lket_trace(), binary_log(), etc.)
So I think supporting gBTI is much easier and simpler than supporting other
methods.

1. runtime function
gBTI provides a runtime function called _stp_binary_write() for tapsets.
_stp_binary_write() should be invoked with a count as the first argument,
followed by a variable argument list. In other words, the synopsis is:

 void _stp_binary_write(int num, ...);

The "num" argument specifies the number of arguments that follow it.
Each of those arguments must be of type int64_t.
For example, any of the following forms can be used:

 _stp_binary_write(3, (int64_t)arg1, (int64_t)arg2, (int64_t)arg3);
 _stp_binary_write(2, (int64_t)current->pid, (int64_t)current);
 _stp_binary_write(5, (int64_t)lkst_header1(), (int64_t)lkst_header2(),
                  (int64_t)etype, (int64_t)arg1, (int64_t)arg2);

 And so on.
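
The calling convention above can be modeled in user space. The sketch below
is not the gbti.stp runtime; pack_binary() and its buffer parameters are
hypothetical names used only for illustration. It shows why every trailing
argument needs the (int64_t) cast: va_arg fetches a full 64-bit value per
argument, so an uncast int would be misread.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space model of the calling convention: copy `num`
 * int64_t arguments into `out` and return the number of bytes written. */
static size_t pack_binary(unsigned char *out, size_t cap, int num, ...)
{
	va_list ap;
	size_t used = 0;
	int i;

	assert(out != NULL);
	if (num <= 0)
		return 0;

	va_start(ap, num);
	for (i = 0; i < num && used + sizeof(int64_t) <= cap; i++) {
		/* Each argument is fetched as a full 64-bit value, which is
		 * why callers must cast every argument to int64_t. */
		int64_t v = va_arg(ap, int64_t);
		memcpy(out + used, &v, sizeof(v));
		used += sizeof(v);
	}
	va_end(ap);
	return used;
}
```

A call such as pack_binary(buf, sizeof(buf), 2, (int64_t)pid, (int64_t)flags)
then stores two 8-byte values back to back, in host byte order.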

The _stp_binary_write() function writes the data as a binary packet into the
relayfs channel that is shared with the current ATI. gBTI also shares the
sequence ID with the ATI by using _stp_seq_inc(). The packet format is
described below.

gBTI packet:
[seq-id][\0][num][arg1][arg2]....[arg(num)]

On the other hand, an ATI packet is:
[seq-id][string][\0]
(The length of the string is greater than 0; this is checked by the runtime functions.)

Thus, gBTI can share the channel safely: if the first character after the
seq-id is '\0', the packet is a binary data packet; otherwise it is an ASCII
packet.
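
As an illustration of the binary format, here is a minimal sketch of reading
one packet back on the same host that wrote it (so byte order matches). The
header mirrors struct stp_packet_h from the attached gbti.stp;
read_binary_args() is a hypothetical helper, not part of stpd. Because the
packed header is 6 bytes long, the arguments are not 8-byte aligned, so the
sketch copies them out with memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as struct stp_packet_h in gbti.stp, with stdint types. */
struct packet_header {
	int32_t seq;
	uint8_t flag; /* 0 marks a binary packet */
	uint8_t num;  /* number of 64-bit arguments that follow */
} __attribute__((packed));

/* Hypothetical reader: copy up to `max` arguments of the binary packet at
 * `pkt` into `args`. Returns the argument count, or -1 for an ASCII packet. */
static int read_binary_args(const unsigned char *pkt, int64_t *args, int max)
{
	struct packet_header hd;
	int i, n;

	assert(pkt != NULL && args != NULL);
	memcpy(&hd, pkt, sizeof(hd));
	if (hd.flag != 0)
		return -1;

	n = hd.num < max ? hd.num : max;
	for (i = 0; i < n; i++) {
		/* The payload starts at offset 6, so it is not 8-byte aligned;
		 * memcpy avoids an unaligned 64-bit load. */
		memcpy(&args[i], pkt + sizeof(hd) + (size_t)i * sizeof(int64_t),
		       sizeof(int64_t));
	}
	return n;
}
```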

2. stpd enhancement
We should enhance the stpd daemon to handle both ASCII packets and binary
packets correctly, but this is not so difficult.
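
To give a rough idea of the daemon-side work, the sketch below finds the
length of the next packet in a buffer that mixes both kinds. It is not stpd
code: next_packet_len() and SEQ_SIZE are hypothetical names, and the 4-byte
seq-id is an assumption taken from the int32_t field in the attached
gbti.stp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEQ_SIZE 4 /* assumed: int32_t seq, as in struct stp_packet_h */

/* Hypothetical helper: return the length of the packet starting at `p`,
 * or 0 if fewer than a whole packet remains. Sets *is_binary. */
static size_t next_packet_len(const unsigned char *p, size_t remaining,
			      int *is_binary)
{
	size_t len;
	const unsigned char *nul;

	assert(p != NULL && is_binary != NULL);
	if (remaining < SEQ_SIZE + 1)
		return 0;

	if (p[SEQ_SIZE] == '\0') {
		/* gBTI packet: seq, flag byte 0, num, then num 64-bit args */
		if (remaining < SEQ_SIZE + 2)
			return 0;
		*is_binary = 1;
		len = SEQ_SIZE + 2 + (size_t)p[SEQ_SIZE + 1] * sizeof(int64_t);
		return len <= remaining ? len : 0;
	}

	/* ATI packet: seq, then a non-empty NUL-terminated string */
	*is_binary = 0;
	nul = memchr(p + SEQ_SIZE, '\0', remaining - SEQ_SIZE);
	return nul ? (size_t)(nul - p) + 1 : 0;
}
```

A reader loop would call this repeatedly, advancing by the returned length and
dispatching on is_binary, and would stop when 0 is returned to wait for more
data.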

3. wrapping functions
I also think we can define various interfaces, for example LKST, LKET,
or a more generic binary_log() interface, on top of the gBTI runtime.
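
As one example of what such a wrapper does, the attached lkst_gbti.stp packs
the event type, CPU number and pid into a single 64-bit header word before
calling _stp_binary_write(). The sketch below restates that packing in plain
C together with the matching unpacking. The helper names are hypothetical,
and the 8-bit CPU field is my reading of the shifts in the script, which does
not mask the CPU number itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers mirroring the header word built in lkst_gbti.stp:
 * bits 63..40 event type, bits 39..32 CPU number, bits 31..0 pid. */
static int64_t lkst_pack_header(uint32_t etype, uint8_t cpu, uint32_t pid)
{
	assert(etype < (1u << 24)); /* event type is limited to 24 bits */
	return (int64_t)(((uint64_t)etype << 40) | ((uint64_t)cpu << 32) | pid);
}

/* Unpack each field again; shifts are done on the unsigned value. */
static uint32_t lkst_etype(int64_t h) { return (uint32_t)((uint64_t)h >> 40); }
static uint8_t  lkst_cpu(int64_t h)   { return (uint8_t)((uint64_t)h >> 32); }
static uint32_t lkst_pid(int64_t h)   { return (uint32_t)(uint64_t)h; }
```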

What do you think about gBTI?

I already have a concrete implementation of gBTI (for relayfs). I developed
it as a tapset script that includes the runtime part.
I have attached two files to this mail.

- gbti.stp:      gBTI core runtime (_stp_binary_write())
- lkst_gbti.stp: LKST compatible wrapping function of gBTI

Please review them.

Best regards,

-- 
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: hiramatu@sdl.hitachi.co.jp

// generic Binary Transport Interface script
// Copyright (C) 2006 Hitachi, Ltd., Systems Development Laboratory
// Written by Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>
// Instead of replacing the whole runtime, this file includes a C source.

%{
#ifndef _GBTI_C_ /* -*- linux-c -*- */
#define _GBTI_C_

#include <linux/config.h>
#include <linux/percpu.h>
#include "io.c"

#ifndef MAXBINARYARGS
#define MAXBINARYARGS 127
#endif

#define STP_BIN_PACKET 0

struct stp_packet_h { 
	int32_t seq;
	u_int8_t flag;
	u_int8_t num;
}  __attribute__((packed));

#ifdef STP_RELAYFS

/* need to disable irqs */
static void _stp_binary_write (int num, ...)
{
	va_list vargs;
	int i;
	void *ptr;
	unsigned length;

	/* reject empty and negative counts; a negative num would wrap length */
	if (unlikely(num <= 0))
		return;

	if (unlikely(num > MAXBINARYARGS))
		num = MAXBINARYARGS;
	length = sizeof(struct stp_packet_h) + num*sizeof(int64_t);

	/*
	 * I wonder if I should disable irqs here. SystemTap probe
	 * is not reentrant. Aren't we already doing this atomically?
	 */
	{
	unsigned long flags;
	local_irq_save(flags);
	ptr = relay_reserve(_stp_chan, length);
	local_irq_restore(flags);
	}

	if (ptr != 0) {
		struct stp_packet_h *hd = ptr;
		int64_t *args = ptr + sizeof(struct stp_packet_h);
		hd->seq = _stp_seq_inc();
		hd->flag = STP_BIN_PACKET;
		hd->num = num;
		va_start(vargs, num);
		for (i = 0; i < num; i++) {
			args[i] = va_arg(vargs, int64_t);
		}
		va_end(vargs);
	}
}
#endif

#endif /* _GBTI_C_ */
%}

function gbti_init () %{
	/* do nothing */
%}

// lkst generic binary transport interface script
// Copyright (C) 2006 Hitachi, Ltd., Systems Development Laboratory
// Written by Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp>

%{
static void _stp_binary_write (int num, ...);
%}
/*
I'd like to provide a new interface:
function binary_log:unknown (num:long, ...)
It should be a special embedded function.
*/

/* This interface is basically same as classic LKST. */
function lkst_trace_n(n:long, etype:long, arg1:long, arg2:long, arg3:long, arg4:long) %{
	int64_t tsc;
	rdtscll(tsc);
	/* The size of classic LKST's etype is 12 bits.
	 In SystemTap, it is expanded to 24 bits. */
	_stp_binary_write (THIS->n+2,
			  THIS->etype << 40 | 
			  (int64_t)smp_processor_id() << 32 | current->pid,
			  tsc, THIS->arg1, THIS->arg2, 
			  THIS->arg3, THIS->arg4);
%}

function lkst_trace(etype:long, arg1:long, arg2:long, arg3:long, arg4:long) %{
	int64_t tsc;
	rdtscll(tsc);
	_stp_binary_write (6,
			  THIS->etype << 40 | 
			  (int64_t)smp_processor_id() << 32 | current->pid,
			  tsc, THIS->arg1, THIS->arg2, 
			  THIS->arg3, THIS->arg4);
%}


probe begin {
      /* initial event */
      gbti_init()
      log("start tracing")
      lkst_trace_n (4, EVENT_SYNCTIME, gettimeofday_s(), gettimeofday_us(),
      		   cpu_khz(), get_tsc());
}

probe end {
      log("end tracing")
}
