This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [Ksummit-2008-discuss] DTrace
Hi -
On Tue, Jul 01, 2008 at 07:13:27PM -0400, Theodore Tso wrote:
> [...] And one of the major flaws of the Linux's RAS tools is that
> the LKML development community doesn't use them; and to the extent
> that tapsets would be written more quickly if they are easy for
> kernel developers who aren't depending on distro packaging and
> distro building of systemtap. [...]
Please excuse my return to this point, but it meshes with something
else:
> probe kernel.function ("vfs_write"),
> kernel.function ("vfs_read")
> {
> dev_nr = $file->f_dentry->d_inode->i_sb->s_dev
> inode_nr = $file->f_dentry->d_inode->i_ino
>
> if (dev_nr == ($1 << 20 | $2) # major/minor device
> && inode_nr == $3)
> printf ("%s(%d) %s 0x%x/%u\n",
> execname(), pid(), probefunc(), dev_nr, inode_nr)
> }
So, one way a kernel developer could help write a tapset piece for us
is to encapsulate this into a tapset script fragment:
probe vfs.read = kernel.function ("vfs_read")
{
dev_nr = $...expression
inode_nr = $...expression
}
Then this definition would be shipped with the kernel or systemtap,
tested in one or the other build system for currency. (Not by
coincidence, something much like that is already in our tapset; it just
lacks those two values.)
Then the end user just does
probe vfs.read { if (dev_nr == MKDEV(2,3)) printf ("whatever you want to print") }
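For reference, MKDEV(2,3) stands for the same major/minor packing that
the quoted script spells out by hand as ($1 << 20 | $2). A minimal C
sketch of that encoding follows, assuming the kernel-internal layout
with a 20-bit minor field. The sketch_ and SKETCH_ names are made up
for illustration and are not kernel or systemtap API.

```c
#include <stdint.h>

/* Sketch of the kernel-internal device number layout: the major number
 * sits above a 20-bit minor field. All names here are hypothetical,
 * chosen so they cannot clash with real kernel or libc headers. */
#define SKETCH_MINORBITS 20
#define SKETCH_MINORMASK ((1U << SKETCH_MINORBITS) - 1)

/* Pack a major/minor pair into one device number. */
static inline uint32_t sketch_mkdev(uint32_t major, uint32_t minor)
{
    return (major << SKETCH_MINORBITS) | minor;
}

/* Recover the major number from a packed device number. */
static inline uint32_t sketch_major(uint32_t dev)
{
    return dev >> SKETCH_MINORBITS;
}

/* Recover the minor number from a packed device number. */
static inline uint32_t sketch_minor(uint32_t dev)
{
    return dev & SKETCH_MINORMASK;
}
```

With this layout, major 2 and minor 3 pack to 0x200003, which is the
value the dev_nr comparison would be made against.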
**** or ****
Kernel maintainers could add a marker or two right into their C code:
vfs_read()
{
/* ... */
trace_mark (vfs_read, "dev %u inode %u whatever %s",
expression1, expression2, whatever);
/* ... */
}
And that's it. It's compiled-in, and checked as a part of your
routine builds. Then the systemtap-side interpretation code is trivial,
and anyone can write it. And it doesn't require debugging data.
probe vfs.read = kernel.mark("vfs_read") { dev_nr = $arg1; inode_nr = $arg2 }
If people could get over the funny look of the markers (since
performance effects have been shown to be negligible), they could make
a significant contribution to this problem, with just a few lines of C
code.
- FChE