This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Missing appname in lket output


Hi David,

Thanks for trying. Here are my answers:

David Boreham wrote:
The first is that sometimes the appname is missing.
It's not there either in the text output nor in the SQL database
after processing by lket_b2a. I am using the latest code
from cvs.

My command is:

stap -bM -e "probe addevent.iosyscall.* {}" -v

After LKET starts, process_snapshot() will be called, but addevent.process.fork and addevent.process.execve won't be turned on by default. So if you have any new process to be forked/execved after LKET starts up, you will lost the appname info. This is why you see the appname listed as NULL.


So you can:
stap -bMv -e "probe addevent.iosyscall {} probe addevent.process {}"
to capture the newly created processes.

I ever thought of turning on fork/execve events by default, but finally I left it to the users to decide whether they want to capture such events. Do you think we should turn on fork/execve capture by default since they have a negligible overhead?


I also noticed that the event name is no longer written to the ascii output file. This appears to be deliberate since there is no code to do it now (but the man page shows this data in the output so presumably it was at one time).

I don't remember that LKET ever wrote event names into lket.out. Only group id and hook id is written. Can you paste here a snippet of the output?



BTW dumping to SQL is either a really cool idea or really, really, annoying : I'm not sure which yet ;) I feel like the developers that will understand enough about SQL to data mine the dump may not overlap with the set of developers that are interested in kernel tracing in the first place. I keep wanting to write a C or at least Python program to crawl the trace files and extract the stuff I'm looking for. I'm particularly interested in the cluster of syscalls related to each file : correlating the open/read/write/fsync/close calls relating to the same file. I know that I could write some hideous join statement in SQL to do the same but my SQL expertise is not quite at that level yet. e.g. I want to find all system calls that took > 10ms to execute : the data is there in the table but ugh, I need to go read a SQL book to figure out the query. If that information were in the ascii output file I could write a simple script to grok the records, compute the delta between the entry and return timestamps and print out any that exceeded 10ms.

Couple of reasons for postprocessing in MySQL:
1. faster speed, capable to handle large volume data
2. standard interface, SQL is very good at querying
3. can be used as a backend of different user specific application, such as those written in php, java etc.
4. Easier to correlate different events(tables) to dig out userful info
5. avoid read the whole trace data again and again to find out different information.


But I agree with you that it requires you have SQL knowledge, especially when you want optimize the query. I ever thought of writing a collection of store procedures to be shipped with lket-b2a. Just like the application libraries such as glibc which provide the desired functionality while encapsulate the implementation details, the goal is that what the uses need to do now is only to pick up a store procedure from the database and call it, and the store procedure will give the user the desired result. So the user doesn't need to know much database table structure detail, nor much knowledge of relation database.Here I paste an example I wrote before about this. A more complicated example will have passing in arguments to make it more flexible:

This example will generate top 10 most frequently syscalls

the table to store syscall trace data is defined as:

mysql> describe syscall;
+---------+---------------------+------+-----+---------+-------+
| Field   | Type                | Null | Key | Default | Extra |
+---------+---------------------+------+-----+---------+-------+
| hookid  | tinyint(3) unsigned | NO   |     | NULL    |       |
| usec    | int(10) unsigned    | NO   | MUL | NULL    |       |
| pid     | int(10) unsigned    | NO   |     | NULL    |       |
| cpu     | tinyint(3) unsigned | NO   |     | NULL    |       |
| sysname | varchar(24)         | NO   |     | NULL    |       |
+---------+---------------------+------+-----+---------+-------+
5 rows in set (0.00 sec)

The data stored in syscall table looks like:

mysql> select * from syscall limit 6;
+--------+--------+------+-----+--------------+
| hookid | usec   | pid  | cpu | sysname      |
+--------+--------+------+-----+--------------+
|      1 | 112080 | 2942 |   6 | rt_sigaction |
|      2 | 112089 | 2942 |   6 | rt_sigaction |
|      1 | 112097 | 2942 |   6 | alarm        |
|      2 | 112102 | 2942 |   6 | alarm        |
|      1 | 112107 | 2942 |   6 | rt_sigaction |
|      2 | 112110 | 2942 |   6 | rt_sigaction |
+--------+--------+------+-----+--------------+
6 rows in set (0.00 sec)

Here the hookid=1 means entry of a syscall and hookid=2 means return from a syscall

The store procedure to generate top 10 most frequently being called syscalls is defined as:

mysql> create procedure top10_syscall()
    -> begin
    ->   select sysname, count(*) from syscall
    ->   where hookid=1
    ->     group by sysname
    ->     order by count(*) DESC
    ->     limit 10;
    -> end
    -> //
Query OK, 0 rows affected (0.01 sec)


Now the user can just call the procedure to get the top 10 syscalls:


mysql> call top10_syscall();
+--------------------+----------+
| sysname            | count(*) |
+--------------------+----------+
| kill               |  2646443 |
| getdents           |  1382303 |
| stat               |  1160844 |
| fstat              |  1139168 |
| close              |  1028697 |
| pread64            |   717298 |
| fcntl              |   694464 |
| compat_sys_fcntl64 |   694463 |
| pwrite64           |   230360 |
| unlink             |    92181 |
+--------------------+----------+
10 rows in set (17.84 sec)

Query OK, 0 rows affected (17.84 sec)


Anyway, LKET is cool so enough complaining.





Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]