Accuracy of disk statistics I/O counter
Problem
(Transcribed from this mailing list message.)
The problem was a mismatch between the I/O counts and sizes expected from a microbenchmark load and the counts and sizes actually reported.
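To make the "reported" side of such a comparison concrete, here is a minimal Python sketch. It assumes the kernel counters are read from /proc/diskstats; the original message does not say which interface was used. The function name `parse_diskstats_line` and the sample line are hypothetical and are not data from the investigation.

```python
def parse_diskstats_line(line):
    """Extract write counters from one /proc/diskstats line.

    Assumed field layout: major, minor, device name, then the per-device
    counters, of which the 5th is writes completed and the 7th is sectors
    written. Sectors are counted in 512-byte units.
    """
    fields = line.split()
    name = fields[2]
    writes_completed = int(fields[7])
    sectors_written = int(fields[9])
    kb_written = sectors_written * 512 // 1024
    return name, writes_completed, kb_written


if __name__ == "__main__":
    # Made-up sample line for illustration only.
    sample = "   8       0 sda 1000 50 80000 1200 2000 300 160000 4500 0 3000 5700"
    name, writes, kb = parse_diskstats_line(sample)
    print(f"{name}: {writes} writes completed, {kb} KB written")
```

Taking this reading before and after the benchmark run and subtracting gives the kernel's view of the load, which can then be set against what the benchmark was expected to issue.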
Scripts
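# Histogram of write commands: index is the size in KB, value is the count.
global rqs
# SCSI address (host:channel:id:lun) of the device to trace.
global lun, id, channel, host_no

probe begin
{
    host_no = 0
    channel = 0
    id = 7
    lun = 12
}

probe module("*scsi_mod*").function("scsi_dispatch_cmd")
{
    # Keep only writes: 1 is DMA_TO_DEVICE in enum dma_data_direction.
    if (1 != $cmd->sc_data_direction) next
    # Skip commands addressed to any other device.
    if (lun != $cmd->device->lun) next
    if (id != $cmd->device->id) next
    if (channel != $cmd->device->channel) next
    if (host_no != $cmd->device->host->host_no) next
    # Bucket by transfer size in KB (integer division truncates).
    rqs[$cmd->request_bufflen / 1024]++
}

probe end
{
    # Print "size_in_KB count" pairs in ascending order of size.
    foreach (rec+ in rqs)
        printf("%d %d\n", rec, rqs[rec])
    exit()
}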
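The script prints one line per request size: the size in KB followed by the number of write commands of that size. To compare this against the kernel's counters, the histogram has to be reduced to a total request count and a total volume. The following is a minimal Python sketch of that step, assuming the script's output has been captured as text. The function name `summarize_histogram` and the sample data are hypothetical and are not taken from the original investigation.

```python
def summarize_histogram(text):
    """Return (total_requests, total_kb) from lines of 'size_kb count'."""
    total_requests = 0
    total_kb = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unrelated lines
        size_kb, count = int(fields[0]), int(fields[1])
        total_requests += count
        total_kb += size_kb * count
    return total_requests, total_kb


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = "4 10\n64 3\n128 1\n"
    requests, kb = summarize_histogram(sample)
    print(f"{requests} write commands, about {kb} KB in total")
```

Because the script buckets sizes with integer division by 1024, the total volume computed this way is a lower bound rather than an exact byte count.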
Lessons
Sometimes the standard statistics provided by the kernel do not correspond to the actual low-level activity, and probing the SCSI dispatch path directly shows what is really sent to the device. "SystemTAP really rocks for this type of analysis!"
