This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
Re: [BUG] syscall.unlink no longer works after upgrading kernel to 3.7.3
- From: Josh Stone <jistone at redhat dot com>
- To: Zheng Da <zhengda1936 at gmail dot com>
- Cc: Mark Wielaard <mjw at redhat dot com>, agentzh <agentzh at gmail dot com>, "systemtap at sourceware dot org" <systemtap at sourceware dot org>
- Date: Tue, 28 May 2013 15:51:20 -0700
- Subject: Re: [BUG] syscall.unlink no longer works after upgrading kernel to 3.7.3
- References: <CAB4Tn6PdW3GOa09z_tfjQs=F+7XLOqMr5+c5GourX5e0v8FMeQ at mail dot gmail dot com> <1360054656 dot 3837 dot 13 dot camel at bordewijk dot wildebeest dot org> <51114188 dot 60400 at redhat dot com> <CAFLer83DQhCQg7Y3NKR0EUYePzp+fETDTeYEthUXKarAySM0_g at mail dot gmail dot com> <20130528191449 dot GA31042 at toonder dot wildebeest dot org> <CAFLer81NE1bocCbufPTtLLZ-pZz2kVA5r3rKoCgjmc_6w+fwng at mail dot gmail dot com> <51A511FD dot 8010006 at redhat dot com> <20130528203223 dot GA768 at toonder dot wildebeest dot org> <CAFLer80-zMFK9-qqGFdsiacrYPmzD=-qxdMO_a+TP71XbpGxXA at mail dot gmail dot com>
On 05/28/2013 02:35 PM, Zheng Da wrote:
> Yes, it's my own script. Here is the code:
> probe kernel.function("scsi_device_unbusy") {
>   if ($sdev->host->host_no == 9 && $sdev->id == 1) {
>     printf("sdev on node %d, host on node %d\n",
>            addr_to_node($sdev), addr_to_node($sdev->host));
>     exit();
>   }
> }
> The script works in Linux 3.2.12.
OK, this also works for me on 3.9.2-200.fc18.x86_64. I don't hit that
particular host_no/id combination on my machine, but the probe itself
does fire.
> systemtap actually can find the right location of scsi_device_unbusy,
> but it doesn't show its parameters.
> $ stap -L 'kernel.function("scsi_device_unbusy")'
> kernel.function("scsi_device_unbusy@drivers/scsi/scsi_lib.c:318")
>
> I ran eu-readelf -N --debug-dump=info
> /usr/lib/debug/lib/modules/3.8.12/vmlinux and the entry for
> scsi_device_unbusy is shown below:
> [43d72e9] subprogram
> external (flag) Yes
> name (strp) "scsi_device_unbusy"
> decl_file (data1) 1
> decl_line (data2) 318
> prototyped (flag) Yes
> low_pc (addr) 0xffffffff81480e80
> high_pc (addr) 0xffffffff81480f44
> frame_base (data4) location list [e061d3]
> sibling (ref4) [43d7492]
> [43d730b] formal_parameter
> name (strp) "sdev"
> decl_file (data1) 1
> decl_line (data2) 318
> type (ref4) [43d22ff]
> location (data4) location list [e06233]
...
> Josh, when you say "DWARF dump", do you mean the output of eu-readelf
> as I did above?
Yep, that's great. Next, can you try --debug-dump=loc and look up the
location list at [e06233] for sdev? That will hopefully reveal why sdev
is not available. On my Fedora 18 kernel, I get:
> [3b28227] subprogram
> external (flag_present) Yes
> name (strp) "scsi_device_unbusy"
> decl_file (data1) 1
> decl_line (data2) 323
> prototyped (flag_present) Yes
> low_pc (addr) 0xffffffff81420310
> high_pc (addr) 0xffffffff814203d4
> frame_base (exprloc)
> [ 0] call_frame_cfa
> GNU_all_call_sites (flag_present) Yes
> sibling (ref4) [3b28405]
> [3b28245] formal_parameter
> name (strp) "sdev"
> decl_file (data1) 1
> decl_line (data2) 323
> type (ref4) [3b22271]
> location (sec_offset) location list [d6aa75]
...
> [d6aa75] 0xffffffff81420315..0xffffffff8142033f [ 0] reg5
> 0xffffffff8142033f..0xffffffff814203a6 [ 0] reg3
> 0xffffffff814203a6..0xffffffff814203b4 [ 0] GNU_entry_value:
> [ 0] reg5
> [ 3] stack_value
> 0xffffffff814203b4..0xffffffff814203d4 [ 0] reg3
You can see that my function starts at 420310, yet the location of sdev
is first defined at 420315. The gap is the 5-byte __fentry__ call at the
very start of the function, which also shows up in objdump -d:
> ffffffff81420310 <scsi_device_unbusy>:
> ffffffff81420310: e8 eb 92 24 00 callq ffffffff81669600 <__fentry__>
> ffffffff81420315: 55 push %rbp
> ffffffff81420316: 48 89 e5 mov %rsp,%rbp
> ffffffff81420319: 48 83 ec 20 sub $0x20,%rsp
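To make that gap concrete, here is a minimal sketch of the arithmetic in Python. The addresses are copied from the eu-readelf output quoted above. The helper names (location_at, uncovered_prefix) are made up for this illustration and are not part of SystemTap or elfutils.

```python
# Addresses copied from the eu-readelf output quoted above.
LOW_PC = 0xffffffff81420310

# (start, end, description) for each entry of location list [d6aa75].
# Ranges are treated as half-open: start is covered, end is not.
SDEV_LOCATIONS = [
    (0xffffffff81420315, 0xffffffff8142033f, "reg5"),
    (0xffffffff8142033f, 0xffffffff814203a6, "reg3"),
    (0xffffffff814203a6, 0xffffffff814203b4, "GNU_entry_value(reg5), stack_value"),
    (0xffffffff814203b4, 0xffffffff814203d4, "reg3"),
]


def location_at(pc, locations):
    """Return the description of the range covering pc, or None if none does."""
    for start, end, desc in locations:
        if start <= pc < end:
            return desc
    return None


def uncovered_prefix(low_pc, locations):
    """Return how many bytes lie between low_pc and the first covered address."""
    first_start = min(start for start, _end, _desc in locations)
    return max(0, first_start - low_pc)


if __name__ == "__main__":
    # At the function entry itself, no range describes sdev.
    print(location_at(LOW_PC, SDEV_LOCATIONS))
    # Size of the hole in front of the first range.
    print(uncovered_prefix(LOW_PC, SDEV_LOCATIONS))
    # Just past the hole, sdev is described again.
    print(location_at(LOW_PC + 5, SDEV_LOCATIONS))
```

A probe placed exactly at low_pc therefore finds no location for sdev, while one placed 5 bytes later, just past the __fentry__ call, finds it in reg5.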
But in my case, the heuristic from stap commit 45b02a36 appears to be
working. If I set PR15123_DISABLE=1 in the environment, it fails for me
in the same way as it does for you. Perhaps you could step through
dwflpp::pr15123_retry_addr and see what's happening?
My best guess at this point is that the check for "-mfentry" in
DW_AT_producer is failing for your kernel. I found a Yocto commit that
forces gcc to use -grecord-gcc-switches, exactly for SystemTap's
benefit, but then I'm not sure how Fedora manages without that option.
http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-3.8/commit/?h=standard/base&id=d9a45e3325030f7bd6f37947a7a0b12da7f602c3
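For illustration only, the kind of producer check I have in mind can be sketched in Python as below. This is not SystemTap's actual code, and both producer strings are hypothetical examples rather than values read from a real vmlinux. The assumption is that with -grecord-gcc-switches gcc appends its code-generation options to DW_AT_producer, so a missing "-mfentry" there would make such a check fail even on a kernel that was built with it.

```python
def producer_mentions_fentry(producer):
    """Return True if the DW_AT_producer string lists -mfentry as an option."""
    return "-mfentry" in producer.split()


# Hypothetical producer strings, not taken from any real kernel build.
WITH_SWITCHES = "GNU C 4.7.2 -mfentry -O2 -g -pg"  # switches were recorded
WITHOUT_SWITCHES = "GNU C 4.7.2"                   # switches were not recorded

if __name__ == "__main__":
    print(producer_mentions_fentry(WITH_SWITCHES))
    print(producer_mentions_fentry(WITHOUT_SWITCHES))
```

You can compare this against your own kernel by looking at the producer attribute of a compile_unit entry in the same --debug-dump=info output.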