This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: [BUG] syscall.unlink no longer works after upgrading kernel to 3.7.3


Hello,

On Tue, May 28, 2013 at 6:51 PM, Josh Stone <jistone@redhat.com> wrote:
> On 05/28/2013 02:35 PM, Zheng Da wrote:
>> Yes, it's my own script. Here is the code:
>> probe kernel.function("scsi_device_unbusy") {
>>     if ($sdev->host->host_no == 9 && $sdev->id == 1) {
>>         printf("sdev on node %d, host on node %d\n",
>> addr_to_node($sdev), addr_to_node($sdev->host));
>>         exit();
>>     }
>> }
>> The script works in Linux 3.2.12.
>
> Ok, this also works on 3.9.2-200.fc18.x86_64.  I don't hit that
> particular host_no+id on my machine, but it is hitting the probe.
>
>> systemtap actually can find the right location of scsi_device_unbusy,
>> but it doesn't show its parameters.
>> $ stap -L 'kernel.function("scsi_device_unbusy")'
>> kernel.function("scsi_device_unbusy@drivers/scsi/scsi_lib.c:318")
>>
>> I run eu-readelf -N --debug-dump=info
>> /usr/lib/debug/lib/modules/3.8.12/vmlinux and the info of
>> scsi_device_unbusy is shown below:
>>  [43d72e9]    subprogram
>>              external             (flag) Yes
>>              name                 (strp) "scsi_device_unbusy"
>>              decl_file            (data1) 1
>>              decl_line            (data2) 318
>>              prototyped           (flag) Yes
>>              low_pc               (addr) 0xffffffff81480e80
>>              high_pc              (addr) 0xffffffff81480f44
>>              frame_base           (data4) location list [e061d3]
>>              sibling              (ref4) [43d7492]
>>  [43d730b]      formal_parameter
>>                name                 (strp) "sdev"
>>                decl_file            (data1) 1
>>                decl_line            (data2) 318
>>                type                 (ref4) [43d22ff]
>>                location             (data4) location list [e06233]
> ...
>> Josh, when you say "DWARF dump", do you mean the output of eu-readelf
>> as I did above?
>
> Yep, that's great.  Next, can you try --debug-dump=loc and see the list
> at [e06233] for sdev?  This will hopefully reveal why it's not
> available.  On my Fedora 18 kernel, I get:
>
>>  [3b28227]    subprogram
>>              external             (flag_present) Yes
>>              name                 (strp) "scsi_device_unbusy"
>>              decl_file            (data1) 1
>>              decl_line            (data2) 323
>>              prototyped           (flag_present) Yes
>>              low_pc               (addr) 0xffffffff81420310
>>              high_pc              (addr) 0xffffffff814203d4
>>              frame_base           (exprloc)
>>               [   0] call_frame_cfa
>>              GNU_all_call_sites   (flag_present) Yes
>>              sibling              (ref4) [3b28405]
>>  [3b28245]      formal_parameter
>>                name                 (strp) "sdev"
>>                decl_file            (data1) 1
>>                decl_line            (data2) 323
>>                type                 (ref4) [3b22271]
>>                location             (sec_offset) location list [d6aa75]
> ...
>>  [d6aa75]  0xffffffff81420315..0xffffffff8142033f [   0] reg5
>>            0xffffffff8142033f..0xffffffff814203a6 [   0] reg3
>>            0xffffffff814203a6..0xffffffff814203b4 [   0] GNU_entry_value:
>>        [   0] reg5
>>                                                   [   3] stack_value
>>            0xffffffff814203b4..0xffffffff814203d4 [   0] reg3
>
> You can see that my function starts at 420310, yet sdev is first
> specified at 420315.  That's the 5-byte fentry call still padding it
> away from the start, also seen in objdump -d:
>
>> ffffffff81420310 <scsi_device_unbusy>:
>> ffffffff81420310:     e8 eb 92 24 00          callq  ffffffff81669600 <__fentry__>
>> ffffffff81420315:     55                      push   %rbp
>> ffffffff81420316:     48 89 e5                mov    %rsp,%rbp
>> ffffffff81420319:     48 83 ec 20             sub    $0x20,%rsp

I think I can also see the 5-byte difference here.

 [e061d3]  0xffffffff81480e80..0xffffffff81480e86 [   0] breg7 8
           0xffffffff81480e86..0xffffffff81480e89 [   0] breg7 16
           0xffffffff81480e89..0xffffffff81480f43 [   0] breg6 16
           0xffffffff81480f43..0xffffffff81480f44 [   0] breg7 8
 [e06233]  0xffffffff81480e85..0xffffffff81480eae [   0] reg5
           0xffffffff81480edf..0xffffffff81480f36 [   0] reg3
 [e06269]  0xffffffff81480ea3..0xffffffff81480eae [   0] breg5 0
           0xffffffff81480eae..0xffffffff81480eb2 [   0] breg3 0
           0xffffffff81480eb2..0xffffffff81480f3e [   0] reg13

Output of objdump -d:
ffffffff81480e80 <scsi_device_unbusy>:
ffffffff81480e80:       e8 7b 6c 22 00          callq  ffffffff816a7b00 <__fentry__>
ffffffff81480e85:       55                      push   %rbp
ffffffff81480e86:       48 89 e5                mov    %rsp,%rbp
ffffffff81480e89:       48 83 ec 20             sub    $0x20,%rsp
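
To make the gap concrete, here is a small Python sketch that replays the numbers from my dumps above. It only does arithmetic on the addresses copied from eu-readelf; the helper name location_at is made up for illustration and has nothing to do with how stap itself resolves locations.

```python
# Addresses copied from the eu-readelf output above (my 3.8.12 kernel).
low_pc = 0xffffffff81480e80          # DW_AT_low_pc of scsi_device_unbusy
sdev_ranges = [                      # location list [e06233] for "sdev"
    (0xffffffff81480e85, 0xffffffff81480eae, "reg5"),
    (0xffffffff81480edf, 0xffffffff81480f36, "reg3"),
]

def location_at(pc, ranges):
    """Return the location expression covering pc, or None if no range does.

    Hypothetical helper for illustration only, not part of systemtap.
    """
    for start, end, expr in ranges:
        if start <= pc < end:        # location-list ranges are half-open
            return expr
    return None

# Distance from the function entry to the first address where sdev is described.
gap = sdev_ranges[0][0] - low_pc
print("gap in bytes:", gap)
print("at entry:", location_at(low_pc, sdev_ranges))
print("after the fentry call:", location_at(low_pc + gap, sdev_ranges))
```

The gap matches the size of the callq __fentry__ instruction (e8 plus a 4-byte displacement), so a probe placed exactly at low_pc falls before the first range of the location list.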

>
> But in my case, the heuristic of stap 45b02a36 appears to be working.  I
> can also set environment PR15123_DISABLE=1, and it will fail the same as
> for you.  Perhaps you could step through dwflpp::pr15123_retry_addr, and
> see what's happening?
>
> My best guess at this point is the check for "-mfentry" in
> DW_AT_producer.  I found a Yocto commit where they forced gcc to have
> -grecord-gcc-switches, exactly for SystemTap's benefit, but then I'm not
> sure why Fedora is able to manage without that option.
>
> http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-3.8/commit/?h=standard/base&id=d9a45e3325030f7bd6f37947a7a0b12da7f602c3
>
I added lines to the systemtap source to print a message when
dwflpp::pr15123_retry_addr returns 0.
The problem is that the producer string returned by
dwarf_formstring(&cudie_producer) is "GNU C 4.6.3", with no "-mfentry"
in it. I guess that is what you meant.
Do you want me to rebuild the kernel with -grecord-gcc-switches?
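
As I understand it, the check comes down to looking for the flag in the producer string. A rough Python sketch of that idea follows; it is not the actual code in dwflpp, the helper name is hypothetical, and the second producer string is a made-up example of what a build with recorded switches might look like.

```python
def producer_mentions_fentry(producer):
    """Return True if the DW_AT_producer string lists -mfentry as a flag.

    Hypothetical helper: it only mirrors the idea of the check,
    not the real systemtap implementation.
    """
    return "-mfentry" in producer.split()

# Producer string reported for my kernel build.
mine = "GNU C 4.6.3"

# Made-up example of a producer string with the switches recorded.
with_switches = "GNU C 4.6.3 -mfentry -O2 -g"

print(producer_mentions_fentry(mine))
print(producer_mentions_fentry(with_switches))
```

With my producer string the check cannot succeed, which would explain why the retry heuristic gives up on my build.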

Thanks,
Da

