This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Bug runtime/17140] systemtap.examples/profiling/functioncallcount.stp causing kernel panic on s390x


https://sourceware.org/bugzilla/show_bug.cgi?id=17140

--- Comment #1 from David Smith <dsmith at redhat dot com> ---
OK, I've finally narrowed this one down a bit more. There are 2 problems here.

There are 2 functions that crash the kernel when a kprobe is placed on them,
without systemtap involved. I verified this using the scripts down in
src/scripts/kprobes_test. They are:

  set_pageblock_flags_group()
  lookup_page_cgroup()

I've filed bugzilla bugs on each of those:

  BZ1123425 - kprobe on set_pageblock_flags_group() causes kernel panic on s390x
  BZ1123429 - kprobe on lookup_page_cgroup() causes kernel panic on s390x
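
One way to narrow a crashing probe list down to individual functions is to bisect it. The following is only a minimal sketch of that idea, not a description of what the scripts in src/scripts/kprobes_test actually do. The `crashes` callback is hypothetical: it stands for "probe exactly these functions on a disposable guest and report whether the kernel panicked". The sketch also assumes each bad function crashes on its own, independent of which other functions are probed.

```python
def find_culprits(candidates, crashes):
    """Return every function in `candidates` that crashes when probed.

    `crashes(subset)` is a hypothetical callback that returns True if
    probing all functions in `subset` at once panics the kernel.
    Assumes each culprit crashes on its own (no interactions).
    """
    # An empty or crash-free subset contains no culprits.
    if not candidates or not crashes(candidates):
        return []
    # A crashing subset of size one is a culprit.
    if len(candidates) == 1:
        return list(candidates)
    # Otherwise split the list in half and examine each half separately.
    mid = len(candidates) // 2
    return (find_culprits(candidates[:mid], crashes)
            + find_culprits(candidates[mid:], crashes))
```

Each call to `crashes` costs a reboot when it returns True, so in practice the callback would have to record its progress somewhere that survives the panic.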

We will probably need to add those functions to the blacklist. With those 2
functions removed from the list produced by:

  stap -l 'kernel.function("*@mm/*.c").call'

I still see a crash. So, I modified the scripts in src/scripts/kprobes_test to
build systemtap modules instead of straight kernel modules. After running that,
it appears that the following function is the culprit:

  free_pages()
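
The filtering step described above (dropping the known-bad functions from the `stap -l` list before probing the rest) can be sketched as follows. This is a hedged illustration only: it assumes each line of `stap -l` output has the shape `kernel.function("NAME@FILE:LINE").call`, and the helper names are made up for this example.

```python
import re

# Functions reported in this comment as crashing s390x when kprobed.
KNOWN_BAD = {"set_pageblock_flags_group", "lookup_page_cgroup"}

# Assumed shape of one `stap -l` output line:
#   kernel.function("NAME@FILE:LINE").call
PROBE_RE = re.compile(r'kernel\.function\("([^@"]+)')


def function_name(probe_point):
    """Extract NAME from a probe point string, or None if it does not match."""
    match = PROBE_RE.search(probe_point)
    return match.group(1) if match else None


def drop_known_bad(probe_points, bad=KNOWN_BAD):
    """Return the probe points whose function name is not in `bad`.

    Lines that do not match the assumed format are kept unchanged.
    """
    return [p for p in probe_points if function_name(p) not in bad]
```

The surviving lines could then be written to a file and used as the probe list for the next test run.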

Here's the crash you get when probing free_pages():

====
[ 6071.705497] Kernel BUG at 00000000002118b6 [verbose debug info unavailable]
[ 6071.705535] specification exception: 0006 [#1] SMP
[ 6071.705537] Modules linked in: probe_module(OF) tun ext4 mbcache jbd2 loop sg qeth_l2 vmur nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c dasd_fba_mod lcs ctcm fsm dasd_eckd_mod qeth qdio dasd_mod ccwgroup dm_mirror dm_region_hash dm_log dm_mod [last unloaded: probe_module]
[ 6071.705564] CPU: 0 PID: 34156 Comm: basename Tainted: GF          O--------------   3.10.0-123.el7.s390x #1
[ 6071.705568] task: 000000007c87daa0 ti: 0000000068d50000 task.ti: 0000000068d50000
[ 6071.705571] Krnl PSW : 0704e00180000000 00000000002118b6 (__free_pages+0x36/0x90)
[ 6071.705580]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000001 0000000000000001 000000000acfcecc 000000000acfcecd
[ 6071.705588]            000003ff7fffffff 0000000000000000 0000000080000000 000000003fda0008
[ 6071.705592]            0000000068d53e00 00000000ae64dfff 00000000ae64e000 000000001d1e9738
[ 6071.706153] ------------[ cut here ]------------
[ 6071.706154] Kernel BUG at 00000000002118b6 [verbose debug info unavailable]
[ 6071.708530]            0000000000000002 0000000000747a01 0000000068d53c30 0000000068d53c08
[ 6071.708543] Krnl Code: 00000000002118ac: d01c18231b21        trtr    2083(29,%r1),2849(%r1)
          #00000000002118b2: ba32d01c           cs      %r3,%r2,28(%r13)
          >00000000002118b6: a744fffc           brc     4,2118ae
           00000000002118ba: ec260010007e       cij     %r2,0,6,2118da
           00000000002118c0: b904002d           lgr     %r2,%r13
           00000000002118c4: ecc80012007c       cgij    %r12,0,8,2118e8
           00000000002118ca: b904003c           lgr     %r3,%r12
           00000000002118ce: c0e5ffffeef5       brasl   %r14,20f6b8
[ 6071.708595] Call Trace:
[ 6071.708599] ([<00000000ae64dfff>] 0xae64dfff)
[ 6071.708606]  [<000000000023316a>] free_pgd_range+0x40a/0x480
[ 6071.708613]  [<00000000002332ce>] free_pgtables+0xee/0x148
[ 6071.708619]  [<000000000023e84c>] exit_mmap+0x12c/0x1c8
01: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from CPU 01.
[ 6071.708656]  [<000000000012d8ae>] mmput+0x7e/0x138
[ 6071.708659]  [<000000000013723e>] do_exit+0x2be/0xa88
[ 6071.708663]  [<0000000000137abe>] do_group_exit+0x4e/0xe0
[ 6071.708679]  [<0000000000137b7a>] SyS_exit_group+0x2a/0x30
[ 6071.708682]  [<00000000005b1c1c>] sysc_tracego+0x14/0x1a
[ 6071.708687]  [<000003fffd624694>] 0x3fffd624694
[ 6071.708690] Last Breaking-Event-Address:
[ 6071.708692]  [<0000000000211920>] free_pages.part.49+0x10/0x18
[ 6071.708696]
[ 6071.708698] Kernel panic - not syncing: Fatal exception: panic_on_oops
[ 6071.708701] specification exception: 0006 [#2] SMP
[ 6071.708706] Modules linked in:
00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 0010DEEE
====

Here's the source to free_pages():

====
void free_pages(unsigned long addr, unsigned int order)
{
        if (addr != 0) {
                VM_BUG_ON(!virt_addr_valid((void *)addr));
                __free_pages(virt_to_page((void *)addr), order);
        }
}
====

So, I'm guessing we're hitting that VM_BUG_ON().


