This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
[Bug runtime/17140] systemtap.examples/profiling/functioncallcount.stp causing kernel panic on s390x
- From: "dsmith at redhat dot com" <sourceware-bugzilla at sourceware dot org>
- To: systemtap at sourceware dot org
- Date: Wed, 30 Jul 2014 19:17:01 +0000
- Subject: [Bug runtime/17140] systemtap.examples/profiling/functioncallcount.stp causing kernel panic on s390x
- Auto-submitted: auto-generated
- References: <bug-17140-6586 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=17140
--- Comment #1 from David Smith <dsmith at redhat dot com> ---
OK, I've finally narrowed this one down a bit more. There are 2 problems here.
First, there are 2 functions that crash the kernel when a kprobe is placed on
them, without systemtap involved. I verified this using the scripts down in
src/scripts/kprobes_test. They are:
set_pageblock_flags_group()
lookup_page_cgroup()
I've filed bugzilla bugs on each of those:
BZ1123425 - kprobe on set_pageblock_flags_group() causes kernel panic on s390x
BZ1123429 - kprobe on lookup_page_cgroup() causes kernel panic on s390x
We will probably need to add those functions to the blacklist.
Second, even with those 2 functions removed from the list produced by:
stap -l 'kernel.function("*@mm/*.c").call'
I still see a crash. So, I modified the scripts in src/scripts/kprobes_test to
build systemtap modules instead of straight kernel modules. After running that,
it appears that the following function is the culprit:
free_pages()
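The filtering step mentioned above (dropping the two known-bad functions from the `stap -l` listing before probing the rest) could be sketched roughly as follows. This is only an illustration, not the code in src/scripts/kprobes_test: the helper names `probe_function_name()` and `should_skip()` are hypothetical, and the sketch assumes each listing line looks like `kernel.function("NAME@FILE:LINE").call`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Functions reported above as crashing when a kprobe is placed on them. */
const char *const skip_list[] = {
    "set_pageblock_flags_group",
    "lookup_page_cgroup",
};

/*
 * Copy the function name out of one listing line.
 * Assumed line shape: kernel.function("NAME@FILE:LINE").call
 * Returns 0 on success, -1 if the line does not look like a probe point
 * or the name does not fit in the output buffer.
 */
int probe_function_name(const char *line, char *out, size_t outlen)
{
    const char *prefix = "function(\"";
    const char *start;
    size_t len;

    assert(line != NULL && out != NULL && outlen > 0);

    start = strstr(line, prefix);
    if (start == NULL)
        return -1;
    start += strlen(prefix);

    /* The name ends at '@' (source file follows) or at the closing quote. */
    len = strcspn(start, "@\"");
    if (len == 0 || start[len] == '\0' || len >= outlen)
        return -1;

    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}

/* Returns 1 if the line names a function on the skip list, else 0. */
int should_skip(const char *line)
{
    char name[128];
    size_t i;

    if (probe_function_name(line, name, sizeof name) != 0)
        return 0;   /* unparsable lines are left for the caller to handle */

    for (i = 0; i < sizeof skip_list / sizeof skip_list[0]; i++) {
        if (strcmp(name, skip_list[i]) == 0)
            return 1;
    }
    return 0;
}
```

A small driver would read the listing line by line (for example with fgets()) and pass along only the lines for which should_skip() returns 0. The comparison is an exact match on the function name, so a function that merely shares a prefix with a skipped one is still probed.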
Here's the crash you get when probing free_pages():
====
[ 6071.705497] Kernel BUG at 00000000002118b6 [verbose debug info unavailable]
[ 6071.705535] specification exception: 0006 [#1] SMP
[ 6071.705537] Modules linked in: probe_module(OF) tun ext4 mbcache jbd2 loop sg qeth_l2 vmur nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c dasd_fba_mod lcs ctcm fsm dasd_eckd_mod qeth qdio dasd_mod ccwgroup dm_mirror dm_region_hash dm_log dm_mod [last unloaded: probe_module]
[ 6071.705564] CPU: 0 PID: 34156 Comm: basename Tainted: GF O-------------- 3.10.0-123.el7.s390x #1
[ 6071.705568] task: 000000007c87daa0 ti: 0000000068d50000 task.ti: 0000000068d50000
[ 6071.705571] Krnl PSW : 0704e00180000000 00000000002118b6 (__free_pages+0x36/0x90)
[ 6071.705580] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000001 0000000000000001 000000000acfcecc 000000000acfcecd
[ 6071.705588] 000003ff7fffffff 0000000000000000 0000000080000000 000000003fda0008
[ 6071.705592] 0000000068d53e00 00000000ae64dfff 00000000ae64e000 000000001d1e9738
[ 6071.706153] ------------[ cut here ]------------
[ 6071.706154] Kernel BUG at 00000000002118b6 [verbose debug info unavailable]
[ 6071.708530] 0000000000000002 0000000000747a01 0000000068d53c30 0000000068d53c08
[ 6071.708543] Krnl Code: 00000000002118ac: d01c18231b21 trtr 2083(29,%r1),2849(%r1)
#00000000002118b2: ba32d01c cs %r3,%r2,28(%r13)
>00000000002118b6: a744fffc brc 4,2118ae
00000000002118ba: ec260010007e cij %r2,0,6,2118da
00000000002118c0: b904002d lgr %r2,%r13
00000000002118c4: ecc80012007c cgij %r12,0,8,2118e8
00000000002118ca: b904003c lgr %r3,%r12
00000000002118ce: c0e5ffffeef5 brasl %r14,20f6b8
[ 6071.708595] Call Trace:
[ 6071.708599] ([<00000000ae64dfff>] 0xae64dfff)
[ 6071.708606] [<000000000023316a>] free_pgd_range+0x40a/0x480
[ 6071.708613] [<00000000002332ce>] free_pgtables+0xee/0x148
[ 6071.708619] [<000000000023e84c>] exit_mmap+0x12c/0x1c8
01: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from CPU 01.
[ 6071.708656] [<000000000012d8ae>] mmput+0x7e/0x138
[ 6071.708659] [<000000000013723e>] do_exit+0x2be/0xa88
[ 6071.708663] [<0000000000137abe>] do_group_exit+0x4e/0xe0
[ 6071.708679] [<0000000000137b7a>] SyS_exit_group+0x2a/0x30
[ 6071.708682] [<00000000005b1c1c>] sysc_tracego+0x14/0x1a
[ 6071.708687] [<000003fffd624694>] 0x3fffd624694
[ 6071.708690] Last Breaking-Event-Address:
[ 6071.708692] [<0000000000211920>] free_pages.part.49+0x10/0x18
[ 6071.708696]
[ 6071.708698] Kernel panic - not syncing: Fatal exception: panic_on_oops
[ 6071.708701] specification exception: 0006 [#2] SMP
[ 6071.708706] Modules linked in:
00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 0010DEEE
====
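As a reading aid for the trace above: entries such as `__free_pages+0x36/0x90` give a symbol, the offset into it, and the symbol's size, so the start of the function can be recovered by subtracting the offset from the reported address. A minimal sketch of that arithmetic follows; `struct sym_ref`, `parse_sym_ref()` and `function_start()` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* One "symbol+0xOFFSET/0xSIZE" reference as printed in the trace. */
struct sym_ref {
    char name[64];
    unsigned long offset;   /* bytes from the start of the function */
    unsigned long size;     /* total size of the function in bytes */
};

/* Parse text such as "__free_pages+0x36/0x90". Returns 0 on success, -1 otherwise. */
int parse_sym_ref(const char *text, struct sym_ref *out)
{
    int fields;

    assert(text != NULL && out != NULL);

    /* %63[^+] reads the symbol name up to (not including) the '+'. */
    fields = sscanf(text, "%63[^+]+0x%lx/0x%lx",
                    out->name, &out->offset, &out->size);
    return fields == 3 ? 0 : -1;
}

/* Address of the first instruction of the function containing addr. */
unsigned long function_start(unsigned long addr, const struct sym_ref *ref)
{
    return addr - ref->offset;
}
```

Applied to the PSW line, the address 0x2118b6 with offset 0x36 gives a function start of 0x211880 for __free_pages, and the size 0x90 puts its end at 0x211910.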
Here's the source to free_pages():
====
void free_pages(unsigned long addr, unsigned int order)
{
	if (addr != 0) {
		VM_BUG_ON(!virt_addr_valid((void *)addr));
		__free_pages(virt_to_page((void *)addr), order);
	}
}
====
So, I'm guessing we're hitting that VM_BUG_ON().