This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug runtime/20735] New: "soft lockup" bug on RHEL7 ppc64


https://sourceware.org/bugzilla/show_bug.cgi?id=20735

            Bug ID: 20735
           Summary: "soft lockup" bug on RHEL7 ppc64
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: runtime
          Assignee: systemtap at sourceware dot org
          Reporter: dsmith at redhat dot com
  Target Milestone: ---

I'm seeing the following "soft lockup" bug fairly consistently when running the
testsuite in parallel mode on RHEL7 ppc64 (3.10.0-327.el7.ppc64.debug):

====
Oct 22 14:43:55 ibm-p8-01-lp7.lab.eng.rdu.redhat.com kernel: BUG: soft lockup -
CPU#11 stuck for 31s! [migration/11:94]
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: Modules linked in:
stap_0ea83aa31510554f18d088ec7736640f__5685(OE+)
stap_fae50114ebc4a4659eb3c14ebac75168_5108(OE+)
stap_16af15cbdec255c8ee6892d56fef69df_8_4819(OE)
stap_bcdff2ed83de7602b65808ca8e0bf37d__4756(OE) sg pseries_rng nfsd auth_rpcgss
nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif
crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi_tgt
dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
stap_a35e54245d1be418047481fe89e36ea1__1925]
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: irq event stamp: 121576
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: hardirqs last  enabled at (121575):
[<c000000000a1a148>] ._raw_spin_unlock_irqrestore+0x48/0xc0
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: hardirqs last disabled at (121576):
[<c00000000000290c>] decrementer_common+0x10c/0x180
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: softirqs last  enabled at (120802):
[<c0000000000e27d0>] .__do_softirq+0x220/0x5b0
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: softirqs last disabled at (120797):
[<c00000000002703c>] .call_do_softirq+0x14/0x24
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: CPU: 11 PID: 94 Comm: migration/11
Tainted: G           OE  ------------   3.10.0-327.el7.ppc64.debug #1
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: task: c0000005f4ec7880 ti:
c0000005f4fd8000 task.ti: c0000005f4fd8000
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: NIP: c000000000010c50 LR:
c000000000010c50 CTR: c000000000020940
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: REGS: c0000005f4fdb810 TRAP: 0901  
Tainted: G           OE  ------------    (3.10.0-327.el7.ppc64.debug)
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: MSR: 8000000100009032 <SF,EE,ME,IR,DR,RI>
 CR: 24000024  XER: 00000000
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: CFAR: c000000000020970 SOFTE: 1
                                                             GPR00:
c000000000a1a154 c0000005f4fdba90 c000000001a1c730 0000000000000900
                                                             GPR04:
0000000000000001 0000000000000001 0350280000000000 00000000040b4432
                                                             GPR08:
0000000000000000 000000000000119a 000fe0d0d288f434 d00000000d9fa1b0
                                                             GPR12:
0000000022000824 00003fff964c8da0
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: NIP [c000000000010c50]
.arch_local_irq_restore+0x100/0x150
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: LR [c000000000010c50]
.arch_local_irq_restore+0x100/0x150
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: Call Trace:
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbb00] [c000000000a1a154]
._raw_spin_unlock_irqrestore+0x54/0xc0
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbb80] [c0000000001d77ac]
.cpu_stop_should_run+0x5c/0x80
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbc00] [c0000000001313d4]
.smpboot_thread_fn+0x244/0x340
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbcb0] [c00000000011f468]
.kthread+0xe8/0xf0
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: [c0000005f4fdbe30] [c00000000000a47c]
.ret_from_kernel_thread+0x58/0xdc
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: Instruction dump:
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: e8690010 786307e2 3063ffff 7c631910
7863b7a6 78635624 38630e80 4bffff6c
Oct 22 14:43:55 ibm-p8-01-lp7 kernel: 7c0802a6 f8010010 f821ff91 4bff1769
<60000000> 38210070 e8010010 7c0803a6
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: INFO: task kworker/11:2:381 blocked for
more than 120 seconds.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: kworker/11:2    D 0000000000000000  8976 
 381      2 0x00000800
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Workqueue: events .module_refresher
[stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Call Trace:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb520] [d00000000dfb0000]
0xd00000000dfb0000 (unreliable)
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb6f0] [c00000000001835c]
.__switch_to+0x26c/0x4b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb7a0] [c000000000a164a4]
.__schedule+0x4e4/0xd10
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb8c0] [c000000000a18130]
.schedule_preempt_disabled+0x40/0xc0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeb940] [c000000000a11b70]
.mutex_lock_nested+0x240/0x5b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eeba50] [d00000000d924970]
.systemtap_module_refresh+0x160/0x650
[stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebb00] [c00000000011035c]
.process_one_work+0x25c/0x860
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebbe0] [c000000000110ad0]
.worker_thread+0x170/0x490
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebcb0] [c00000000011f468]
.kthread+0xe8/0xf0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005e9eebe30] [c00000000000a47c]
.ret_from_kernel_thread+0x58/0xdc
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: 4 locks held by kworker/11:2/381:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #0:  (events){++++.+}, at:
[<c0000000001102c8>] .process_one_work+0x1c8/0x860
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #1:  ((&module_refresher_work)){+.+...},
at: [<c0000000001102c8>] .process_one_work+0x1c8/0x860
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #2:  (module_refresh_mutex){+.+...}, at:
[<d00000000d92485c>] .systemtap_module_refresh+0x4c/0x650
[stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #3:  (&ut->inode_lock#2){+.+.+.}, at:
[<d00000000d924970>] .systemtap_module_refresh+0x160/0x650
[stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: INFO: task echo:5102 blocked for more
than 120 seconds.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: echo            D 00003fff93f43cc0  8240 
5102  19520 0x0000a082
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Call Trace:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3440] [d00000000dfb0000]
0xd00000000dfb0000 (unreliable)
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3610] [c00000000001835c]
.__switch_to+0x26c/0x4b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e36c0] [c000000000a164a4]
.__schedule+0x4e4/0xd10
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e37e0] [c000000000a18130]
.schedule_preempt_disabled+0x40/0xc0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3860] [c000000000a11b70]
.mutex_lock_nested+0x240/0x5b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3970] [d00000000d927fa8]
.stapiu_process_found+0x208/0x660
[stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3a30] [d00000000d9268a4]
.__stp_tf_quiesce_worker+0x154/0x230
[stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3ad0] [c000000000119fc4]
.task_work_run+0x104/0x160
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3b70] [c0000000000fc74c]
.get_signal_to_deliver+0xc9c/0xdd0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3c70] [c00000000001a234]
.do_signal+0x54/0x320
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3db0] [c00000000001a65c]
.do_notify_resume+0x8c/0x100
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005475e3e30] [c00000000000a7b0]
.ret_from_except_lite+0x5c/0x60
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: 1 lock held by echo/5102:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #0:  (&ut->inode_lock#2){+.+.+.}, at:
[<d00000000d927fa8>] .stapiu_process_found+0x208/0x660
[stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: INFO: task staprun:5108 blocked for more
than 120 seconds.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: staprun         D 00003fff8d298318  7824 
5108  29619 0x00008080
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: Call Trace:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb3c0] [d00000000de70000]
0xd00000000de70000 (unreliable)
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb590] [c00000000001835c]
.__switch_to+0x26c/0x4b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb640] [c000000000a164a4]
.__schedule+0x4e4/0xd10
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb760] [c000000000a18130]
.schedule_preempt_disabled+0x40/0xc0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb7e0] [c000000000a11b70]
.mutex_lock_nested+0x240/0x5b0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb8f0] [d00000000d92485c]
.systemtap_module_refresh+0x4c/0x650
[stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fb9a0] [d00000000d92521c]
._stp_module_notifier+0x6c/0x170 [stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fba30] [c000000000a20c18]
.notifier_call_chain+0xb8/0x200
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbae0] [c00000000012bffc]
.blocking_notifier_call_chain+0x7c/0x100
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbb70] [c0000000001b1f50]
.load_module+0x1840/0x20d0
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbd40] [c0000000001b28d0]
.SyS_init_module+0xf0/0x170
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: [c0000005809fbe30] [c00000000000a188]
system_call+0x38/0xb4
Oct 22 14:45:09 ibm-p8-01-lp7 kernel: 2 locks held by staprun/5108:
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #0: 
((module_notify_list).rwsem){++++.+}, at: [<c00000000012bfe0>]
.blocking_notifier_call_chain+0x60/0x100
Oct 22 14:45:09 ibm-p8-01-lp7 kernel:  #1:  (module_refresh_mutex){+.+...}, at:
[<d00000000d92485c>] .systemtap_module_refresh+0x4c/0x650
[stap_16af15cbdec255c8ee6892d56fef69df_8_4819]
Oct 22 14:45:29 ibm-p8-01-lp7 kernel: INFO: rcu_sched detected stalls on
CPUs/tasks: { 11} (detected by 1, t=6503 jiffies, g=7870, c=7869, q=0)
Oct 22 14:45:29 ibm-p8-01-lp7 kernel: Task dump for CPU 11:
Oct 22 14:45:29 ibm-p8-01-lp7 kernel: swapper/11      R  running task    10560 
   0      1 0x00000804
Oct 22 14:45:29 ibm-p8-01-lp7 kernel: Call Trace:
Oct 22 14:45:29 ibm-p8-01-lp7 kernel: [c0000005f4a539a0] [0000000000000058]
0x58 (unreliable)
Oct 22 14:46:55 ibm-p8-01-lp7 kernel: BUG: soft lockup - CPU#11 stuck for 65s!
[swapper/11:0]
Oct 22 14:46:55 ibm-p8-01-lp7 kernel: Modules linked in:
stap_011c8d5b3f3c577d0cebde949ae1e2fe_8_6219(OE+)
stap_0ea83aa31510554f18d088ec7736640f__5685(OE+)
stap_fae50114ebc4a4659eb3c14ebac75168_5108(OE+)
stap_16af15cbdec255c8ee6892d56fef69df_8_4819(OE)
stap_bcdff2ed83de7602b65808ca8e0bf37d__4756(OE) sg pseries_rng nfsd auth_rpcgss
nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif
crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi_tgt
dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
stap_a35e54245d1be418047481fe89e36ea1__1925]
=====

-- 
You are receiving this mail because:
You are the assignee for the bug.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]