This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Bug runtime/22155] New: kernel panic due to NULL vma_cache_p->f_path.dentry


https://sourceware.org/bugzilla/show_bug.cgi?id=22155

            Bug ID: 22155
           Summary: kernel panic due to NULL vma_cache_p->f_path.dentry
           Product: systemtap
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: runtime
          Assignee: systemtap at sourceware dot org
          Reporter: penguin-kernel@i-love.sakura.ne.jp
  Target Milestone: ---

I encountered an oops with systemtap-3.1-3.el7.x86_64. The faulting location was

  if (path->dentry->d_op && path->dentry->d_op->d_dname &&

in d_path(), which means that path->dentry was NULL for some reason.

Looking at __stp_call_mmap_callbacks_for_task() (in runtime/linux/task_finder.c
or runtime/linux/task_finder2.c; which one is in use?), it counts the number of
entries in tsk->mm->mmap with vma->vm_file != NULL, allocates memory for that
many entries, and remembers the address of vma->vm_file->f_path like

  vma_cache_p->f_path = &(vma->vm_file->f_path);

with tsk->mm->mmap_sem held for read. The address of f_path (rather than
the values of f_path.dentry and f_path.mnt) is later passed to d_path()
after tsk->mm->mmap_sem has been released.

Is holding tsk->mm->mmap_sem for read sufficient to guarantee that

  // First find the number of file-based vmas.
  vma = mm->mmap;
  while (vma) {
    if (vma->vm_file)
      file_based_vmas++;
    vma = vma->vm_next;
  }

and

  vma = mm->mmap;
  vma_cache_p = vma_cache;
  while (vma) {
    if (vma->vm_file) {
      // Notice we're increasing the reference
      // count for 'f_path'.  This way it won't
      // get deleted from out under us.
      vma_cache_p->f_path = &(vma->vm_file->f_path);
      path_get(vma_cache_p->f_path);
      vma_cache_p->dentry = vma->vm_file->f_path.dentry;
      vma_cache_p->addr = vma->vm_start;
      vma_cache_p->length = vma->vm_end - vma->vm_start;
      vma_cache_p->offset = (vma->vm_pgoff << PAGE_SHIFT);
      vma_cache_p->vm_flags = vma->vm_flags;
      vma_cache_p++;
    }
    vma = vma->vm_next;
  }

fill all entries of the vma_cache array?
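
The count-then-fill pattern in the two loops above can be modeled in userspace. The following is only a sketch under assumptions: struct model_vma, struct model_entry, count_file_vmas(), fill_file_vmas() and demo_count_and_fill() are hypothetical names, not systemtap or kernel code. The fill pass takes an explicit capacity, so a mismatch between the two passes shows up as a short count instead of an overrun, whatever the answer to the locking question turns out to be.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the fields of vm_area_struct used here. */
struct model_vma {
    const char *vm_file;            /* NULL means "not file-backed" */
    unsigned long vm_start, vm_end;
    struct model_vma *vm_next;
};

/* Hypothetical stand-in for one cache entry. */
struct model_entry {
    unsigned long addr, length;
};

/* Pass 1: count the file-backed entries in the list. */
size_t count_file_vmas(const struct model_vma *vma)
{
    size_t n = 0;
    for (; vma; vma = vma->vm_next)
        if (vma->vm_file)
            n++;
    return n;
}

/* Pass 2: fill at most 'capacity' entries and report how many were filled. */
size_t fill_file_vmas(const struct model_vma *vma,
                      struct model_entry *out, size_t capacity)
{
    size_t used = 0;
    for (; vma && used < capacity; vma = vma->vm_next) {
        if (!vma->vm_file)
            continue;
        out[used].addr = vma->vm_start;
        out[used].length = vma->vm_end - vma->vm_start;
        used++;
    }
    return used;
}

/* Returns 1 when both passes agree on an unchanged three-element list. */
int demo_count_and_fill(void)
{
    struct model_vma c = { "libc.so", 0x3000, 0x5000, NULL };
    struct model_vma b = { NULL, 0x2000, 0x3000, &c };
    struct model_vma a = { "java", 0x1000, 0x2000, &b };
    struct model_entry cache[2];

    size_t counted = count_file_vmas(&a);
    assert(counted == 2);

    size_t filled = fill_file_vmas(&a, cache, counted);
    assert(cache[1].length == 0x2000);
    return filled == counted;
}
```

In this model the list cannot change between the passes, so the counts always agree. A caller that compares the return value of fill_file_vmas() against the earlier count would notice if they ever did not.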

Assuming the above is correct: although we took references on
f_path.dentry and f_path.mnt via path_get(), what guarantees that
vma->vm_file->f_path is still valid (i.e. that f_path.dentry and f_path.mnt
do not change) between the time tsk->mm->mmap_sem is released and the time
d_path() is called? Isn't there a race window in which the memory pointed to
by vma->vm_file changes (i.e. vma->vm_file->f_path == { garbage, garbage })?

I think "struct vma_cache_t" needs to store a "struct path" by value rather
than a "struct path *".
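
To illustrate the difference between the two layouts, here is a minimal userspace model. Everything in it is an assumption made for illustration: the struct definitions are simplified stand-ins rather than the kernel's, model_path_get()/model_path_put() only imitate reference counting, and model_file_release() simply assumes that the fields of the file's f_path are cleared once the file goes away. It is not a patch for task_finder.c.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (not the real definitions). */
struct dentry   { int refcount; };
struct vfsmount { int refcount; };
struct path     { struct vfsmount *mnt; struct dentry *dentry; };
struct file     { struct path f_path; };

/* Imitates path_get()/path_put(): pin or unpin both objects. */
static void model_path_get(const struct path *p)
{
    p->mnt->refcount++;
    p->dentry->refcount++;
}

static void model_path_put(const struct path *p)
{
    p->mnt->refcount--;
    p->dentry->refcount--;
}

/* Layout as reported: the cache remembers where the path lives. */
struct vma_cache_by_ptr   { struct path *f_path; };
/* Suggested layout: the cache owns a copy of the path. */
struct vma_cache_by_value { struct path f_path; };

/* Assumption: releasing the file clears the fields inside it. */
static void model_file_release(struct file *f)
{
    f->f_path.mnt = NULL;
    f->f_path.dentry = NULL;
}

/* Returns 1 if the by-pointer cache sees a NULL dentry after the release
 * while the by-value cache still holds the pinned mnt and dentry. */
int demo_stale_pointer(void)
{
    struct dentry d = { 1 };
    struct vfsmount m = { 1 };
    struct file f = { { &m, &d } };

    struct vma_cache_by_ptr by_ptr = { &f.f_path };
    model_path_get(by_ptr.f_path);

    struct vma_cache_by_value by_val = { f.f_path };
    model_path_get(&by_val.f_path);

    /* Stands for whatever happens to the file after the lock is dropped. */
    model_file_release(&f);

    int ptr_sees_null = (by_ptr.f_path->dentry == NULL);
    int val_still_valid = (by_val.f_path.dentry == &d &&
                           by_val.f_path.mnt == &m);

    /* Only the by-value cache can still drop its references. */
    model_path_put(&by_val.f_path);
    return ptr_sees_null && val_still_valid;
}
```

In this model the references taken through the pointer keep the dentry and the mount alive, but not the struct file that contains the pointers to them, so the by-pointer cache reads NULL afterwards and can no longer drop its references.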

----------
[    0.000000] Linux version 3.10.0-514.26.2.el7.x86_64
(builder@kbuilder.dev.centos.org) (gcc version 4.8.5 20150623 (Red Hat
4.8.5-11) (GCC) ) #1 SMP Tue Jul 4 15:04:05 UTC 2017
(...snipped...)
[ 2670.046497] Kprobes globally unoptimized
[ 2670.047387] stap_c80c762b8a8bcff4b69879d41fae40ef_1_2980: loading
out-of-tree module taints kernel.
[ 2670.047830] stap_c80c762b8a8bcff4b69879d41fae40ef_1_2980: module
verification failed: signature and/or required key missing - tainting kernel
[ 2670.135913] stap_c80c762b8a8bcff4b69879d41fae40ef_1_2980: systemtap:
3.1/0.166, base: ffffffffa03eb000, memory:
214data/116text/241ctx/2063net/33348alloc kb, probes: 48
[ 3726.377000] stap_c80c762b8a8bcff4b69879d41fae40ef_1_4019: systemtap:
3.1/0.166, base: ffffffffa043f000, memory:
214data/116text/241ctx/2063net/33348alloc kb, probes: 48
[10281.381269] stap_c80c762b8a8bcff4b69879d41fae40ef_1_6247: systemtap:
3.1/0.166, base: ffffffffa03e6000, memory:
214data/116text/241ctx/2063net/33348alloc kb, probes: 48
[15632.200303] stap_c80c762b8a8bcff4b69879d41fae40ef_1_8026: systemtap:
3.1/0.166, base: ffffffffa043a000, memory:
214data/116text/241ctx/2063net/33348alloc kb, probes: 48
[16109.057836] stap_c80c762b8a8bcff4b69879d41fae40ef_1_8883: systemtap:
3.1/0.166, base: ffffffffa03e6000, memory:
214data/116text/241ctx/2063net/33348alloc kb, probes: 48
[16578.618229] stap_c80c762b8a8bcff4b69879d41fae40ef_1_9881: systemtap:
3.1/0.166, base: ffffffffa043a000, memory:
214data/116text/241ctx/2063net/33348alloc kb, probes: 48
[81608.487805] BUG: unable to handle kernel NULL pointer dereference at
0000000000000060
[81608.487995] IP: [<ffffffff812168f8>] d_path+0x38/0x170
[81608.488145] PGD 5e1fd067 PUD 5deb7067 PMD 0 
[81608.488220] Oops: 0000 [#1] SMP 
[81608.488276] Modules linked in:
stap_c80c762b8a8bcff4b69879d41fae40ef_1_9881(OE) vmw_vsock_vmci_transport vsock
intel_powerclamp coretemp iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel
lrw gf128mul glue_helper ablk_helper cryptd ppdev vmw_balloon sg pcspkr shpchp
parport_pc parport i2c_piix4 vmw_vmci ip_tables xfs libcrc32c sr_mod cdrom
sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi crct10dif_pclmul
crct10dif_common vmwgfx drm_kms_helper crc32c_intel syscopyarea sysfillrect
sysimgblt fb_sys_fops ttm serio_raw mptspi drm scsi_transport_spi mptscsih
vmxnet3 mptbase ata_piix libata i2c_core floppy fjes dm_mirror dm_region_hash
dm_log dm_mod [last unloaded: stap_c80c762b8a8bcff4b69879d41fae40ef_1_8883]
[81608.491292] CPU: 0 PID: 25868 Comm: java Tainted: G           OE 
------------   3.10.0-514.26.2.el7.x86_64 #1
[81608.494534] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
Desktop Reference Platform, BIOS 6.00 04/14/2014
[81608.497591] task: ffff880137320000 ti: ffff880137da0000 task.ti:
ffff880137da0000
[81608.499174] RIP: 0010:[<ffffffff812168f8>]  [<ffffffff812168f8>]
d_path+0x38/0x170
[81608.500896] RSP: 0018:ffff880137da3d90  EFLAGS: 00010246
[81608.502863] RAX: ffff880137e0f000 RBX: ffff8800b709e310 RCX:
0000000000000000
[81608.504573] RDX: 0000000000001000 RSI: ffff880137e0e000 RDI:
0000000000000000
[81608.506223] RBP: ffff880137da3dd0 R08: 00007f475ad31000 R09:
0000000000001000
[81608.508052] R10: 00000000000000e4 R11: ffffea0004df8200 R12:
ffff880137e0e000
[81608.510484] R13: ffff88013722a010 R14: ffff880137e0efab R15:
ffff880137228c60
[81608.513263] FS:  00007f471583f700(0000) GS:ffff88013fc00000(0000)
knlGS:0000000000000000
[81608.516228] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[81608.519215] CR2: 0000000000000060 CR3: 000000006e07a000 CR4:
00000000000407f0
[81608.522110] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[81608.525089] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[81608.527954] Stack:
[81608.530743]  0000100008100073 ffff880137e0f000 00007f475ad31000
ffff880137320000
[81608.533765]  0000000052e3c232 ffff880137320000 ffff880137e0e000
ffff88013722a010
[81608.536778]  ffff880137da3e30 ffffffffa0453629 0000000000017000
0000000008100073
[81608.539867] Call Trace:
[81608.542637]  [<ffffffffa0453629>]
__stp_call_mmap_callbacks_for_task+0x169/0x240
[stap_c80c762b8a8bcff4b69879d41fae40ef_1_9881]
[81608.548067]  [<ffffffffa045385c>]
__stp_utrace_task_finder_target_quiesce+0x15c/0x2b0
[stap_c80c762b8a8bcff4b69879d41fae40ef_1_9881]
[81608.553490]  [<ffffffffa043ec7e>] start_callback.isra.48+0x7e/0x100
[stap_c80c762b8a8bcff4b69879d41fae40ef_1_9881]
[81608.559129]  [<ffffffffa0442a09>] utrace_resume+0x109/0x410
[stap_c80c762b8a8bcff4b69879d41fae40ef_1_9881]
[81608.564108]  [<ffffffff810ad1e7>] task_work_run+0xa7/0xe0
[81608.566703]  [<ffffffff8102ab22>] do_notify_resume+0x92/0xb0
[81608.569549]  [<ffffffff81697abd>] int_signal+0x12/0x17
[81608.571756] Code: 89 e5 41 55 41 54 53 48 89 fb 48 83 ec 28 48 8b 7f 08 48
89 44 24 08 89 54 24 04 65 48 8b 0c 25 28 00 00 00 48 89 4c 24 20 31 c9 <48> 8b
47 60 48 85 c0 74 2f 48 8b 40 40 48 85 c0 74 26 ff d0 48 
[81608.578430] RIP  [<ffffffff812168f8>] d_path+0x38/0x170
[81608.581039]  RSP <ffff880137da3d90>
[81608.583132] CR2: 0000000000000060
----------

----------
crash> bt -l
PID: 25868  TASK: ffff880137320000  CPU: 0   COMMAND: "java"
 #0 [ffff880137da3a20] machine_kexec at ffffffff81059beb
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/arch/x86/kernel/machine_kexec_64.c:
319
 #1 [ffff880137da3a80] __crash_kexec at ffffffff81105822
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/kernel/kexec.c:
1491
 #2 [ffff880137da3b50] crash_kexec at ffffffff81105910
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/arch/x86/include/asm/atomic.h:
38
 #3 [ffff880137da3b68] oops_end at ffffffff81690008
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/arch/x86/kernel/dumpstack.c:
225
 #4 [ffff880137da3b90] no_context at ffffffff8167fc96
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/arch/x86/mm/fault.c:
703
 #5 [ffff880137da3be0] __bad_area_nosemaphore at ffffffff8167fd2c
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/arch/x86/mm/fault.c:
782
 #6 [ffff880137da3c28] bad_area at ffffffff81680050
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/arch/x86/mm/fault.c:
811
 #7 [ffff880137da3c50] __do_page_fault at ffffffff81692f4f
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/arch/x86/mm/fault.c:
1164
 #8 [ffff880137da3cb0] do_page_fault at ffffffff81692ff5
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/arch/x86/mm/fault.c:
1237
 #9 [ffff880137da3ce0] page_fault at ffffffff8168f208
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/arch/x86/kernel/entry_64.S:
1316
    [exception RIP: d_path+56]
    RIP: ffffffff812168f8  RSP: ffff880137da3d90  RFLAGS: 00010246
    RAX: ffff880137e0f000  RBX: ffff8800b709e310  RCX: 0000000000000000
    RDX: 0000000000001000  RSI: ffff880137e0e000  RDI: 0000000000000000
    RBP: ffff880137da3dd0   R8: 00007f475ad31000   R9: 0000000000001000
    R10: 00000000000000e4  R11: ffffea0004df8200  R12: ffff880137e0e000
    R13: ffff88013722a010  R14: ffff880137e0efab  R15: ffff880137228c60
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/fs/dcache.c:
2940
#10 [ffff880137da3dd8] __stp_call_mmap_callbacks_for_task at ffffffffa0453629
[stap_c80c762b8a8bcff4b69879d41fae40ef_1_9881]
#11 [ffff880137da3e38] __stp_utrace_task_finder_target_quiesce at
ffffffffa045385c [stap_c80c762b8a8bcff4b69879d41fae40ef_1_9881]
#12 [ffff880137da3e70] start_callback at ffffffffa043ec7e
[stap_c80c762b8a8bcff4b69879d41fae40ef_1_9881]
#13 [ffff880137da3ea8] utrace_resume at ffffffffa0442a09
[stap_c80c762b8a8bcff4b69879d41fae40ef_1_9881]
#14 [ffff880137da3f00] task_work_run at ffffffff810ad1e7
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/kernel/task_work.c:
89
#15 [ffff880137da3f30] do_notify_resume at ffffffff8102ab22
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/include/linux/tracehook.h:
196
#16 [ffff880137da3f50] int_signal at ffffffff81697abd
   
/usr/src/debug/kernel-3.10.0-514.26.2.el7/linux-3.10.0-514.26.2.el7.x86_64/arch/x86/kernel/entry_64.S:
620
    RIP: 00007f47752fd3dc  RSP: 00007f471583c6b0  RFLAGS: 00000293
    RAX: 0000000000000000  RBX: 00007f47200069e0  RCX: ffffffffffffffff
    RDX: 00000000ffffd9a8  RSI: 0000000000002658  RDI: 00007f4772ba8541
    RBP: 00007f471583c6d0   R8: 0000000000000000   R9: 000000000000000c
    R10: 00007f475d33b18f  R11: 0000000000000293  R12: 00007f471c058b40
    R13: 00007f471c059f50  R14: 00007f471c0582e0  R15: 00007f471c05b7e0
    ORIG_RAX: 000000000000003a  CS: 0033  SS: 002b
----------

