Patched the kernel to the latest Fedora FC4:

    Linux localhost.localdomain 2.6.13-1.1532_FC4 #1 Thu Oct 20 01:30:08 EDT 2005 i686 athlon i386 GNU/Linux

This kernel should have support for multiple kprobes at an address (http://lwn.net/Articles/132787/), but running two copies of this SystemTap script oopses the system:

    global called
    probe kernel.function("__kmalloc") { called++; }

stap version:

    [jamesd@localhost ~]$ stap -V
    SystemTap translator/driver (version 0.4.2 built 2005-10-31)
    Copyright (C) 2005 Red Hat, Inc. and others
    This is free software; see the source for copying conditions.
Created attachment 745: Simple module to probe __kmalloc

I don't see the problem on my system (i386 SMP), using either stap or raw kprobes.

Take the attached probe2x.c, make a copy (call it probe1x.c), and in the copy change all "2x" to "1x". Compile both and insmod them both. Do stuff. rmmod both. Works for me.
Subject: Fwd: Multiple kprobes at an address, Doesn't work

Forwarded message from James Dickens, Nov 6, 2005 11:03 AM, replying to jkenisto's comment of 2005-11-05 (attachment 745, "Simple module to probe __kmalloc"):

Okay, I can't find the magic arguments to make the attached module compile at the command line. A friend and I both see this problem on our systems. What kernel version are you using? Which gcc version? I'm using Fedora FC4 with the latest elfutils and the latest kernel installed; the rest is bone stock.
I have reproduced this problem on the RHEL4U2 kernel (22.EL). The problem appears to be that __kmalloc is invoked from the registration function during the startup of a subsequent kprobe session. This trips the int3 placed by the first probe. Indirectly, this appears to lead to a panic caused by kprobe reentrancy. It would require analysis or experiments to determine whether the RCU lockless code fares any better. Unfortunately, one can't reasonably kludge around this defect by using the new translator blacklist to enumerate every area of the kernel that might be used during a registration.
Created attachment 751: partial panic screenshot
I don't see a crash or panic on either my i386 SMP box or my uniprocessor box running vmlinuz-2.6.13-1.1532_FC4. Could you please check whether the problem exists with this kernel? I will check with the RHEL4U2 kernel. -Prasanna
The conceptual problem remains, even if one happens to be unable to reproduce some particular test case. If any of the kernel services transitively involved in performing kprobe administration (registration, unregistration, probe triggering, etc.) can be probed by another kprobes/systemtap session, we get an instant reentrancy situation. The RCU kprobes may or may not handle this better, but ascertaining that needs analysis, not experimentation.
Frank's analysis in Comment #3 is correct. register_kprobe() grabs the kprobe_lock and then, if there's already a probe at that address, calls register_aggr_kprobe(), which may call __kmalloc() (via kcalloc()). This is OK because it's a GFP_ATOMIC allocation.

In pre-RCU versions of Kprobes, Kprobes runs handlers while holding the kprobe_lock. Thus, if there's a probe on __kmalloc(), we deadlock if register_aggr_kprobe() is called. So the failure we see here is due to the probe on __kmalloc() combined with registering two probes at the same address (ANY address).

This is not a problem in the RCU version of Kprobes (e.g., RHEL4 U3 in recent days), because Kprobes holds no locks while running handlers. So the specific problem in question is fixed, and it's safe to probe *alloc().

If RH is concerned about the general problem of Kprobes behaving badly when you ask it to probe itself (which the Kprobes documentation specifically advises against), then the first step would be to pick up Prasanna's __kprobes-declaration patches from the mainline kernel.
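The locking argument above can be illustrated with a small user-space model. This is only a sketch and not kernel code: the struct, the function names (model_register, model_kmalloc, model_hit_probe) and the flag-based lock are all hypothetical stand-ins, and the "lock" reports a would-be deadlock instead of spinning. It assumes just what the comment states: registration takes the lock, a second probe at an already-probed address allocates memory, and pre-RCU handlers run with the same lock held.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for kprobe_lock: non-recursive, so a second acquire on the
 * same path would spin forever in a real kernel. Here it just fails. */
struct model_lock { bool held; };

static bool try_acquire(struct model_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void release(struct model_lock *l)
{
    assert(l->held);            /* releasing an unheld lock is a model bug */
    l->held = false;
}

enum outcome { REGISTERED, WOULD_DEADLOCK };

/* Hypothetical model state; none of these names exist in the kernel. */
struct kprobes_model {
    struct model_lock kprobe_lock;
    bool handlers_hold_lock;    /* true: pre-RCU behaviour, false: RCU */
    bool probe_on_kmalloc;      /* is a probe installed on __kmalloc()? */
    int handler_hits;           /* how many times a handler ran */
};

/* Models the breakpoint path that runs when a probed function is entered. */
static bool model_hit_probe(struct kprobes_model *m)
{
    if (m->handlers_hold_lock) {
        if (!try_acquire(&m->kprobe_lock))
            return false;       /* lock already held by the registration path */
        m->handler_hits++;
        release(&m->kprobe_lock);
    } else {
        m->handler_hits++;      /* RCU model: the handler takes no lock */
    }
    return true;
}

/* Models __kmalloc(): entering it fires the probe if one is installed. */
static bool model_kmalloc(struct kprobes_model *m)
{
    return m->probe_on_kmalloc ? model_hit_probe(m) : true;
}

/* Models register_kprobe(): take the lock, and allocate (as the aggregate
 * path does) only when the target address already carries a probe. */
static enum outcome model_register(struct kprobes_model *m,
                                   bool address_already_probed)
{
    enum outcome result = REGISTERED;

    if (!try_acquire(&m->kprobe_lock))
        return WOULD_DEADLOCK;
    if (address_already_probed && !model_kmalloc(m))
        result = WOULD_DEADLOCK;
    release(&m->kprobe_lock);
    return result;
}
```

In this model, a second registration with handlers_hold_lock and probe_on_kmalloc both set returns WOULD_DEADLOCK, while the same call with handlers_hold_lock cleared returns REGISTERED after one handler hit. The first registration at an address never allocates, which matches the observation that a single copy of the script runs fine.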
Would it be possible to have you (RCU) folks write up a systemtap test case that attempts to break on several worst-case probe points that are transitively reachable from kprobes registration/execution/unregistration routines? Something more aggressive than just __kmalloc? While this test would crash at the moment, it would be a good stress test for the new RCU baseline.
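One possible way to choose candidate probe points for such a test is to walk a call graph outward from the registration entry point. The sketch below is a user-space illustration only. The edge table is hand-written and deliberately tiny: the single chain register_kprobe -> register_aggr_kprobe -> kcalloc -> __kmalloc is the one described in this thread, and a real list would have to be derived from the kernel source. The helper names (is_reachable, reaches_from) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct call_edge {
    const char *caller;
    const char *callee;
};

/* Hand-written call graph. Only the chain mentioned in this thread is
 * listed; any further edges would be assumptions to verify in the source. */
static const struct call_edge call_graph[] = {
    { "register_kprobe",      "register_aggr_kprobe" },
    { "register_aggr_kprobe", "kcalloc" },
    { "kcalloc",              "__kmalloc" },
};

#define N_EDGES (sizeof(call_graph) / sizeof(call_graph[0]))

/* Depth-first walk. Each edge is followed at most once, so the search
 * terminates even if the table contains cycles. */
static bool reaches_from(const char *from, const char *target, bool *edge_seen)
{
    for (size_t i = 0; i < N_EDGES; i++) {
        if (edge_seen[i] || strcmp(call_graph[i].caller, from) != 0)
            continue;
        edge_seen[i] = true;
        if (strcmp(call_graph[i].callee, target) == 0 ||
            reaches_from(call_graph[i].callee, target, edge_seen))
            return true;
    }
    return false;
}

/* True if `target` is transitively callable from `root` in the table. */
static bool is_reachable(const char *root, const char *target)
{
    bool edge_seen[N_EDGES] = { false };

    assert(root != NULL && target != NULL);
    return reaches_from(root, target, edge_seen);
}
```

Every function for which is_reachable("register_kprobe", name) holds is a candidate probe point for the stress test. The same walk could be repeated from the unregistration and handler entry points once their call chains are known.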
This problem no longer exists with RCU. Regression test http://sources.redhat.com/cgi-bin/cvsweb.cgi/tests/kernel/kzalloc_crash_bz1813/?cvsroot=systemtap confirms this. We use kzalloc now and hence the probes are on __kzalloc. Changing the probe point back to __kmalloc doesn't produce any failures either.