See bug #908 comment #7. The problem appears to be the runtime's eagerness to
wake up stpd regardless of context. It needs to protect itself against being
invoked from such places, for example by deferring such signalling until it's
safe to do so (say, using a background timer).
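To make the deferral idea concrete, here is a minimal userspace sketch, not the runtime's actual code. The names request_wakeup(), timer_tick(), wakeup_pending and wakeups_delivered are all hypothetical, and the counter merely stands in for the real wake-up of stpd. It assumes the timer callback runs in a context where waking the daemon is safe.

```c
#include <assert.h>      /* assert() is only needed when exercising the sketch */
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from the runtime. */
static atomic_bool wakeup_pending;   /* set by probes, cleared by the timer */
static int wakeups_delivered;        /* counts simulated wake-ups of stpd */

/* Called from probe context: only records that a wake-up is wanted. */
static void request_wakeup(void)
{
    atomic_store(&wakeup_pending, true);
}

/* Called from a periodic timer, assumed to run in a safe context. */
static void timer_tick(void)
{
    /* atomic_exchange clears the flag and reports whether it was set. */
    if (atomic_exchange(&wakeup_pending, false))
        wakeups_delivered++;         /* real code would wake the daemon here */
}
```

Several requests between two ticks collapse into a single wake-up, which is probably what we want here since the daemon drains whatever is available once it runs.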
Generally, you should review each kernel call made from within the runtime to
assess whether it is safe in every context or, if not, whether the unsafe
contexts can and should be detected.
Those functions that are only called during module init/exit need only a lesser
degree of care. But those callable from probe handlers need to be paranoid.
Looks nasty. From within __switch_to() we cannot do printk(), log() or
schedule anything to happen later, AFAICT. That's OK, because if we can detect
unsafe conditions, we can always put the data in a buffer and let the next I/O
trigger the wake_up().
What exactly should have been detected to determine that I/O was unsafe from
within this function? I am not sure.
I don't know if there is a runtime test for safety. The most pessimistic
approach is to always buffer, and use a background task of some sort to do all I/O.
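A rough userspace sketch of the "always buffer" approach follows. Everything in it is an assumption made for illustration: the names probe_log(), background_flush() and LOG_BUF_SIZE, the tiny buffer size, and the drop-on-overflow policy. It is single-threaded and ignores the synchronization a real probe/flusher pair would need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_BUF_SIZE 64              /* arbitrary size for the sketch */

static char   log_buf[LOG_BUF_SIZE]; /* data staged by probe handlers */
static size_t log_used;              /* bytes currently staged */
static size_t log_dropped;           /* bytes lost because the buffer was full */

/* Probe context: copy into the buffer, never perform I/O here. */
static void probe_log(const char *msg)
{
    size_t len = strlen(msg);
    if (len > LOG_BUF_SIZE - log_used) {
        log_dropped += len;          /* drop rather than block or sleep */
        return;
    }
    memcpy(log_buf + log_used, msg, len);
    log_used += len;
}

/* Background task: the only place that hands data to the I/O layer.
 * Copies up to out_size bytes into out and returns the number copied. */
static size_t background_flush(char *out, size_t out_size)
{
    size_t n = log_used < out_size ? log_used : out_size;
    assert(out != NULL);
    memcpy(out, log_buf, n);
    memmove(log_buf, log_buf + n, log_used - n); /* keep any unflushed tail */
    log_used -= n;
    return n;
}
```

The cost of this approach is latency and possible data loss under bursts, so the dropped-byte count would have to be reported somewhere.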
We might be able to do some kind of blacklisting with the assistance of the
translator. The embedded-C code can tell where the probe point was inserted. A
refined form would be able to test whether sensitive files like *sched.* were
involved in a kprobe. That could trigger special behavior.
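As an illustration of the file-based check, the sketch below matches a probe's source file against a pattern list using POSIX fnmatch(). The function name probe_is_sensitive() and the sensitive_files table are hypothetical, and only the *sched.* pattern comes from this discussion. In practice the translator would presumably make this decision at translation time rather than at run time.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table; "*sched.*" is the only pattern taken from the discussion. */
static const char *const sensitive_files[] = { "*sched.*" };

/* Returns true if a probe placed in src_file should get cautious handling. */
static bool probe_is_sensitive(const char *src_file)
{
    size_t i;
    assert(src_file != NULL);
    for (i = 0; i < sizeof sensitive_files / sizeof sensitive_files[0]; i++) {
        /* flags == 0: '*' may also match '/' in a path */
        if (fnmatch(sensitive_files[i], src_file, 0) == 0)
            return true;
    }
    return false;
}
```

With this pattern, "kernel/sched.c" and "include/linux/sched.h" would be flagged, while "kernel/fork.c" would not.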
This is being worked on.
The problem may not be limited to I/O. I believe the entire runtime needs to be
audited to enumerate and analyze **all** kernel functions used from within probe
context. Any of these that contain critical sections, or call sensitive
subsidiary functions, need to be avoided if at all possible. This avoidance can
include replication of kernel code if needed (since we will guarantee that no
introspective probe can be placed on a systemtap probe module).
*** Bug 1837 has been marked as a duplicate of this bug. ***
Another method of ameliorating the problem is consistent use of auxiliary worker
threads to carry out any kernel services that cannot be safely called and thus
need to be deferred. Signalling between the probes and these worker threads (or
just one) would of course have to be simple and not itself involve kernel services.
*** Bug 1919 has been marked as a duplicate of this bug. ***
*** Bug 1594 has been marked as a duplicate of this bug. ***
Checked in fix. Needs better testing, of course.
Please construct some tests that aim to stress this new code. For example, some
systemtap script to probe & log printk, hw interrupt handlers, and the like.