This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Re: [PATCH -tip v4 0/6] kprobes: introduce NOKPROBE_SYMBOL() and fixes crash bugs


(2013/12/07 4:07), Frank Ch. Eigler wrote:
> Hi, Masami -
> 
> masami.hiramatsu.pt wrote:
> 
>> [...]
>>>> [...]  Then, I'd like to propose this new whitelist feature in
>>>> kprobe-tracer (not raw kprobe itself). And a sysctl knob for
>>>> disabling the whitelist.  That knob will be
>>>> /proc/sys/debug/kprobe-event-whitelist and disabling it will mark
>>>> kernel tainted so that we can check it from bug reports.
>>>
>>> How would one assemble a reliable whitelist, if we haven't fully
>>> characterized the problems that make the blacklist necessary?
>>
>> As I said, we can use function graph tracer's list as the whitelist,
>> since it doesn't include any functions invoked from ftrace's event
>> handler. (Note that I am not talking about SystemTap or other users here.)
>>
>> The whitelist is just for keeping people who only want to trace
>> their own subsystems (anything except ftrace) away from the
>> quantitative issue.
>> [...]
> 
> Would you plan to limit kprobes (or just the perf-probe frontend) to
> only function-entries also?

Exactly, yes :). Currently I have a patch for the kprobe-tracer
implementation (so it covers more than perf-probe, but it doesn't
limit kprobes itself).
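
As a rough illustration of the check described here, below is a minimal
userspace sketch in Python, not the actual patch. It assumes the whitelist
is an ftrace-style function list with one "name" or "name [module]" entry
per line (ftrace exposes such a list as available_filter_functions; that
the proposal would use exactly that file is an assumption). The names
parse_function_list, probe_allowed and whitelist_enabled are hypothetical.

```python
def parse_function_list(text):
    """Return the set of function names in an ftrace-style function list.

    Each non-empty line is assumed to be "name" or "name [module]".
    """
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def probe_allowed(symbol, offset, whitelist, whitelist_enabled=True):
    """Hypothetical check: allow only function-entry probes on listed symbols."""
    if not whitelist_enabled:
        # Knob switched off: no filtering at all. In the proposal this
        # would also mark the kernel tainted; that part is not modeled.
        return True
    return offset == 0 and symbol in whitelist
```

With a whitelist containing vfs_read, probe_allowed("vfs_read", 0, wl) is
true, while a probe at vfs_read+4 or on an unlisted symbol is rejected
unless the knob is disabled.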

>  If not, and if intra-function
> statement-granularity kprobes remain allowed within a
> function-granularity whitelist, then you might still have those
> "quantitative" problems.

Yes, but as far as I've tested, the performance overhead is not
high, especially when kprobes are put at the entry of those
functions, because of the ftrace-based optimization.

> Even worse, kprobes robustness problems can bite even with a small
> whitelist, unless you can test the countless subset selections
> cartesian-product the aggravating factors (like other tracing
> facilities being in use at the same time, limited memory, high irq
> rates, debugging sessions, architectures, whatever).

And also, what script will run on each probe, right? :)

>> [...]  For the long term solution, I think we can introduce some
>> kind of performance gatekeeper as systemtap does. Counting the
>> miss-hit rate per second and, if it goes over a threshold, disabling
>> the next miss-hit (or most miss-hit) probe (as the OOM killer does).
> 
> That would make sense, but again it would not help deal with kprobes
> robustness (in the kernel-crashing rather than kernel-slowdown sense).

Why would you think so? Is there any hidden path for calling the
kprobes mechanism? The kernel crash problem just comes from bugs, not
from the quantitative issue.
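
To make the gatekeeper idea quoted above a bit more concrete, here is a
minimal userspace model in Python, not kernel code. It reads "miss-hit" as
a probe hit that was missed, uses a fixed one-second window, and picks the
"most miss-hit" variant of the policy; the class and method names are
hypothetical and the threshold is a free parameter.

```python
from collections import Counter


class MissGatekeeper:
    """Userspace model of a per-second missed-hit gatekeeper (hypothetical)."""

    def __init__(self, threshold_per_sec):
        self.threshold = threshold_per_sec
        self.window_start = None   # start time of the current 1 s window
        self.misses = Counter()    # probe name -> misses in this window
        self.disabled = set()      # probes the gatekeeper has switched off

    def record_miss(self, probe, now):
        """Account one missed hit of `probe` at time `now` (in seconds)."""
        if probe in self.disabled:
            return
        if self.window_start is None or now - self.window_start >= 1.0:
            # New one-second window: forget the old counts.
            self.window_start = now
            self.misses.clear()
        self.misses[probe] += 1
        if sum(self.misses.values()) > self.threshold:
            # Over budget: disable the probe with the most misses in this
            # window, much as the OOM killer picks the largest consumer.
            victim, _ = self.misses.most_common(1)[0]
            self.disabled.add(victim)
            del self.misses[victim]
```

Such a policy only bounds the slowdown caused by too many probe hits; it
does nothing about crashes caused by bugs.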

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com


