This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Re: [PATCH] Linux Kernel Markers

From: "S. P. Prasanna" <prasanna at in dot ibm dot com>
To: Martin Bligh <mbligh at google dot com>
Cc: Andrew Morton <akpm at osdl dot org>, "Frank Ch. Eigler" <fche at redhat dot com>, Ingo Molnar <mingo at elte dot hu>, Mathieu Desnoyers <mathieu dot desnoyers at polymtl dot ca>, Paul Mundt <lethal at linux-sh dot org>, linux-kernel <linux-kernel at vger dot kernel dot org>, Jes Sorensen <jes at sgi dot com>, Tom Zanussi <zanussi at us dot ibm dot com>, Richard J Moore <richardj_moore at uk dot ibm dot com>, Michel Dagenais <michel dot dagenais at polymtl dot ca>, Christoph Hellwig <hch at infradead dot org>, Greg Kroah-Hartman <gregkh at suse dot de>, Thomas Gleixner <tglx at linutronix dot de>, William Cohen <wcohen at redhat dot com>, ltt-dev at shafik dot org, systemtap at sources dot redhat dot com, Alan Cox <alan at lxorguk dot ukuu dot org dot uk>
Date: Tue, 19 Sep 2006 12:35:16 +0530
Subject: Re: [PATCH] Linux Kernel Markers
References: <20060918234502.GA197@Krystal> <20060919081124.GA30394@elte.hu> <451008AC.6030006@google.com> <20060919154612.GU3951@redhat.com> <4510151B.5070304@google.com> <20060919093935.4ddcefc3.akpm@osdl.org> <45101DBA.7000901@google.com> <20060919063821.GB23836@in.ibm.com> <45102641.7000101@google.com>
Reply-to: prasanna at in dot ibm dot com

On Tue, Sep 19, 2006 at 10:17:53AM -0700, Martin Bligh wrote:
> >>>>It seems like all we'd need to do
> >>>>is "list all references to function, freeze kernel, update all
> >>>>references, continue"
> >>>
> >>>
> >>>"overwrite first 5 bytes of old function with `jmp new_function'".
> >>
> >>Yes, that's simple, but slower, as you have a double jump. Probably
> >>a damned sight faster than int3 though.
> >
> >
> >The advantage of using int3 over jmp to launch the instrumented
> >module is that int3 (or breakpoint in most architectures) is an
> >atomic operation to insert.
> 
> Ah, good point. Though ... how much do we care what the speed of
> insertion/removal actually is? If we can tolerate it being slow,
> then just sync everyone up in an IPI to freeze them out whilst
> doing the insert.
> 
I guess using an IPI occasionally would be acceptable. But I think
using an IPI for each probe will add a lot of overhead.

> 
> Surely this still carries the overhead of doing the breakpoint,
> which was part of what we were trying to get away from? I suppose
> we get more flexibility this way. Or does the slowness not actually
> come from the int3, but only the single-stepping?
Yes, it comes from int3 as well.
> 
> How about we combine all three ideas together ...
> 
> 1. Load modified copy of the function in question.
> 2. overwrite the first instruction of the routine with an int3 that
> does what you say (atomically)
> 3. Then overwrite the second instruction with a jump that's faster
> 4. Now atomically overwrite the int3 with a nop, and let the jump
> take over.
> 

That's a good solution.
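
As a rough illustration of the four steps quoted above, here is a minimal
user-space sketch in C that applies the same write order to a plain byte
buffer instead of live kernel text. The opcodes (0xCC for int3, 0xE9 for
jmp rel32, 0x90 for nop) are the standard x86 encodings. The helper names
write_jmp and patch_entry, the use of buffer offsets in place of addresses,
and the assumption that the first instruction is exactly one byte long
(e.g. a push) are hypothetical choices made for illustration, not part of
the proposal. Step 1 (loading the modified copy) is represented only by the
target offset, and the breakpoint handler and other CPUs are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard x86 opcode bytes used by the sequence. */
enum { OP_INT3 = 0xCC, OP_NOP = 0x90, OP_JMP_REL32 = 0xE9 };

/*
 * Hypothetical helper: encode a 5-byte `jmp rel32` at code[at] whose
 * destination is the byte offset `target` (relative to code[0]).
 */
static void write_jmp(uint8_t *code, size_t at, size_t target)
{
    /* rel32 is measured from the end of the 5-byte jmp instruction. */
    int32_t rel = (int32_t)((int64_t)target - (int64_t)(at + 5));
    uint32_t bits = (uint32_t)rel;

    code[at] = OP_JMP_REL32;
    /* Store the displacement little-endian, as x86 expects. */
    for (size_t i = 0; i < 4; i++)
        code[at + 1 + i] = (uint8_t)(bits >> (8 * i));
}

/*
 * Hypothetical helper: apply steps 2-4 to a function entry held in
 * code[0..len). ASSUMPTION: the first instruction is one byte long, so
 * the "second instruction" starts at offset 1.
 * Returns 0 on success, -1 if the buffer cannot hold int3/nop + jmp.
 */
static int patch_entry(uint8_t *code, size_t len, size_t target)
{
    assert(code != NULL);
    if (len < 6)
        return -1;

    code[0] = OP_INT3;           /* step 2: single-byte breakpoint store */
    write_jmp(code, 1, target);  /* step 3: jump written behind the int3 */
    code[0] = OP_NOP;            /* step 4: int3 becomes nop, jmp takes over */
    return 0;
}
```

For example, calling patch_entry(code, 8, 0x106) on an 8-byte buffer leaves
byte 0 as 0x90 and bytes 1-5 as E9 00 01 00 00, since the displacement is
0x106 - 6 = 0x100. In a real kernel each store would also need the
cross-CPU synchronisation discussed earlier in the thread.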

Thanks
Prasanna

> >Adv:
> >Can be enabled/disabled dynamically by inserting/removing
> >breakpoints.  No overhead of single stepping.
> >No restriction of running the handler in interrupt context.
> >You can have pre-compiled instrumented routines.
> >This mechanism can be used for a pre-defined set of routines; for
> >arbitrary probe points, you can use kprobes/jprobes/systemtap.
> >No need to be super-user for predefined breakpoints.
> >
> >Dis:
> >Maintenance of the code, since the code base needs to be
> >duplicated and instrumented.
> 
> CONFIG_FOO_BAR .... turn it on or off to turn on the instrumentation.
> compiled out by default. Compiled in when making the tracing functions.
> 
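
The compile-time switch described above could look roughly like the
following sketch. CONFIG_FOO_BAR is the placeholder name from the message;
the TRACE_HIT and TRACE_COUNT macros, the trace_hits counter and the
scan_pages routine are hypothetical names invented for illustration. With
the option undefined, the instrumentation expands to nothing and costs
nothing at run time.

```c
/*
 * Build with -DCONFIG_FOO_BAR to compile the instrumentation in.
 * It is compiled out by default.
 */
#ifdef CONFIG_FOO_BAR
static unsigned long trace_hits;        /* hypothetical in-function counter */
#define TRACE_HIT()   ((void)(trace_hits++))
#define TRACE_COUNT() (trace_hits)
#else
#define TRACE_HIT()   ((void)0)         /* expands to a no-op */
#define TRACE_COUNT() (0UL)
#endif

/* Hypothetical routine carrying one instrumentation point per iteration. */
static int scan_pages(int n)
{
    int scanned = 0;

    for (int i = 0; i < n; i++) {
        TRACE_HIT();    /* counts iterations only when compiled in */
        scanned++;
    }
    return scanned;
}
```

The routine returns the same value either way; only the counter differs
between the two builds, which is what makes this suitable for cheap
counters inside a function.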
> >The above idea is similar to runtime or dynamic patching, but here we
> >use int3(breakpoint) rather than jump instruction.
> 
> Depends what we're trying to fix. I was trying to fix two things:
> 
> 1. Flexibility - kprobes seem unable to access all local variables etc
> easily, and go anywhere inside the function. Plus keeping low overhead
> for doing things like keeping counters in a function (see previous
> example I mentioned for counting pages in shrink_list).
> 
> 2. Overhead of the int3, which was allegedly 1000 cycles or so, though
> faster after Ingo had played with it, it's still significant.
> 
> M.

-- 
Prasanna S.P.
Linux Technology Center
India Software Labs, IBM Bangalore
Email: prasanna@in.ibm.com
Ph: 91-80-41776329

Follow-Ups:
- Re: [PATCH] Linux Kernel Markers
  - From: Martin Bligh

References:
- [PATCH] Linux Kernel Markers
  - From: Mathieu Desnoyers
- Re: [PATCH] Linux Kernel Markers
  - From: Ingo Molnar
- Re: [PATCH] Linux Kernel Markers
  - From: Martin J. Bligh
- Re: [PATCH] Linux Kernel Markers
  - From: Frank Ch. Eigler
- Re: [PATCH] Linux Kernel Markers
  - From: Martin Bligh
- Re: [PATCH] Linux Kernel Markers
  - From: Andrew Morton
- Re: [PATCH] Linux Kernel Markers
  - From: Martin Bligh
- Re: [PATCH] Linux Kernel Markers
  - From: S. P. Prasanna
- Re: [PATCH] Linux Kernel Markers
  - From: Martin Bligh
