This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



RE: offline elfutils processing committed


On Mon, 2006-11-06 at 14:15 -0800, Stone, Joshua I wrote:
> On Monday, November 06, 2006 1:18 PM, Martin Hunt wrote:
> > The point is damage control.  Systemtap allocates too much memory and
> > oom killer gets active, the first thing it will kill is staprun and
> > that should unload the module (but this seems broken at the moment). 
> > So we haven't really hurt the system.
> 
> The goal is fine, but I don't think this accomplishes it.  My
> understanding is that __alloc_pages will keep calling OOM until it is
> able to satisfy the request -- thus the module is blocked waiting for
> memory.  The process might end up something like:
> 
> stap module: allocate lots of memory
> __alloc_pages: Not enough memory -> OOM kill something (staprun)
> __alloc_pages: Still not enough memory -> OOM kill other stuff
> __alloc_pages: Yay, now we have enough memory!
> stap module: got some memory
> stap module: Oops, staprun is gone, better exit...

There are two different but related problems. The one you describe is
easily fixed by using the GFP_NORETRY flag on our allocs.  The second
problem is the one I was trying to describe: what happens when
systemtap's allocations succeed but leave the system in a low-memory
state, such that other applications trigger the oom killer when they try
to allocate memory?  In this case, we want staprun and the systemtap
module to be the first to be killed. I haven't looked at the sources, but
it seems unlikely to me that the oom killer would be so fast that it
would kill staprun and then kill other processes before the module is
also killed and frees its memory.

Martin
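
A rough sketch of the first fix mentioned above (fail fast rather than
let the allocator keep invoking the oom killer).  This is not taken from
the systemtap runtime: the helper names stp_alloc_noretry and
stp_setup_buffers are hypothetical, and the non-kernel branch contains
stand-in definitions so the fragment also compiles in userspace.  In the
kernel headers the flag is spelled __GFP_NORETRY; adding __GFP_NOWARN to
suppress the allocation-failure warning is an assumption about what a
module would want.

```c
#ifdef __KERNEL__
#include <linux/slab.h>    /* kmalloc, kfree, GFP_KERNEL, __GFP_NORETRY */
#include <linux/errno.h>   /* ENOMEM */
#else
/* Userspace stand-ins (assumptions, not kernel definitions) so the
 * sketch can be compiled and exercised outside a kernel tree. */
#include <stdlib.h>
#include <errno.h>
typedef unsigned int gfp_t;
#define GFP_KERNEL     0x1u
#define __GFP_NORETRY  0x2u
#define __GFP_NOWARN   0x4u
static void *kmalloc(size_t size, gfp_t flags) { (void)flags; return malloc(size); }
static void kfree(const void *p) { free((void *)p); }
#endif

/* Hypothetical helper: ask for memory, but let the request fail
 * instead of retrying until the oom killer has freed enough. */
static void *stp_alloc_noretry(size_t size)
{
    return kmalloc(size, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
}

/* Hypothetical caller: on failure, report -ENOMEM so module setup can
 * back out cleanly rather than block waiting for memory. */
static int stp_setup_buffers(size_t size, void **out)
{
    void *buf = stp_alloc_noretry(size);

    if (!buf)
        return -ENOMEM;
    *out = buf;
    return 0;
}
```

With this pattern the module sees a NULL return and can exit on its own.
It does nothing for the second problem, where the allocations succeed
and some other process later trips the oom killer.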


