This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
Re: offline elfutils processing committed
On Tue, 2006-11-07 at 09:34 -0500, Frank Ch. Eigler wrote:
> Hi -
>
> hunt wrote:
>
> > [...] What happens when systemtap's allocations succeed, but leave
> > the system in a low memory state such that other applications
> > trigger the oom killer when they try to allocate memory. In this
> > case, we want staprun and the systemtap module to be first to be
> > killed. [...]
>
> Since the systemtap module rather than staprun owns most of the
> memory, and because by its nature the module reacts relatively slowly
> to staprun's demise, biasing staprun for OOM targeting may not
> meaningfully assist the system in a time of need.

Right now I have staprun getting SIGKILL from __oom_kill_task and it
signals the end probe functions to run before unloading the module. If
we decide this is too slow a reaction, we can always just unload the
module immediately. We certainly don't want to depend on the code in the
module that periodically checks for staprun's existence.

> Also, preferring to kill staprun/etc. under such conditions is not
> obviously correct. One might argue that once a systemtap script is
> running, it deserves to be kept alive no less than any other process:
> it may be running precisely because the sysadmin wanted to monitor the
> system. Heck, it might be in the middle of debugging excessive memory
> consumption problems.

That's a good point, although it is hard to imagine that sysadmins would
often prefer to trust the oom-killer to kill processes at random rather
than remove systemtap scripts. Perhaps we need a command line option to
set that.
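
For illustration only, here is a rough sketch of what such an option might end up calling. It is not what staprun does today. It assumes the bias is set through the per-process OOM control file in /proc: /proc/<pid>/oom_score_adj on newer kernels, range -1000 to 1000, where older kernels used /proc/<pid>/oom_adj with a smaller range. The name set_oom_bias is hypothetical, and the path is a parameter so the caller decides which file to write.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Assumed range of oom_score_adj; +1000 asks to be killed first. */
#define OOM_ADJ_MIN (-1000)
#define OOM_ADJ_MAX 1000

/*
 * Hypothetical helper: write an OOM bias value to an OOM control file,
 * e.g. "/proc/self/oom_score_adj".
 * Returns 0 on success, -1 on failure (errno is set).
 */
int set_oom_bias(const char *path, int adj)
{
    FILE *f;
    int rc = 0;

    assert(path != NULL);

    /* Reject values outside the assumed range before touching the file. */
    if (adj < OOM_ADJ_MIN || adj > OOM_ADJ_MAX) {
        errno = EINVAL;
        return -1;
    }

    f = fopen(path, "w");
    if (f == NULL)
        return -1;

    if (fprintf(f, "%d\n", adj) < 0)
        rc = -1;

    /* A write error may only surface when the buffer is flushed here. */
    if (fclose(f) != 0)
        rc = -1;

    return rc;
}
```

With something like this, staprun could call set_oom_bias("/proc/self/oom_score_adj", 1000) at startup to volunteer itself as the first victim, and skip the call when the user asks for the script to be kept alive. Raising the value needs no privilege; lowering it below the current setting generally requires CAP_SYS_RESOURCE.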
Martin