This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Re: stap vs pgrps


> Let's not focus on the particular red herring of Eclipse.  Suppose this
> was our stapgui:

But that's a red herring.  It wouldn't be like that.  When you implement
something in a shell script, it has batch-like behavior and only gets
killed "from the outside", i.e. by things that kill its whole pgrp.

A real script would not be making the decision to kill it, and would have
its post-kill stuff done via a "trap 'echo stapgui done' 1 2 15" or the like.

> That's why I do think stap needs to care, rather than trying to pass the
> buck up to the caller.

stap is a command-line batch-style program like any other.  It is not
expected to deal with this for you.  That is just the Unix way.  The
thing that runs it and wants to send it a SIGTERM is *just wrong* for
running any normal command-line batch-style program.

If you are a non-batch thing that runs and controls a batch thing, then you
are responsible for the fancy magic that "controls" really means.  i.e.,
you are a job control shell or run one, as inside "expect" or any terminal
window sort of thing.  (Those use a pty, which is also the only way to be
right in the face of potential privilege issues.)

When you have a setuid program you might need to kill, this is really all
moot.  You don't have to agree with me.  You just can't do it any other
way.  You're not allowed.  (Find a GUI program that runs "ping" and see
whether it does some magic or whether it's actually just broken, 'cause
those are the only two options.)

> We have such a flag already, but I was worried about the race between
> checking it and blocking on waitpid again.  I suppose I still have a
> race anyway, so how can you manage this?  Or is the collision just so
> rare that no one bothers?

No, it's a horrible mess with signal masks and still having holes in the
atomicity.  Almost the easiest thing is having it be in a separate thread
that you pthread_cancel.  (Thread cancellation has the kind of interlock
with blocking calls like waitpid that you want.)  

In this case, all you want to do is one simple kill call, which is safe in
a signal handler.  So you can just do the kill in the signal handler.  Then
ignore EINTR, i.e. either use SA_RESTART (if you are not worrying about
interrupting anything else but this waitpid) or use TEMP_FAILURE_RETRY
(glibc's unistd.h macro for an EINTR-checking loop).  You don't want to bail early,
you just want to block in wait until the killed child is dead.

If one cares about being robust during the "staprun" phase, then this
really is all moot.  It's setuid and the stap process is not allowed to
send it any signals with kill.

And why is it that stap forks at all instead of just exec'ing staprun?
It seems like the cleanup it does could be done by stapio before
it execs staprun for the final unload.


Thanks,
Roland
