This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


RE: Probing for Zombie Processes?


Nathan DeBardeleben wrote:
> I love the SystemTap wiki, by the way.  The war stories are great - I
> encourage more of them! :)
> On that note, I am having a problem with processes going zombie on me
> - more often than I normally am used to and thought SystemTap might be
> useful in this regard.  But, before I dive into it:
> 
> 1: has anyone dug into looking at zombie processes?
> 2: anyone have any insight into what might be good areas to probe with
> respect to this
> 3: can anyone think of a better / easier tool that I should look into
> instead?
> 
> I'm hoping I might be able to record something more useful than just
> "PID xyz went to state=zombie".

My understanding is that most processes become zombies for a short
period after they exit, until their parent collects the exit status
through a wait() call.  So I think when a process gets stuck as a
zombie, it's because the parent never called wait() for it.
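
To see that short-lived zombie state for yourself, here is a minimal
sketch in Python (not SystemTap).  It assumes Linux, since it reads the
state letter from /proc/<pid>/stat; the names proc_state and demo_zombie
are only illustrative.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state field from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the last ')'.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    """Fork a child that exits at once, observe its state, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately

    # Parent: deliberately do not wait yet, and poll the child's state.
    state = None
    for _ in range(200):
        state = proc_state(pid)
        if state == "Z":
            break
        time.sleep(0.01)

    os.waitpid(pid, 0)  # reap the child; this is what ends the zombie state
    reaped = not os.path.exists(f"/proc/{pid}")
    return state, reaped
```

Calling demo_zombie() should report the child as "Z" for as long as the
parent holds off on waitpid(), and the /proc entry should be gone once
the parent has reaped it.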

"process.exit" will tell you when a process has completed execution, and
"process.release" tells you when the process is actually released by the
kernel.  If a process is stuck as a zombie, you'll see an exit but no
matching release.  You could also probe the function exit_notify, which
sets the exit state to either EXIT_DEAD or EXIT_ZOMBIE.
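
The "exit but no release" check is easy to do offline once you have a
trace.  A rough sketch in Python, assuming your probes print one line
per event in the form "exit <pid>" or "release <pid>" (that line format
is my own invention for illustration, not something SystemTap emits by
itself):

```python
def unreleased_pids(lines):
    """Return PIDs that logged an exit with no later release, in exit order.

    Each line is expected to look like "exit 1234" or "release 1234";
    anything else is ignored.
    """
    pending = {}  # pid -> index of the line where its exit was seen
    for n, line in enumerate(lines):
        parts = line.split()
        if len(parts) != 2:
            continue
        event, pid = parts
        if event == "exit":
            pending[pid] = n
        elif event == "release":
            # A release clears the matching exit; a PID may be reused later.
            pending.pop(pid, None)
    return sorted(pending, key=pending.get)
```

Any PID this returns is either still a zombie or was released after the
trace ended, so it is a list of candidates rather than a verdict.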

The hard thing is that you're really trying to discover something that's
*not* happening -- namely the parent's wait.  Perhaps probing the wait
family of syscalls in the parent would be enlightening.  You could also
use the signals tapset to watch for the SIGCHLD being sent to the parent.
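
Since the parent is the interesting party, it helps to know which parent
to watch before writing any probes.  A small Python sketch, again
assuming Linux and /proc, that lists current zombies together with their
parent PIDs (find_zombies is an illustrative name):

```python
import os


def find_zombies():
    """Return a list of (pid, ppid) tuples for processes in state 'Z'."""
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                data = f.read()
        except OSError:
            continue  # process went away while we were scanning
        # Fields after the parenthesised command name: state, ppid, ...
        fields = data.rsplit(")", 1)[1].split()
        if fields[0] == "Z":
            zombies.append((int(entry), int(fields[1])))
    return zombies
```

The parent PIDs that show up repeatedly are the ones worth probing for
wait calls and SIGCHLD delivery.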

If you find a way to make sense of this, it's definitely a good one for
the war stories...


Josh

