This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Re: help needed with forked process becoming zombies


On Thu, Aug 23, 2012 at 10:00 PM, Christoph Anton Mitterer
<calestyo@scientia.net> wrote:
> So, the website says "No question about glibc is ever wrong on this
> list", and I hope asking about process control and the like isn't too
> off topic.

No, it isn't off topic. That's a valid question :-)

> Perhaps someone can help us here with quite a problem that appears in
> both Nagios and Icinga.
> The issue is traced in the following bug reports:
> https://dev.icinga.org/issues/2546
> http://tracker.nagios.org/view.php?id=321
>
> Michael Friedrich from Icinga made some comments
> (https://dev.icinga.org/issues/2546#note-26) on where to look best for
> the respective code areas.
> All happens basically in the run_async_service_check() function within
> base/checks.c
>
> If I can do anything to help, just tell!

(1) What problem are you trying to solve?

It seems to me that the zombie processes are a symptom of the problem
and not the problem itself. The actual problem is that you get
timeouts.

Can you confirm?

If that's the case then you need to provide a better test case, one
that reproduces the timeouts and not just the zombies.

(2) Strace logs.

In order to determine what is going wrong you need to provide full
strace logs of everything you are running. I suggest using -ff together
with -o, which writes one log file per PID, plus -ttt, which prefixes
every line with a timestamp. That way we can look at the behaviour and
see what goes wrong.

(3) forker.c code.

I'm not going to look at Nagios code, but I *am* going to look at your
example forker.c code.

You must call wait() (or waitpid()) on the child; otherwise the kernel
keeps the terminated child around as a zombie so that it can still
deliver the exit status. The alternative is to ignore SIGCHLD in the
parent, in which case the kernel knows it should not keep the child
around and reaps it on exit (leaving no zombie).

If the parent exits or dies before it has waited on the child, the
child is reparented to init, and init will reap it.
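The reparenting can be observed with a small sketch like the following,
again assuming Linux. The function name orphan_is_reparented() is made
up for illustration. A middle process forks a grandchild and exits
without waiting, and the grandchild polls getppid() until it no longer
reports the middle process. Note that on current Linux systems the new
parent may be a "subreaper" process and not necessarily PID 1, so the
code only checks that the parent PID changed.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper. Returns 1 if the orphaned grandchild saw its
 * parent PID change, 0 if it gave up after about 5 seconds, -1 on error. */
int orphan_is_reparented(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t mid = fork();
    if (mid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (mid == 0) {
        /* Middle process: remember our PID, fork, exit without waiting. */
        close(fds[0]);
        pid_t middle_pid = getpid();
        pid_t gc = fork();
        if (gc == 0) {
            /* Grandchild: poll until our parent is someone else. */
            char seen = 0;
            struct timespec ts = { 0, 10 * 1000 * 1000 };   /* 10 ms */
            for (int i = 0; i < 500 && !seen; i++) {
                if (getppid() != middle_pid)
                    seen = 1;
                else
                    nanosleep(&ts, NULL);
            }
            if (write(fds[1], &seen, 1) != 1)
                _exit(1);
            _exit(0);
        }
        _exit(gc < 0 ? 1 : 0);
    }

    /* Top-level parent: reap the middle process, then read the report
     * the grandchild sends through the pipe. */
    close(fds[1]);
    waitpid(mid, NULL, 0);
    char seen = 0;
    ssize_t n = read(fds[0], &seen, 1);
    close(fds[0]);
    return n == 1 ? seen : -1;
}
```

The top-level parent never waits on the grandchild and does not need to:
once the grandchild exits, whichever process adopted it is responsible
for reaping it.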

As I said in point (1), I think you are looking at the effect not the cause.

Cheers,
Carlos.

