This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 4/4] Mark nscd service as forking in systemd service file (#16639)


Russ, thank you for your very detailed explanation!

On Wed, Feb 26, 2014 at 02:47:03PM -0800, Russ Allbery wrote:
> The advice that you got here doesn't seem to be quite right.  It seems to
> me like it's confusing the readiness protocol with the way systemd
> monitors processes.
> 
> The bug report here was that nscd failed to start, but systemctl didn't
> realize that and returned a zero status.  This is a readiness problem.  If
> you use Type=simple, systemd will assume that the process is ready
> (started) as soon as the binary is execed.  systemctl will then return
> success at that point, and will not be aware of the subsequent failure.
> But in the case of nscd, that's not actually true; it's *not* ready to
> answer queries as soon as the binary has been run with exec.  It has more
> startup work to do, and that work can fail.
> 
> Normally, socket activation means you don't care that much if the daemon
> is not ready to answer requests immediately, since the requests will queue
> in the socket that was created by systemd, and you want to assume
> immediate readiness.  That's why a lot of services that use socket
> activation don't bother with readiness notification.  But that assumption
> does *not* hold if the service can fail without ever answering a request,
> *and* you want to stall any services that depend on it until you're sure
> that the service has started properly.  It sounds like that's the concern
> here.  (If that isn't a concern, then I think the correct answer to that
> bug report is that it's not a bug, just a misunderstanding of what the
> return status of systemctl means.)

The concern is that the service can fail without answering a request,
but I don't think any service would strictly depend on nscd.  Even so,
in the general sense I guess it would be prudent to assume that
another service may choose to depend on nscd being initialized
properly and would like to be notified if it doesn't.  Whether that
should be the default is up for debate.

> You can "fix" this by converting it to Type=forking, but that's only
> because you're changing the readiness protocol to wait for the process to
> create a PID file.  Failures before creation of the PID file will
> therefore be detected... but at the cost of using a more complex service
> startup and having to carry a PID file around that isn't actually
> necessary.  Note, though, that if you don't use a PID file and the
> PIDFile= option, you're still left with the same problem if nscd fails
> after forking but before actually being ready to answer requests.  (Now,
> it's possible -- I've not checked -- that nscd is very careful to not exit
> in the parent process until the child process is ready to answer
> connections, in which case that readiness protocol would work.  But that's
> tricky to do properly and a lot of internal work.)

nscd does not care what the child process does; it simply forks and
exits.  The PID file is created after parsing options and
configuration, so the service can still fail after that for other
reasons, such as a corrupt database or a failure to start its worker
threads, and never actually be ready.  It shouldn't be too difficult
to delay the PID file creation to a point where nscd considers itself
ready.
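
To illustrate what I mean by delaying the PID file, here is a rough
sketch.  It is not nscd code: init_then_write_pidfile, init_step,
step_ok and step_fail are names made up for this example, and the
steps stand in for whatever nscd actually does at startup (opening
databases, starting worker threads, and so on).  The only point is
the ordering: the PID file appears once every step has succeeded,
and never otherwise.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical type for one startup step; returns 0 on success.  */
typedef int (*init_step) (void);

/* Run every startup step first and write the PID file last.
   Returns 0 on success, -1 if a step fails or the file cannot be
   written.  On failure no PID file is left behind.  */
int
init_then_write_pidfile (const char *pidfile, const init_step *steps,
                         size_t nsteps)
{
  for (size_t i = 0; i < nsteps; i++)
    if (steps[i] () != 0)
      return -1;                /* Not ready: do not create the file.  */

  FILE *fp = fopen (pidfile, "w");
  if (fp == NULL)
    return -1;

  int ok = fprintf (fp, "%ld\n", (long) getpid ()) > 0;
  if (fclose (fp) != 0)
    ok = 0;

  if (!ok)
    {
      unlink (pidfile);         /* Do not leave a partial file.  */
      return -1;
    }
  return 0;
}

/* Hypothetical steps, only here to exercise the function.  */
int step_ok (void) { return 0; }
int step_fail (void) { return -1; }
```

With Type=forking and PIDFile= set, a failure in any step would then
mean the PID file never shows up, which is the signal we want
systemd to see.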

> The best fix from a systemd perspective is to not use either of these
> service types and instead use Type=notify along with the sd_notify(3) API
> to clearly inform systemd when the daemon is actually ready to answer
> requests.  This would produce the correct behavior in this case without
> requiring reverting the service type to the forking model.  This is
> exactly the problem for which the sd_notify(3) interface was created.

All the other options seem like a lot more work, and I am not
convinced that nscd deserves that kind of love right now.  Also,
making nscd a notify-type service seems like hard-coding it for
systemd, which I would rather avoid given that there's an
alternative that serves the purpose.
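
For reference, the readiness notification that Type=notify relies on
amounts to roughly the following.  This is only a sketch of the
wire protocol documented in sd_notify(3), written without libsystemd
so that it stands alone; notify_ready is a made-up name and none of
this is nscd code.  A real daemon would more likely just call
sd_notify (0, "READY=1") from libsystemd.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Tell the service manager that startup has finished.
   Returns 1 if the message was sent, 0 if NOTIFY_SOCKET is not set
   (not running under a notify-aware manager), -1 on error.  */
int
notify_ready (void)
{
  const char *path = getenv ("NOTIFY_SOCKET");
  if (path == NULL || path[0] == '\0')
    return 0;

  struct sockaddr_un addr;
  size_t len = strlen (path);

  /* Only absolute paths and abstract sockets ('@' prefix) are valid.  */
  if ((path[0] != '/' && path[0] != '@') || len >= sizeof addr.sun_path)
    return -1;

  memset (&addr, 0, sizeof addr);
  addr.sun_family = AF_UNIX;
  memcpy (addr.sun_path, path, len);

  socklen_t addrlen = offsetof (struct sockaddr_un, sun_path) + len;
  if (path[0] == '@')
    addr.sun_path[0] = '\0';    /* Abstract namespace: leading NUL.  */
  else
    addrlen += 1;               /* Pathname: include the trailing NUL.  */

  int fd = socket (AF_UNIX, SOCK_DGRAM, 0);
  if (fd < 0)
    return -1;

  static const char msg[] = "READY=1";
  ssize_t n = sendto (fd, msg, sizeof msg - 1, 0,
                      (struct sockaddr *) &addr, addrlen);
  close (fd);
  return n == (ssize_t) (sizeof msg - 1) ? 1 : -1;
}
```

The call would have to go at the point where nscd considers itself
ready, which is the same place the PID file creation would move to.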

Siddhesh
