This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



child reaping race conditions with system()


Greetings,

Some time ago, I came across something that might be a
bug (I'm not sure at all).
Suppose two threads (NPTL) try to reap the same child process
using waitpid(). Naturally there will be a race condition:
which thread reaps the child and retrieves the status is
undefined. But what about the system() call (stdlib.h)?
It's not thread-safe, but it's said to *block* SIGCHLD so
that the application cannot catch the signal early. This
prevents race conditions in case the signal handler tries
to reap children (e.g. any child). Otherwise this could be
a problem even in single-threaded programs.
glibc's current implementation (as well as eglibc's and
uClibc's) uses the sigprocmask syscall to mask out SIGCHLD
temporarily. This works fine in single-threaded environments,
but not if multiple threads are involved: sigprocmask only
alters the calling thread's signal mask (apparently
pthread_sigmask calls the same syscall as sigprocmask), while
NPTL/POSIX.1 threads deliver signals on a per-process basis
to any thread that hasn't masked out the signal. This may
result in race conditions if you do concurrent child reaping
in your multi-threaded application, with one thread executing
system() and another one reaping children (i.e. any child).
The attached test program demonstrates that point. The system()
call invoked in the thread will fail most of the time, with
errno set to ECHILD ("No child processes"), while the child
in question will have been reaped by the signal handler.
This happens on NPTL uClibc (master branch,
revision 0d6ee549bc86fd330672a79d9a87d2c3825eea67) as well as
on glibc. Most likely it also happens on eglibc, from which
uClibc borrowed a lot of code, including the system() function.
The workarounds for this behaviour are obvious (mask out
the signal in all threads; make sure the same child is not
reaped concurrently; allow reaping the same child in multiple
threads but synchronize the child reaping; etc).
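
As an illustration of the "synchronize the child reaping" workaround,
here is a rough sketch, not a definitive fix. The names reap_lock,
guarded_system and guarded_reap_any are hypothetical. It assumes the
wildcard reaping is done from an ordinary thread rather than from a
signal handler, since mutexes must not be taken in a handler:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Serializes system() against the wildcard reaper below. */
static pthread_mutex_t reap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Run system() while holding the lock, so that no cooperating
 * thread can reap its child before system() does. */
int guarded_system(const char *command)
{
    int status;
    int rc;

    rc = pthread_mutex_lock(&reap_lock);
    assert(rc == 0);
    status = system(command);
    rc = pthread_mutex_unlock(&reap_lock);
    assert(rc == 0);
    return status;
}

/* Reap at most one terminated child without blocking.
 * Returns the pid reaped, 0 if no child has exited yet,
 * or -1 on error (for example ECHILD if there are no children). */
pid_t guarded_reap_any(int *status)
{
    pid_t pid;
    int rc;

    rc = pthread_mutex_lock(&reap_lock);
    assert(rc == 0);
    pid = waitpid(-1, status, WNOHANG);
    rc = pthread_mutex_unlock(&reap_lock);
    assert(rc == 0);
    return pid;
}
```

A reaper thread would call guarded_reap_any() in a loop, and every other
thread would call guarded_system() instead of system(). The cost is that
the reaper stalls for as long as a command runs, and a SIGCHLD handler
that calls waitpid(-1, ...) directly would still bypass the lock.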
My question is simply: Is this the desired behaviour of the
system() function? How should the POSIX specification be
interpreted? It neither affirms nor denies that system() works
in multi-threaded environments (as long as the system() calls
themselves are not concurrent). Is there even some obvious way
to make it work (I don't know of any), e.g. for some
user-written system() replacement?
I've already asked on the uClibc mailing list and got
no feedback on the matter at all.
So I hoped that maybe you could help out.

best regards,
Robin Haberkorn

-- 
------------------ managed broadband access ------------------

Travelping GmbH               phone:           +49-391-8190990
Roentgenstr. 13               fax:           +49-391-819099299
D-39108 Magdeburg             email:       info@travelping.com
GERMANY                       web:   http://www.travelping.com


Company Registration: Amtsgericht Stendal Reg No.:   HRB 10578
Geschaeftsfuehrer: Holger Winkelmann | VAT ID No.: DE236673780
--------------------------------------------------------------

Attachment: childrace2.c.gz
Description: GNU Zip compressed data

