This is the mail archive of the glibc-bugs@sources.redhat.com mailing list for the glibc project.
[Bug nptl/1148] New: Race condition between fork() and exit() when using pthread_atfork() from a shared library
- From: "asuffield at debian dot org" <sourceware-bugzilla at sources dot redhat dot com>
- To: glibc-bugs at sources dot redhat dot com
- Date: 2 Aug 2005 00:56:33 -0000
- Subject: [Bug nptl/1148] New: Race condition between fork() and exit() when using pthread_atfork() from a shared library
- Reply-to: sourceware-bugzilla at sources dot redhat dot com
This is Debian bug #223110, still present in Debian release 2.3.5-2. I've looked
through the CVS changelogs and the relevant code (it looks broken to me) but have
not yet tried it against a current CVS build.
I've filed this against nptl, but it probably affects linuxthreads too; see below.
Minimised test case:
--8<--- foo.c ----------------
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/types.h>
#include <stdlib.h>

void exit_on_signal(int signr)
{
    fprintf(stderr, "Exiting on signal from child\n");
    exit(0);
}

extern void foo(void);

int main(void)
{
    foo();
    signal(SIGUSR2,exit_on_signal);
    pid_t parent = getpid();
    if (fork() == 0)
        kill(parent, SIGUSR2);
    else
        sleep(10);
    return 0;
}
------------------------------
--8<--- libfoo.c -------------
#include <pthread.h>

void
do_prepare(void)
{
}

void
do_child(void)
{
}

void
foo(void)
{
    pthread_atfork(&do_prepare, NULL, &do_child);
}
------------------------------
gcc -shared -fPIC -o libfoo.so libfoo.c -pthread
gcc -o foo foo.c -L. -lfoo
LD_LIBRARY_PATH=. ./foo
This program should exit, but instead it hangs inside exit(). It is a race
condition, but I've never seen it avoid hanging on a 2.6 kernel. Interestingly,
it doesn't appear to be specific to nptl, since it also hangs with linuxthreads.
The rest of this mail deals with the nptl version; I haven't investigated
what's going on with linuxthreads.
Here's my analysis of the problem (it dates from libc 2.3.2, but I don't think
anything significant has changed since):
Enter main()
  -> Enter foo()
     -> pthread_atfork() registers the handlers (it doesn't matter
        which ones are present; I think three NULLs will still break),
        and associates them with libfoo.so. refcntr on this handler is
        initialised to 1
  -> fork()
     -> Enter __libc_fork() (in nptl/sysdeps/unix/sysv/linux/fork.c)
        -> Call do_prepare()
        -> Increment refcntr on the atfork handler (refcntr == 2)
        -> Invoke the fork syscall
  child  -> Call do_child()
         -> Decrement refcntr on the atfork handler (refcntr == 1,
            in the child's copy only)
         -> Send signal SIGUSR2 to the parent
         -> Exit
  parent -> Enter exit_on_signal()
            -> Enter exit()
               ...
               -> Unload libfoo
                  -> Call __unregister_atfork() for libfoo (in
                     nptl/sysdeps/unix/sysv/linux/unregister-atfork.c)
                     -> Decrement refcntr on the atfork handler
                        (refcntr == 1, down from 2 in the parent)
                     -> Wait for refcntr to reach zero
The wait for refcntr to reach zero will never finish. __libc_fork()
incremented refcntr on the atfork handler, but will never decrement it,
because for that to happen the signal handler would have to return, which
would require exit() to return. __unregister_atfork() therefore hangs
waiting for this variable to reach zero.
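
The dependency on the signal handler returning can be illustrated with a
variant of the parent side of the test case. This is only a sketch and not
part of the original report: the names note_signal() and
run_deferred_exit_demo() are made up for illustration, and the polling loop
and its timeout are arbitrary choices. The handler records the signal and
returns, so fork() is free to finish before anything calls exit(). The child
uses _exit() so that it leaves without running exit-time cleanup.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Set by the handler; sig_atomic_t can be written safely from a
   signal handler. */
static volatile sig_atomic_t got_signal = 0;

/* Hypothetical handler: only record the signal and return, so that a
   fork() interrupted in the parent can run to completion. */
static void note_signal(int signr)
{
    (void)signr;
    got_signal = 1;
}

/* Hypothetical helper: returns 1 if the child's signal was seen,
   0 on timeout, -1 on error. */
int run_deferred_exit_demo(void)
{
    struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    pid_t parent = getpid();
    pid_t child;
    int i;

    got_signal = 0;
    if (signal(SIGUSR2, note_signal) == SIG_ERR)
        return -1;

    child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        kill(parent, SIGUSR2);
        _exit(0);  /* leave the child without running exit() */
    }

    /* fork() has returned in the parent. Poll for the flag for up to
       roughly ten seconds; nanosleep() returns early on a signal. */
    for (i = 0; i < 1000 && !got_signal; i++)
        nanosleep(&tick, NULL);

    waitpid(child, NULL, 0);
    return got_signal ? 1 : 0;
}
```

A caller would invoke exit() (or return from main()) only after
run_deferred_exit_demo() has returned, that is, outside the signal handler
and after fork() has completed in the parent.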
Note that in this sequence the parent does not wake up from the fork syscall
until after the child has sent the signal. That is the race: the child
must send the signal almost immediately.
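
The counter arithmetic in the analysis above can be replayed with a small
single-threaded model. This is a sketch only: struct model_handler and the
model_* functions are hypothetical names invented here, not glibc code, and
they model nothing beyond the refcntr steps listed in the trace.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one registered atfork handler. Only the
   reference counter described in the analysis is modelled; this is
   not the glibc data structure. */
struct model_handler {
    unsigned int refcntr;
};

/* pthread_atfork(): the counter starts at 1. */
void model_register(struct model_handler *h)
{
    h->refcntr = 1;
}

/* __libc_fork() takes a reference before the fork syscall. */
void model_fork_enter(struct model_handler *h)
{
    assert(h->refcntr >= 1);
    h->refcntr++;
}

/* __libc_fork() drops its reference on the way out. */
void model_fork_leave(struct model_handler *h)
{
    assert(h->refcntr >= 2);
    h->refcntr--;
}

/* __unregister_atfork() drops the registration reference and would
   then wait for zero. Returns true if the counter is still non-zero,
   meaning the wait could not finish in this single-threaded model. */
bool model_unregister_would_hang(struct model_handler *h)
{
    assert(h->refcntr >= 1);
    h->refcntr--;
    return h->refcntr != 0;
}

/* exit() runs from the signal handler while fork() is in progress. */
bool model_exit_inside_fork(void)
{
    struct model_handler h;
    model_register(&h);
    model_fork_enter(&h);                    /* refcntr == 2 */
    return model_unregister_would_hang(&h);  /* refcntr == 1 */
}

/* exit() runs only after fork() has returned in the parent. */
bool model_exit_after_fork(void)
{
    struct model_handler h;
    model_register(&h);
    model_fork_enter(&h);                    /* refcntr == 2 */
    model_fork_leave(&h);                    /* refcntr == 1 */
    return model_unregister_would_hang(&h);  /* refcntr == 0 */
}
```

In this model, model_exit_inside_fork() reports a hang and
model_exit_after_fork() does not, which matches the two orderings described
in the analysis.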
--
           Summary: Race condition between fork() and exit() when using
                    pthread_atfork() from a shared library
           Product: glibc
           Version: 2.3.5
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
        AssignedTo: drepper at redhat dot com
        ReportedBy: asuffield at debian dot org
                CC: glibc-bugs at sources dot redhat dot com
http://sources.redhat.com/bugzilla/show_bug.cgi?id=1148