This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Patch to correct the refrence count of fork handler in child process


Hi,

While debugging a hang in one of our applications I came across the following bug in NPTL library.

The hang occurs in a child process spawn by a multi-threaded application. The application has been linked to a library which has registered "pthread_atfork()" handlers. The child processes hang indefinitely at __unregister_atfork() while exiting. These hanging processes have only one thread of execution, but they were waiting for the fork handler count to become 0, from a number which was greater than 2.

eg:

[root@c108bc2b01 ~]# strace -p 16603
Process 16603 attached - interrupt to quit
futex(0x800730, FUTEX_WAIT, 8, NULL <unfinished ...>
Process 16603 detached
[root@c108bc2b01 ~]# strace -p 16605
Process 16605 attached - interrupt to quit
futex(0x800730, FUTEX_WAIT, 7, NULL <unfinished ...>


This happens with the following scenario :


When mulitple threads in a process call fork () (simultaneously), they might end up in bumping up the fork handler ref. counts to a value > 2 (depending on the number of threads and the timings). The child process forked thus would have an abnormal fork handler refcount which would never drop back to 0 and thus wait indefinitely while doing an __unregister_atfork().

We have verified the attached patch, where we reset the fork-handler refcounts to 1 for the child process. It is safe to do that, since we are the only thread in the process.


--- libc.orig/nptl/sysdeps/unix/sysv/linux/fork.c 2003-12-20 15:37:13.000000000 -0800
+++ libc/nptl/sysdeps/unix/sysv/linux/fork.c 2007-02-23 12:24:02.000000000 -0800
@@ -167,8 +167,12 @@
allp->handler->child_handler ();


          /* Note that we do not have to wake any possible waiter.
-            This is the only thread in the new process.  */
-         --allp->handler->refcntr;
+            This is the only thread in the new process.
+            The count may have been bumped up by other threads doing
+            a fork. We reset it to 1, to avoid waiting for non-existing
+            thread(s) to release the count.
+          */
+         allp->handler->refcntr = 1;



Comments are more than welcome !

If it looks correct, could you please push this upstream ?


Thanks,


Suzuki K P
Linux Technology Center,
IBM

nptl/sysdeps/unix/sysv/linux/fork.c : __libc_fork()

Reset the refcntr for the fork-handlers in the child process to 1.

	The refcntr may have been bumped up by the *OTHER* thread(s) doing a fork() in the parent process. This would leave the child with an incorrect refcntr for the fork-handlers leading us to wait for non-existing threads to drop the refcntr while doing __unregister_atfork(). Since there is only one thread in the child process, we reset it to 1, to reflect the actual count.

Index: libc/nptl/sysdeps/unix/sysv/linux/fork.c
===================================================================
--- libc.orig/nptl/sysdeps/unix/sysv/linux/fork.c	2003-12-20 15:37:13.000000000 -0800
+++ libc/nptl/sysdeps/unix/sysv/linux/fork.c	2007-02-23 12:25:45.000000000 -0800
@@ -167,8 +167,12 @@
 	    allp->handler->child_handler ();
 
 	  /* Note that we do not have to wake any possible waiter.
-	     This is the only thread in the new process.  */
-	  --allp->handler->refcntr;
+	     This is the only thread in the new process.
+	     The count may have been bumped up by the other threads in 
+	     the parent, doing a fork. We reset it to 1, to avoid waiting
+	     for non-existing thread(s) to release the count.
+	   */
+	  allp->handler->refcntr = 1;
 
 	  /* XXX We could at this point look through the object pool
 	     and mark all objects not on the __fork_handlers list as

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]