This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH v2] test-skeleton: Kill any child process's offspring


On Mon, 10 Feb 2014, Ondřej Bílka wrote:

> friendly ping
> On Fri, Oct 11, 2013 at 06:33:32PM +0200, Ondřej Bílka wrote:
> > On Mon, Sep 23, 2013 at 09:24:36PM +0100, Maciej W. Rozycki wrote:
> > > Hi,
> > > 
> > >  Some of the test cases driven by our test-skeleton create subprocesses.  
> > > In a test scenario involving what looks to me like a Linux kernel bug I 
> > > have observed stray processes of this kind left behind where their parent 
> > > was killed by test-skeleton after a timeout however the process in 
> > > question was not.  This leads to a clutter in the environment our test 
> > > suite is run within and on weaker systems may also affect test results 
> > > where the processing power consumed by such a leftover causes the lack of 
> > > same for subsequent test cases and consequently timeouts.
> > > 
> > >  The cause of leaving such processes behind is signal_handler that calls 
> > > kill to send SIGKILL to test-skeleton's immediate child only and not any 
> > > further descendants.  Conveniently as the child starts it places itself in 
> > > a separate process group that is then inherited by any further children.  
> > > Therefore a simple if not obvious fix to this problem is to send SIGKILL 
> > > to the child's process group instead which this change implements.  There 
> > > is no need to wait for any additional descendants left as the exit status 
> > > of the immediate child is enough for our purposes and init(8) will take 
> > > care of the rest in the unlikely case this change addresses.
> > > 
> > >  The problem itself was observed with nptl/tst-mutexpi9 and that this fix 
> > > makes go away -- no stray processes after the test suite has terminated 
> > > anymore (nptl/tst-mutexpi9 has its own problem of course on this target, 
> > > but that's another matter; I think there's a value in making the test 
> > > suite itself more robust).
> > > 
> > >  OK to apply?
> > > 
> > What is status of this patch?

 Here's a version updated according to concerns expressed.  Code now 
checks for an error from setpgid and in the unlikely event of one it 
retains the current behaviour and kills itself only rather than the 
process group.

 The issue with nptl/tst-mutexpi9 has since gone for one reason or 
another, so this has been only lightly tested, making sure that another 
test case observed to time out kills itself.

 OK to apply?

2014-06-23  Maciej W. Rozycki  <macro@codesourcery.com>

	* test-skeleton.c (signal_handler): Undo any PID negation made 
	after setpgid.
	(main): Negate the PID if setpgid succeeded, otherwise report an
	error.

  Maciej

glibc-test-skeleton-kill-pgid.diff
Index: glibc-fsf-trunk-quilt/test-skeleton.c
===================================================================
--- glibc-fsf-trunk-quilt.orig/test-skeleton.c	2014-06-17 16:03:16.621929397 +0100
+++ glibc-fsf-trunk-quilt/test-skeleton.c	2014-06-21 00:03:23.141622346 +0100
@@ -138,8 +138,9 @@ signal_handler (int sig __attribute__ ((
   int killed;
   int status;
 
-  /* Send signal.  */
+  /* Send signal.  Undo any PID negation made after setpgid.  */
   kill (pid, SIGKILL);
+  pid = abs (pid);
 
   /* Wait for it to terminate.  */
   int i;
@@ -341,8 +342,14 @@ main (int argc, char *argv[])
 #endif
 
       /* We put the test process in its own pgrp so that if it bogusly
-	 generates any job control signals, they won't hit the whole build.  */
-      setpgid (0, 0);
+	 generates any job control signals, they won't hit the whole build.
+	 Negate the PID if successful so that kill targets the PGID, to
+	 reach any child process's offspring.  */
+      status = setpgid (0, 0);
+      if (status == 0)
+	pid = -pid;
+      else
+	printf ("Failed to set the process group ID: %m\n");
 
       /* Execute the test function and exit with the return value.   */
       exit (TEST_FUNCTION);


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]