This is the mail archive of the frysk@sources.redhat.com mailing list for the frysk project.



Re: fc6 frysk-core failures.


[Sorry you keep getting multiple copies of my notes--I keep forgetting
to send them as text-only rather than text/html and frysk@ bounces them.]

Andrew,

Did a bit more tinkering in TestLib.java:tearDown() (see the attached
Java) that looks like it conforms a bit better to how the new ptrace
works, but it still doesn't work properly.  In the final drain loop,
neither disappeared() nor terminated() ever gets called, so the loop
never terminates; the best the preceding kill-'em-all loop can do is get
all the processes into a "stopped" state.  (See the log tail in the
attached p2.txt.)  The Ptrace.cont (pid, 0); in the kill loop is an
attempt to force stopped processes to continue, but I don't think it's
happening: in the log, every cont fails with ESRCH.  (Passing a sig of
zero to the new ptrace, BTW, effectively bypasses the signal injection
step and goes straight to whatever ptrace request you're doing.  If you
pass a non-zero sig, and the signal injection fails, the ptrace req
never happens.  I'm not sure if this is a bug or a feature.)
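
The ordering described in that parenthetical can be modelled in a few
lines of Java.  This is only a sketch of the behaviour as reported
above, not the kernel or frysk code: Target, RecordingTarget,
injectSignal, request and signalThenRequest are hypothetical names
invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PtraceOrderingSketch {
    /** Hypothetical stand-in for the traced task; not a frysk API. */
    interface Target {
        /** Returns false when the signal cannot be injected. */
        boolean injectSignal(int pid, int sig);
        void request(int pid, String req);
    }

    /** Records every call so the ordering can be inspected. */
    static class RecordingTarget implements Target {
        final List<String> calls = new ArrayList<String>();
        private final boolean injectionSucceeds;

        RecordingTarget(boolean injectionSucceeds) {
            this.injectionSucceeds = injectionSucceeds;
        }
        public boolean injectSignal(int pid, int sig) {
            calls.add("inject " + sig);
            return injectionSucceeds;
        }
        public void request(int pid, String req) {
            calls.add("request " + req);
        }
    }

    /**
     * Models the reported behaviour: a zero sig skips injection and
     * goes straight to the request; a non-zero sig whose injection
     * fails means the request is never issued.
     * Returns true if the request was issued.
     */
    static boolean signalThenRequest(Target t, int pid, int sig, String req) {
        if (sig != 0 && !t.injectSignal(pid, sig)) {
            return false;            // injection failed: request skipped
        }
        t.request(pid, req);
        return true;
    }

    public static void main(String[] args) {
        RecordingTarget t = new RecordingTarget(false); // injection always fails
        System.out.println(signalThenRequest(t, 2515, 9, "CONT"));
        System.out.println(signalThenRequest(t, 2515, 0, "CONT"));
        System.out.println(t.calls);
    }
}
```

If the real implementation behaves like this model, a failed injection
in Ptrace.detach (pid, Sig.KILL) would leave the task both attached and
alive, which would match the symptoms in the log.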

A question just occurred to me: I think, in this test, all of the
various pids shown in the log are threads rather than forks.  (Is that
right?)  If so, /do/ pthreads terminate?   Or do they just hang around
waiting to be joined or exited?

At the end of the test, ps -el shows:

    0 S   500  2537 29592  2  78  0 - 15029 wait   pts/8  00:00:00 TestRunner
    0 T   500  2540  2537  0  81  0 -  8131 utrace ?      00:00:00 funit-child


And so, to bed,
Chris

    public void tearDown ()
    {
	logger.log (Level.FINE, "{0} >>>>>>>>>>>>>>>> start tearDown\n", this);

	// Check that there are no pending signals that should have
	// been drained as part of testing.  Do this <em>before</em>
	// any tasks are killed off so that the check isn't confused
	// by additional signals generated by the dying tasks.
	Sig[] checkSigs = new Sig[] { Sig.USR1, Sig.USR2 };
	SigSet pendingSignals = new SigSet ().getPending ();
	for (int i = 0; i < checkSigs.length; i++) {
	    Sig sig = checkSigs[i];
	    assertFalse ("pending signal " + sig,
			 pendingSignals.contains (sig));
	}

	// Kill off all the registered children.  Once that signal is
	// processed the task will die.
	// Make sure there are still children to kill. Someone else
	// may have waited on their deaths already.
	
	for (Iterator i = children.iterator (); i.hasNext (); ) {
	    Integer child = (Integer) i.next ();
	    int pid = child.intValue ();

	    // try to continue with a 0 sig
	    try {
		Ptrace.cont (pid, 0);
		logger.log (Level.FINE, "{0} continue 0 {1,number,integer}\n",
			    new Object[] { this, child });

		// okay, continue worked, so send a kill then detach
		Ptrace.detach (pid, Sig.KILL);
		logger.log (Level.FINE, "{0} detach -KILL {1,number,integer}\n",
			    new Object[] { this, child });
	    }
	    catch (Errno.Esrch esrch) {
		logger.log (Level.FINE,
			    "{0} continue 0 {1,number,integer} (failed ESRCH)\n",
			    new Object[] { this, child });

		try {
		    Signal.kill (pid, Sig.KILL);
		    logger.log (Level.FINE, "{0} kill -KILL {1,number,integer}\n",
				new Object[] { this, child });
		}
		catch (Errno.Esrch esrch2) {
		    // Toss it.
		    logger.log (Level.FINE,
				"{0} kill -KILL {1,number,integer} (failed)\n",
				new Object[] { this, child });
		    //   i.remove();
		}
	    }
	}

	// Drain the wait event queue.  This ensures that: there are
	// no outstanding events to confuse the next test run; all
	// child zombies have been reaped (and eliminated); and
	// finally makes certain that all attached tasks have been
	// terminated.
	//
	// For attached tasks, which will generate non-exit wait
	// events (clone et.al.), the task is detached / killed.
	// Doing that frees up the task so that it can run to exit.
	try {
	    while (!children.isEmpty()) {
		logger.log (Level.FINE,
			    "{0} starting loop waitAll\n",
			    new Object[] {
				TestLib.this
			    });
	    	Wait.waitAll (-1, new Wait.Observer ()
		    {
			private void detach (int pid)
			{
                            try {
                                // Detach with a KILL signal which
                                // will force the task to exit.
                                Ptrace.detach (pid, Sig.KILL);
                                logger.log (Level.FINE,
                                            "{0} detach -KILL {1,number,integer}\n",
                                            new Object[] {
                                                TestLib.this,
                                                new Integer (pid)
                                            });
                            }
                            catch (Errno.Esrch e) {
                                logger.log (Level.FINE,
                                            "{0} detach -KILL {1,number,integer} (fail)\n",
                                            new Object[] {
                                                TestLib.this,
                                                new Integer (pid)
                                            });
                            }
                        }
			public void cloneEvent (int pid, int clone)
			{
			    logger.log (Level.FINE,
					"{0} cloneEvent {1,number,integer}\n",
					new Object[] {
					    TestLib.this,
					    new Integer (pid)
					});
			    detach (pid);
			}
			public void forkEvent (int pid, int child)
			{
			    logger.log (Level.FINE,
					"{0} forkEvent {1,number,integer}\n",
					new Object[] {
					    TestLib.this,
					    new Integer (pid)
					});
			    detach (pid);
			}
			public void exitEvent (int pid, boolean signal,
					       int value, boolean coreDumped)
			{
			    logger.log (Level.FINE,
					"{0} exitEvent {1,number,integer}\n",
					new Object[] {
					    TestLib.this,
					    new Integer (pid)
					});
			    detach (pid);
			    // Do not remove PID from children list;
			    // need to let the terminated event behind
			    // it bubble up.
			}
			public void execEvent (int pid)
			{
			    logger.log (Level.FINE,
					"{0} execEvent {1,number,integer}\n",
					new Object[] {
					    TestLib.this,
					    new Integer (pid)
					});
			    detach (pid);
			}
			public void syscallEvent (int pid)
			{
			    logger.log (Level.FINE,
					"{0} syscallEvent {1,number,integer}\n",
					new Object[] {
					    TestLib.this,
					    new Integer (pid)
					});
			    detach (pid);
			}
			public void stopped (int pid, int signal)
			{
			    logger.log (Level.FINE,
					"{0} stopped {1,number,integer}\n",
					new Object[] {
					    TestLib.this,
					    new Integer (pid)
					});
			    try {
				Signal.kill (pid, Sig.KILL);
				logger.log (Level.FINE,
					    "{0} stopped kill -KILL {1,number,integer}\n",
					    new Object[] {
						TestLib.this,
						new Integer (pid)
					    });
			    }
			    catch (Errno.Esrch e) {
				// Toss it.
				logger.log (Level.FINE,
					    "{0} stopped kill -KILL {1,number,integer} (failed)\n",
					    new Object[] {
						TestLib.this,
						new Integer (pid)
					    });
			    }
			}
			public void terminated (int pid, boolean signal,
						int value, boolean coreDumped)
			{
			    logger.log (Level.FINE,
					"{0} terminated {1,number,integer}\n",
					new Object[] {
					    TestLib.this,
					    new Integer (pid)
					});
			    // Hopefully done with this PID.
			    children.remove(new Integer(pid));
			    // To be sure, again make certain that the
			    // thread is detached.
			    detach (pid);
			    // True children can have a second exit
			    // status behind this first one, drain
			    // that also.  Give up when this PID has
			    // no outstanding events.
			    try {
				while (true) {
				    Wait.waitAll (pid,
						  new IgnoreWaitObserver ());
				    logger.log (Level.FINE,
						"{0} waitAll (pid) ok\n",
						new Integer (pid));
				}
			    }
			    catch (Errno.Echild e) {
				logger.log (Level.FINE,
					    "{0} waitAll (ECHILD)\n",
					    new Integer (pid));
			    }
			}
			public void disappeared (int pid, Throwable w)
			{
			    logger.log (Level.FINE,
					"{0} disappeared (pid) ok\n",
					new Object[] {
					    TestLib.this
					});
			    detach (pid);
			    children.remove(new Integer(pid));
			}
		    });
  	    }
	}
        catch (Errno.Echild e) {
	    // No more events.
        }

	// Remove any stray files.
	deleteTmpFiles ();

	// Drain the set of pending signals.  Note that the process of
	// killing off the processes used in the test can generate
	// extra signals - for instance a SIGUSR1 from a detached
	// child that notices that its parent just exited.
	class SignalDrain
	    implements Poll.Observer
	{
	    SigSet pending = new SigSet ();
	    public void signal (Sig sig) { pending.add (sig); }
	    public void pollIn (int in) { }

	}
	SignalDrain signalDrain = new SignalDrain ();
	Poll.poll (signalDrain, 0);

	logger.log (Level.FINE, "{0} >>>>>>>>>>>>>>>> end tearDown\n", this);
    }

14-Sep-06 3:01:44 AM frysk.proc.TestLib tearDown
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) >>>>>>>>>>>>>>>> start tearDown

14-Sep-06 3:01:44 AM frysk.proc.TestLib tearDown
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) continue 0 2518 (failed ESRCH)

14-Sep-06 3:01:44 AM frysk.proc.TestLib tearDown
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) kill -KILL 2518

14-Sep-06 3:01:44 AM frysk.proc.TestLib tearDown
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) continue 0 2517 (failed ESRCH)

14-Sep-06 3:01:44 AM frysk.proc.TestLib tearDown
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) kill -KILL 2517

14-Sep-06 3:01:44 AM frysk.proc.TestLib tearDown
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) continue 0 2516 (failed ESRCH)

14-Sep-06 3:01:44 AM frysk.proc.TestLib tearDown
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) kill -KILL 2516

14-Sep-06 3:01:44 AM frysk.proc.TestLib tearDown
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) continue 0 2515 (failed ESRCH)

14-Sep-06 3:01:44 AM frysk.proc.TestLib tearDown
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) kill -KILL 2515

14-Sep-06 3:01:44 AM frysk.proc.TestLib tearDown
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) starting loop waitAll

14-Sep-06 3:01:44 AM frysk.sys.Wait log
FINE: frysk.sys.Wait pid -1 entering waitpid(..., _WALL)

14-Sep-06 3:01:44 AM frysk.sys.Wait log
FINE: frysk.sys.Wait pid 2515 status 0x97f

14-Sep-06 3:01:44 AM frysk.proc.TestLib$10 stopped
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) stopped 2515

14-Sep-06 3:01:44 AM frysk.proc.TestLib$10 stopped
FINE: testManyExistingThreadAttached(frysk.proc.TestProcTasksObserver) stopped kill -KILL 2515

14-Sep-06 3:01:44 AM frysk.sys.Wait log
FINE: frysk.sys.Wait pid 2515 entering waitpid(..., _WALL)

2515.2517: received signal 14 (Alarm clock)
2515.2517: exit
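
The status 0x97f that frysk.sys.Wait reports for pid 2515 in the log
above can be decoded by hand.  A minimal sketch, assuming the
conventional Linux wait-status encoding (a low byte of 0x7f means
"stopped", with the stop signal in the next byte); the class and method
names are illustrative and are not part of frysk.sys.Wait.

```java
class WaitStatusSketch {
    /** Stopped: low byte is exactly 0x7f. */
    static boolean isStopped(int status) {
        return (status & 0xff) == 0x7f;
    }

    /** Exited normally: low seven bits are zero. */
    static boolean isExited(int status) {
        return (status & 0x7f) == 0;
    }

    /** Stop signal or exit code, depending on the kind of status. */
    static int highByte(int status) {
        return (status >> 8) & 0xff;
    }

    /** Terminating signal, valid when neither stopped nor exited. */
    static int termSignal(int status) {
        return status & 0x7f;
    }

    static String describe(int status) {
        if (isStopped(status)) {
            return "stopped by signal " + highByte(status);
        }
        if (isExited(status)) {
            return "exited with code " + highByte(status);
        }
        boolean core = (status & 0x80) != 0;
        return "killed by signal " + termSignal(status)
            + (core ? " (core dumped)" : "");
    }

    public static void main(String[] args) {
        // Status taken from the log line "pid 2515 status 0x97f".
        System.out.println(describe(0x97f));
    }
}
```

Under that assumption 0x97f decodes to "stopped by signal 9" (9 is
SIGKILL on Linux), i.e. a stop report rather than a termination, which
is consistent with stopped() being called instead of terminated().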


