RHEL-4 system
testRefreshZombie(frysk.proc.TestRefresh)junit.framework.AssertionFailedError: event loop run explictly stopped (waiting for ack)
   at _ZN4java4lang11VMThrowable16fillInStackTraceEPNS0_9ThrowableE (/usr/lib/libgcj.so.6)
   at _ZN4java4lang9Throwable16fillInStackTraceEv (/usr/lib/libgcj.so.6)
   at _ZN4java4lang9ThrowableC1EPNS0_6StringE (/usr/lib/libgcj.so.6)
   at _ZN4java4lang5ErrorC1EPNS0_6StringE (/usr/lib/libgcj.so.6)
   at 0x080afcfc (Unknown Source)
   at 0x080aedc1 (Unknown Source)
   at 0x080aecc0 (Unknown Source)
   at 0x0807fbbe (Unknown Source)
   at 0x0807fa8a (Unknown Source)
   at 0x0807f2f5 (Unknown Source)
   at 0x0807f395 (Unknown Source)
   at 0x08082618 (Unknown Source)
   at ffi_call_SYSV (/usr/lib/libgcj.so.6)
   at ffi_call (/usr/lib/libgcj.so.6)
   at _Z18_Jv_CallAnyMethodAPN4java4lang6ObjectEPNS0_5ClassEP10_Jv_MethodbbP6JArrayIS4_EP6jvalueSB_bS4_ (/usr/lib/libgcj.so.6)
   at _Z18_Jv_CallAnyMethodAPN4java4lang6ObjectEPNS0_5ClassEP10_Jv_MethodbP6JArrayIS4_EPS7_IS2_ES4_ (/usr/lib/libgcj.so.6)
   at _ZN4java4lang7reflect6Method6invokeEPNS0_6ObjectEP6JArrayIS4_E (/usr/lib/libgcj.so.6)
   at 0x080ae17e (Unknown Source)
   at 0x080adf96 (Unknown Source)
   at 0x080afcb4 (Unknown Source)
   at 0x080ae9cd (Unknown Source)
   at 0x080ae937 (Unknown Source)
   at 0x080adf64 (Unknown Source)
   at 0x080add7d (Unknown Source)
   at 0x080add3f (Unknown Source)
   at 0x080add7d (Unknown Source)
   at 0x080add3f (Unknown Source)
   at 0x080ac3b1 (Unknown Source)
   at 0x080ac34e (Unknown Source)
   at 0x0809a307 (Unknown Source)
   at 0x0807939e (Unknown Source)
   at _ZN3gnu4java4lang10MainThread9call_mainEv (/usr/lib/libgcj.so.6)
   at _ZN3gnu4java4lang10MainThread3runEv (/usr/lib/libgcj.so.6)
   at _Z13_Jv_ThreadRunPN4java4lang6ThreadE (/usr/lib/libgcj.so.6)
   at _Z11_Jv_RunMainP14_Jv_VMInitArgsPN4java4lang5ClassEPKciPS6_b (/usr/lib/libgcj.so.6)
   at _Z11_Jv_RunMainPN4java4lang5ClassEPKciPS4_b (/usr/lib/libgcj.so.6)
   at JvRunMain (/usr/lib/libgcj.so.6)
   at 0x08079344 (Unknown Source)
   at __libc_start_main (/lib/tls/libc.so.6)
   at 0x08079289 (Unknown Source)
- stracing TestRunner makes the problem go away
- enabling logging makes the problem go away
- on a mono-processor, this rarely happens
- on an SMP, this always happens

With SMP, after a fork(), both the parent and the child run freely and in parallel. On a mono-processor, only one of them runs at a time. This strongly suggests some sort of race condition. The other possibility is that the child is being hit by a signal while it is still trying to find its feet.
The system call sequence is: vfork -> fork -> exec. This is from just adding print statements:

  Running testRefreshZombie(frysk.proc.TestRefresh) ...zombie test started
  program /home/cagney/native/frysk-core/frysk/pkglibexecdir/funit-child
  v 24952 pid 24953 status 0x0
  child pid 24953
  zombie created
  FAIL
  junit.framework.AssertionFailedError: event loop run explictly stopped (waiting for ack)

<<program ..>> was printed by frysk.sys.Fork.spawn just before the exec call, which strongly suggests that the call sequence (vfork -> fork -> exec) succeeded, but the final exec killed the entire process.
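The vfork -> fork -> exec sequence can be sketched in C roughly as follows. This is only an illustration of the shape of the sequence, not the actual frysk.sys.Fork.spawn code: the name spawn_detached and the pipe used to report the grandchild's pid are hypothetical, and the first step uses fork() in place of vfork(), because POSIX leaves calling fork() from a vfork() child undefined.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` as a grandchild of the caller.
   Returns the grandchild's pid, or -1 on failure. */
pid_t spawn_detached(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t v = fork();               /* the sequence in this report uses vfork() here */
    if (v < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (v == 0) {
        /* Intermediate process: fork the real child, report its pid, exit. */
        close(fds[0]);
        pid_t pid = fork();
        if (pid == 0) {
            close(fds[1]);
            execv(path, argv);
            _exit(127);             /* only reached if the exec failed */
        }
        if (write(fds[1], &pid, sizeof pid) != (ssize_t) sizeof pid)
            _exit(1);
        _exit(pid < 0 ? 1 : 0);
    }

    /* Original process: collect the grandchild's pid, then reap the
       intermediate process so that it does not linger as a zombie. */
    close(fds[1]);
    pid_t grandchild = -1;
    ssize_t n = read(fds[0], &grandchild, sizeof grandchild);
    close(fds[0]);

    int status;
    if (waitpid(v, &status, 0) < 0)
        return -1;
    if (n != (ssize_t) sizeof grandchild)
        return -1;
    return grandchild;
}
```

Because the intermediate process exits straight away, the exec'd program is no longer a child of the caller, so the caller learns its pid but cannot wait on it directly.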
Index: frysk-imports/tests/ChangeLog
2006-02-23  Andrew Cagney  <cagney@redhat.com>

	* Makefile.am (vfork_exec_vfork_exec_SOURCES, noinst_PROGRAMS)
	(TESTS): Add vfork-exec/vfork-exec.c.
	* vfork-exec/vfork-exec.c: New test.
Index: frysk-core/frysk/proc/ChangeLog
This detects the problem, and cleans up the mess; it doesn't yet fix it.
2006-03-07  Andrew Cagney  <cagney@redhat.com>

	* TestLib.java: Check for still pending signals.

Index: frysk-sys/frysk/sys/ChangeLog
2006-03-07  Andrew Cagney  <cagney@redhat.com>

	* Poll.java (poll): Add description.
	* SigSet.java, cni/SigSet.cxx: Change all void methods to return
	this SigSet.

2006-03-06  Andrew Cagney  <cagney@redhat.com>

	* SigSet.java (getPending, suspend, blockProcMask)
	(unblockProcMask, setProcMask, getProcMask): Add.
	* cni/SigSet.cxx: Ditto.
	* TestSigSet.java (testProcMask): New test.
	* cni/SigSet.hxx, cni/SigSet.cxx, SigSet.java, TestSigSet.java:
	New files.
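For illustration, a check for still-pending signals can be sketched at the C level with sigpending(). This is a minimal sketch under assumptions, not the code in TestLib.java or cni/SigSet.cxx: the function names are hypothetical, and SIGUSR1 is just an example signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: 1 if signo is blocked and pending for the caller,
   0 if it is not, -1 on error. */
int signal_is_pending(int signo)
{
    assert(signo > 0);

    sigset_t pending;
    if (sigpending(&pending) < 0)
        return -1;
    return sigismember(&pending, signo);
}

/* Hypothetical demo: leave a signal dangling, detect it, then clean it up.
   Returns 1 if the signal was seen pending and was then drained. */
int demo_dangling_signal(void)
{
    sigset_t block, old;
    int got = 0;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;

    raise(SIGUSR1);                         /* blocked, so it stays pending */
    int before = signal_is_pending(SIGUSR1);

    if (before == 1)
        sigwait(&block, &got);              /* consume the pending signal */
    int after = signal_is_pending(SIGUSR1);

    sigprocmask(SIG_SETMASK, &old, NULL);   /* restore the original mask */
    return before == 1 && after == 0 && got == SIGUSR1;
}
```

A test harness could run a check like signal_is_pending() at teardown to detect a signal that was sent but never consumed, and drain it so that it cannot leak into the next test.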
This fixes TestRefresh where there was a possibly dangling signal:

Index: frysk-core/frysk/proc/ChangeLog
2006-03-07  Andrew Cagney  <cagney@redhat.com>

	* TestRefresh.java (testExitLoosesChild): Replace
	testExitLoosesAllChildren, only create one child process.
The remaining cases of dangling signals have been turned into separate bugs.