Bug 2359 - testRefreshZombie(frysk.proc.TestRefresh)junit.framework.AssertionFailedError: event loop run explictly stopped (waiting for ack)
Status: NEW
Alias: None
Product: frysk
Classification: Unclassified
Component: general
Version: unspecified
Importance: P1 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on: 2430 2431 2432
Blocks: 2081
Reported: 2006-02-19 19:30 UTC by Andrew Cagney
Modified: 2008-10-21 21:40 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description Andrew Cagney 2006-02-19 19:30:57 UTC
RHEL-4 system

testRefreshZombie(frysk.proc.TestRefresh)junit.framework.AssertionFailedError:
event loop run explictly stopped (waiting for ack)
   at _ZN4java4lang11VMThrowable16fillInStackTraceEPNS0_9ThrowableE (/usr/lib/libgcj.so.6)
   at _ZN4java4lang9Throwable16fillInStackTraceEv (/usr/lib/libgcj.so.6)
   at _ZN4java4lang9ThrowableC1EPNS0_6StringE (/usr/lib/libgcj.so.6)
   at _ZN4java4lang5ErrorC1EPNS0_6StringE (/usr/lib/libgcj.so.6)
   at 0x080afcfc (Unknown Source)
   at 0x080aedc1 (Unknown Source)
   at 0x080aecc0 (Unknown Source)
   at 0x0807fbbe (Unknown Source)
   at 0x0807fa8a (Unknown Source)
   at 0x0807f2f5 (Unknown Source)
   at 0x0807f395 (Unknown Source)
   at 0x08082618 (Unknown Source)
   at ffi_call_SYSV (/usr/lib/libgcj.so.6)
   at ffi_call (/usr/lib/libgcj.so.6)
   at _Z18_Jv_CallAnyMethodAPN4java4lang6ObjectEPNS0_5ClassEP10_Jv_MethodbbP6JArrayIS4_EP6jvalueSB_bS4_ (/usr/lib/libgcj.so.6)
   at _Z18_Jv_CallAnyMethodAPN4java4lang6ObjectEPNS0_5ClassEP10_Jv_MethodbP6JArrayIS4_EPS7_IS2_ES4_ (/usr/lib/libgcj.so.6)
   at _ZN4java4lang7reflect6Method6invokeEPNS0_6ObjectEP6JArrayIS4_E (/usr/lib/libgcj.so.6)
   at 0x080ae17e (Unknown Source)
   at 0x080adf96 (Unknown Source)
   at 0x080afcb4 (Unknown Source)
   at 0x080ae9cd (Unknown Source)
   at 0x080ae937 (Unknown Source)
   at 0x080adf64 (Unknown Source)
   at 0x080add7d (Unknown Source)
   at 0x080add3f (Unknown Source)
   at 0x080add7d (Unknown Source)
   at 0x080add3f (Unknown Source)
   at 0x080ac3b1 (Unknown Source)
   at 0x080ac34e (Unknown Source)
   at 0x0809a307 (Unknown Source)
   at 0x0807939e (Unknown Source)
   at _ZN3gnu4java4lang10MainThread9call_mainEv (/usr/lib/libgcj.so.6)
   at _ZN3gnu4java4lang10MainThread3runEv (/usr/lib/libgcj.so.6)
   at _Z13_Jv_ThreadRunPN4java4lang6ThreadE (/usr/lib/libgcj.so.6)
   at _Z11_Jv_RunMainP14_Jv_VMInitArgsPN4java4lang5ClassEPKciPS6_b (/usr/lib/libgcj.so.6)
   at _Z11_Jv_RunMainPN4java4lang5ClassEPKciPS4_b (/usr/lib/libgcj.so.6)
   at JvRunMain (/usr/lib/libgcj.so.6)
   at 0x08079344 (Unknown Source)
   at __libc_start_main (/lib/tls/libc.so.6)
   at 0x08079289 (Unknown Source)
Comment 1 Andrew Cagney 2006-02-19 21:58:49 UTC
- stracing TestRunner makes the problem go away
- enabling logging makes the problem go away
Comment 2 Andrew Cagney 2006-02-19 22:02:28 UTC
- on a mono-processor, this rarely happens
- on an SMP, this always happens

With SMP, after a fork(), both the parent and child will run free and in
parallel.  On a mono-processor, only one will run -> strongly suggests some sort
of race condition.

The other possibility is that the child is being hit by a signal while it is
trying to find its feet.
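
To illustrate the kind of race suspected here, below is a minimal sketch in Python (standard os and signal modules, Linux only). It is not frysk code: fork_and_wait_for_ack, the choice of SIGUSR1 as the ack, and the timeout are all assumptions made for the example. It shows one way a parent can avoid losing a child's ack when both run in parallel after fork(): block the ack signal before forking, so that an ack sent early stays pending until the parent waits for it. It assumes a single-threaded caller.

```python
import os
import signal


def fork_and_wait_for_ack(timeout=5.0, ack=signal.SIGUSR1):
    """Fork a child that acks with a signal; return True if the ack arrived.

    Hypothetical helper for illustration only; not part of frysk.
    """
    # Block the ack *before* forking. If the child wins the race and
    # signals first, the ack stays pending instead of being delivered
    # before the parent has started waiting.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {ack})
    try:
        parent = os.getpid()
        child = os.fork()
        if child == 0:
            # Child: send the ack, then leave without running any
            # of the parent's cleanup code.
            try:
                os.kill(parent, ack)
            finally:
                os._exit(0)
        # Parent: wait for the ack, giving up after `timeout` seconds.
        info = signal.sigtimedwait({ack}, timeout)
        os.waitpid(child, 0)  # reap the child so no zombie is left
        return info is not None and info.si_pid == child
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

Without the initial block, the outcome depends on which process is scheduled first, which would match a failure that shows up on SMP but rarely on a mono-processor.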
Comment 3 Andrew Cagney 2006-02-19 22:04:26 UTC
The system call sequence is: vfork -> fork -> exec.  This is from just adding
print statements:

Running testRefreshZombie(frysk.proc.TestRefresh) ...zombie test started
program /home/cagney/native/frysk-core/frysk/pkglibexecdir/funit-child
v 24952 pid 24953 status 0x0
child pid 24953
zombie created
FAIL
  junit.framework.AssertionFailedError: event loop run explictly stopped
(waiting for ack)

<<program ..>> was printed by frysk.sys.Fork.spawn just before the exec call,
which strongly suggests that the call sequence (vfork -> fork -> exec)
succeeded, but the final exec killed the entire process.
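
For readers unfamiliar with the pattern, the call sequence can be sketched as follows in Python (standard os module). This is an illustration, not frysk.sys.Fork.spawn: Python has no vfork, so a plain fork stands in for it, and spawn_via_intermediate and the pipe used to report the pid are assumptions of the sketch. The intermediate child forks the real child, reports its pid, and exits; the caller reaps the intermediate and gets back the grandchild's pid together with the intermediate's wait status.

```python
import os
import sys


def spawn_via_intermediate(argv):
    """fork -> fork -> exec; return (grandchild_pid, intermediate_status).

    Hypothetical helper for illustration only. The first fork stands in
    for the vfork in the sequence described above.
    """
    read_fd, write_fd = os.pipe()
    middle = os.fork()  # stand-in for vfork
    if middle == 0:
        status = 1  # exit status if anything below fails
        try:
            os.close(read_fd)
            grandchild = os.fork()
            if grandchild == 0:
                status = 127  # only used if exec fails
                os.close(write_fd)
                os.execv(argv[0], argv)  # returns only by raising
            # Intermediate child: report the grandchild's pid and exit.
            os.write(write_fd, str(grandchild).encode())
            status = 0
        finally:
            os._exit(status)
    # Original process: read the pid, then reap the intermediate child.
    os.close(write_fd)
    data = os.read(read_fd, 32)
    os.close(read_fd)
    _, status = os.waitpid(middle, 0)
    return (int(data) if data else None), status


if __name__ == "__main__":
    pid, status = spawn_via_intermediate([sys.executable, "-c", "pass"])
    print("child pid", pid, "status", hex(status))
```

Because the intermediate process exits straight away, the grandchild is no longer a direct child of the caller.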
Comment 4 Andrew Cagney 2006-02-23 20:52:06 UTC
Index: frysk-imports/tests/ChangeLog
2006-02-23  Andrew Cagney  <cagney@redhat.com>

        * Makefile.am (vfork_exec_vfork_exec_SOURCES, noinst_PROGRAMS)
        (TESTS): Add vfork-exec/vfork-exec.c.
        * vfork-exec/vfork-exec.c: New test.
Comment 5 Andrew Cagney 2006-03-07 21:37:18 UTC
This detects the problem, and cleans up the mess; it doesn't yet fix it.

Index: frysk-core/frysk/proc/ChangeLog

2006-03-07  Andrew Cagney  <cagney@redhat.com>

        * TestLib.java: Check for still pending signals.

Index: frysk-sys/frysk/sys/ChangeLog
2006-03-07  Andrew Cagney  <cagney@redhat.com>

        * Poll.java (poll): Add description.
        * SigSet.java, cni/SigSet.cxx: Change all void methods to return
        this SigSet.

2006-03-06  Andrew Cagney  <cagney@redhat.com>

        * SigSet.java (getPending, suspend, blockProcMask)
        (unblockProcMask, setProcMask, getProcMask): Add.
        * cni/SigSet.cxx: Ditto.
        * TestSigSet.java (testProcMask): New test.

        * cni/SigSet.hxx, cni/SigSet.cxx, SigSet.java, TestSigSet.java:
        New files.
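
A check for still-pending signals can be sketched with Python's standard signal module, as below. This is not the TestLib.java or SigSet code: leftover_signal_check and the use of SIGUSR1 are assumptions for illustration. The sketch blocks a signal, sends it to the current thread so that it stays pending, detects it with sigpending(), and then consumes it so that it cannot leak into whatever runs next.

```python
import signal
import threading


def leftover_signal_check(sig=signal.SIGUSR1):
    """Return (was_pending, still_pending_after_cleanup) for `sig`.

    Hypothetical helper for illustration only; not part of frysk.
    """
    # Block the signal so that it is queued rather than delivered.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        # Simulate a "dangling" signal left behind by an earlier test.
        signal.pthread_kill(threading.get_ident(), sig)

        # Detect: the blocked signal shows up in the pending set.
        was_pending = sig in signal.sigpending()

        # Clean up: consume the signal while it is still blocked.
        if was_pending:
            signal.sigwait({sig})

        return was_pending, sig in signal.sigpending()
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

Running such a check between test cases would report a leftover signal against the test that caused it, rather than letting it disturb a later one.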
Comment 6 Andrew Cagney 2006-03-07 22:12:23 UTC
This fixes TestRefresh where there was a possibly dangling signal:

Index: frysk-core/frysk/proc/ChangeLog
2006-03-07  Andrew Cagney  <cagney@redhat.com>

        * TestRefresh.java (testExitLoosesChild): Replace
        testExitLoosesAllChildren, only create one child process.
Comment 7 Andrew Cagney 2006-03-07 22:15:20 UTC
The remaining cases of dangling signals have been turned into separate bugs.