4389 – ptrace in event-loop causes TestBreakpoints inferior to sig-seg-v

Summary: ptrace in event-loop causes TestBreakpoints inferior to sig-seg-v

Status:	RESOLVED FIXED

Alias:	None

Product:	frysk
Classification:	Unclassified
Component:	general
Version:	unspecified

Importance:	P2 normal
Target Milestone:	---
Assignee:	Andrew Cagney

URL:
Keywords:

Depends on:
Blocks:	1522

Reported:	2007-04-17 22:54 UTC by Andrew Cagney
Modified:	2007-04-19 17:03 UTC
CC List:	0 users

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description Andrew Cagney 2007-04-17 22:54:54 UTC

Here is a trace of TestBreakpoints under a merged ptrace/event-loop:


FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1569,tid=1569,state=running}
notifyCodeBreakpoint(134,514,608)

17-Apr-07 5:36:20 PM frysk.proc.LinuxPtraceTaskState$Running handleSignaledEvent
FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1569,tid=1569,state=running}
handleSignaledEvent, signal: 5

17-Apr-07 5:36:20 PM frysk.proc.Task notifySignaled
FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1569,tid=1569,state=running}
notifySignaled(int)

17-Apr-07 5:36:20 PM frysk.proc.LinuxPtraceTask sendContinue
FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1569,tid=1569,state=running}
sendContinue

17-Apr-07 5:36:20 PM frysk.proc.LinuxPtraceWaitBuilder getTask
FINE: {TaskId,1569} stopped

17-Apr-07 5:36:20 PM frysk.proc.Host get
FINE: {frysk.proc.LinuxPtraceHost@17f990,state=running} get TaskId

17-Apr-07 5:36:20 PM frysk.proc.LinuxPtraceTaskState$Running handleSignaledEvent
FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1569,tid=1569,state=running}
handleSignaledEvent, signal: 11

17-Apr-07 5:36:20 PM frysk.proc.Task notifySignaled
FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1569,tid=1569,state=running}
notifySignaled(int)

17-Apr-07 5:36:20 PM frysk.proc.LinuxPtraceTask sendContinue
FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1569,tid=1569,state=running}
sendContinue

17-Apr-07 5:36:20 PM frysk.proc.LinuxPtraceWaitBuilder getTask
FINE: {TaskId,1569} exitEvent


and here is the same trace, when the ptrace server is being used:


FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1563,tid=1563,state=running}
notifyCodeBreakpoint(134,514,608)

17-Apr-07 5:36:11 PM frysk.proc.LinuxPtraceTaskState$Running handleSignaledEvent
FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1563,tid=1563,state=running}
handleSignaledEvent, signal: 5

17-Apr-07 5:36:11 PM frysk.proc.Task notifySignaled
FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1563,tid=1563,state=running}
notifySignaled(int)

17-Apr-07 5:36:11 PM frysk.proc.LinuxPtraceTask sendContinue
FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1563,tid=1563,state=running}
sendContinue

17-Apr-07 5:36:11 PM frysk.proc.LinuxPtraceWaitBuilder getTask
FINE: {TaskId,1563} stopped

17-Apr-07 5:36:11 PM frysk.proc.Host get
FINE: {frysk.proc.LinuxPtraceHost@17f990,state=running} get TaskId

17-Apr-07 5:36:11 PM frysk.proc.LinuxPtraceTaskState$Running handleTrappedEvent
FINE: {frysk.proc.LinuxPtraceTask@37eee0,pid=1563,tid=1563,state=running}
handleTrappedEvent


Notice how, in the first trace, the next event after the breakpoint has been
processed is signal 11 (SIGSEGV), while in the second trace another breakpoint
(a trapped event) is seen.

Comment 1 Andrew Cagney 2007-04-17 23:34:45 UTC

Dropped the tag cagney-20070413-pt-segv-branch.

Comment 2 Andrew Cagney 2007-04-17 23:46:04 UTC

Here are the dying moments of the trace; pt=7 is PT_CONTINUE:

poke space=frysk.sys.Ptrace$AddressSpace@3c7970:USR pid=3664 index=114 value=f0
ptrace: pt=3 pid=3664 addr=0x114 data=0x0 -> 0xffff0ff0
ptrace: pt=6 pid=3664 addr=0x114 data=0xffff0ff0 -> 0x0
poke space=frysk.sys.Ptrace$AddressSpace@3c7970:USR pid=3664 index=115 value=f
ptrace: pt=3 pid=3664 addr=0x114 data=0x0 -> 0xffff0ff0
ptrace: pt=6 pid=3664 addr=0x114 data=0xffff0ff0 -> 0x0
poke space=frysk.sys.Ptrace$AddressSpace@3c7970:USR pid=3664 index=116 value=ff
ptrace: pt=3 pid=3664 addr=0x114 data=0x0 -> 0xffff0ff0
ptrace: pt=6 pid=3664 addr=0x114 data=0xffff0ff0 -> 0x0
poke space=frysk.sys.Ptrace$AddressSpace@3c7970:USR pid=3664 index=117 value=ff
ptrace: pt=3 pid=3664 addr=0x114 data=0x0 -> 0xffff0ff0
ptrace: pt=6 pid=3664 addr=0x114 data=0xffff0ff0 -> 0x0
ptrace: pt=12 pid=3664 addr=0x0 data=0x3c2378 -> 0x0
ptrace: pt=7 pid=3664 addr=0x0 data=0x5 -> 0x0
ptrace: pt=7 pid=3664 addr=0x0 data=0xb -> 0x0
ptrace: pt=16897 pid=3664 addr=0x0 data=0xbfb16ad8 -> 0x0
ptrace: pt=7 pid=3664 addr=0x0 data=0xb -> 0x0

and it's sending PT_CONT of signal 5/SIGTRAP to the process.  Per other
discussion, if in a signal handler this can kill the inferior.
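
For readers following the raw numbers in the trace above, here is a small decoding sketch. It is not part of frysk; the class and method names are hypothetical, and the numeric values are assumed to be the Linux x86 ones from <sys/ptrace.h> and <signal.h> (3 = PTRACE_PEEKUSER, 6 = PTRACE_POKEUSER, 7 = PTRACE_CONT, 12 = PTRACE_GETREGS, 16897 = 0x4201 = PTRACE_GETEVENTMSG; signal 5 = SIGTRAP, 11 = SIGSEGV).

```java
import java.util.Map;

// Illustrative helper only (hypothetical name, not frysk code): it names the
// request and signal numbers seen in the "ptrace: pt=..." trace lines.
final class PtraceTraceDecoder {
    // Assumed Linux x86 request numbers from <sys/ptrace.h>.
    private static final Map<Integer, String> REQUESTS = Map.of(
        3, "PTRACE_PEEKUSER",
        6, "PTRACE_POKEUSER",
        7, "PTRACE_CONT",
        12, "PTRACE_GETREGS",
        0x4201, "PTRACE_GETEVENTMSG");

    // Only the two signals that appear in this report.
    private static final Map<Integer, String> SIGNALS = Map.of(
        5, "SIGTRAP",
        11, "SIGSEGV");

    static String requestName(int pt) {
        return REQUESTS.getOrDefault(pt, "request " + pt);
    }

    // For PTRACE_CONT the data argument is the signal to deliver (0 = none);
    // for every other request only the request name is reported.
    static String describe(int pt, long data) {
        String name = requestName(pt);
        if (pt == 7) {
            String sig = data == 0
                ? "no signal"
                : SIGNALS.getOrDefault((int) data, "signal " + data);
            return name + " delivering " + sig;
        }
        return name;
    }

    public static void main(String[] args) {
        // The last four requests of the trace, in order.
        System.out.println(describe(7, 0x5));
        System.out.println(describe(7, 0xb));
        System.out.println(describe(16897, 0xbfb16ad8L));
        System.out.println(describe(7, 0xb));
    }
}
```

Read this way, the tail of the trace is a continue that passes SIGTRAP on to the inferior, followed by two continues that pass SIGSEGV on.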

Comment 3 Mark Wielaard 2007-04-18 12:34:56 UTC

(In reply to comment #2)
> and it's sending PT_CONT of signal 5/SIGTRAP to the process.  Per other
> discussion, if in a signal handler this can kill the inferior.

Not in this case, since funit-breakpoints uses the SA_NODEFER workaround. The
test does check that SIGTRAPs and segfaults not caused by frysk itself are
propagated correctly. See bug #3997 for a discussion about this.

I'll check out the branch and test against the simpler unit tests first:
TestTaskObserverCode, which only sets a breakpoint, runs, and checks that it got
hit (maybe expand it to then continue running again);
TestTaskObserverInstruction; TestTaskObserverInstructionSigReturn; and the
combined TestTaskObserverInstructionAndCode. If those work then we can proceed
with the all-singing, all-dancing TestBreakpoints, which is really a combination
of all those tests.

Comment 4 Mark Wielaard 2007-04-18 13:10:24 UTC

On cagney-20070413-pt-segv-branch 9FC6 x86_64, kernel 2.6.20-1.2944:

testCode(frysk.proc.TestTaskObserverCode)
testInstruction(frysk.proc.TestTaskObserverInstruction)
both PASS

testStepSigReturn(frysk.proc.TestTaskObserverInstructionSigReturn)
testInstructionAndCode(frysk.proc.TestTaskObserverInstructionAndCode)
both FAIL

Comment 5 Andrew Cagney 2007-04-18 18:58:33 UTC

Found one issue:

2007-04-18  Andrew Cagney  <cagney@redhat.com>

        * RegisterSetByteBuffer.java (GetRegs.execute): Call
        registerSet.set, not get.

Comment 6 Mark Wielaard 2007-04-19 13:55:44 UTC

That issue seems to have been it.

frysk.proc.TestTaskObserverCode
frysk.proc.TestTaskObserverInstruction
frysk.proc.TestTaskObserverInstructionSigReturn
frysk.proc.TestTaskObserverInstructionAndCode
frysk.proc.TestBreakpoints

All PASS for me now.

Comment 7 Andrew Cagney 2007-04-19 15:21:05 UTC

We're missing a testcase for this root-cause failure.

Comment 8 Andrew Cagney 2007-04-19 18:03:20 UTC

2007-04-19  Andrew Cagney  <cagney@redhat.com>

        * TestByteBuffer.java (memoryByteBuffer, registerByteBuffer): Add.
        (setUp): Set.  For registerByteBuffer, only when REGS is valid.
        (verifySlice): New.
        (testSliceAddressSpace, testSliceRegisterSet): Use verifySlice.
        (verifyModify, testModifyRegisterSet, testModifyAddressSpace): New.
        (AsyncModify, testAsyncRegisterSet, testAsyncAddressSpace): New.
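
As a rough illustration of the kind of round-trip check the modify tests above presumably make, here is a minimal sketch. It is not the actual TestByteBuffer code: WordStoreSketch and its methods are hypothetical, an in-memory int array stands in for the inferior's USR area, and the sketch assumes 32-bit little-endian words with the trace's index values read as hex. It also mirrors the pattern visible in Comment 2, where each single-byte poke becomes a word peek (pt=3) followed by a word poke (pt=6) at the aligned address.

```java
// Hypothetical stand-in for a word-granular ptrace area; not frysk code.
final class WordStoreSketch {
    private final int[] words;   // each element models one 32-bit word

    WordStoreSketch(int sizeInBytes) {
        words = new int[sizeInBytes / 4];
    }

    // Models a word read such as PTRACE_PEEKUSER (addr must be word-aligned).
    int peekWord(int addr) {
        return words[addr / 4];
    }

    // Models a word write such as PTRACE_POKEUSER (addr must be word-aligned).
    void pokeWord(int addr, int value) {
        words[addr / 4] = value;
    }

    // Writes one byte as a read-modify-write of the containing word,
    // assuming little-endian byte order.
    void pokeByte(int index, int value) {
        int addr = index & ~3;            // aligned word address
        int shift = (index & 3) * 8;      // bit offset of the byte in the word
        int word = peekWord(addr);
        word = (word & ~(0xff << shift)) | ((value & 0xff) << shift);
        pokeWord(addr, word);
    }

    // Reads one byte back out of the containing word.
    int peekByte(int index) {
        return (peekWord(index & ~3) >>> ((index & 3) * 8)) & 0xff;
    }

    public static void main(String[] args) {
        WordStoreSketch store = new WordStoreSketch(0x200);
        // The four byte pokes from the trace in Comment 2.
        store.pokeByte(0x114, 0xf0);
        store.pokeByte(0x115, 0x0f);
        store.pokeByte(0x116, 0xff);
        store.pokeByte(0x117, 0xff);
        // Round trip: the modification must be visible on read-back.
        System.out.println(Integer.toHexString(store.peekWord(0x114)));
        System.out.println(Integer.toHexString(store.peekByte(0x115)));
    }
}
```

The point of a round-trip test like this is that a write path which silently fails to store its data (for example by calling the wrong one of a get/set pair) shows up immediately as a mismatch on read-back.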