This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


[Bug runtime/10427] Multiple tracepoint OKs!

From: "jistone at redhat dot com" <sourceware-bugzilla at sourceware dot org>
To: systemtap at sources dot redhat dot com
Date: 22 Jul 2009 18:04:49 -0000
Subject: [Bug runtime/10427] Multiple tracepoint OKs!
References: <20090722124254.10427.mjw@redhat.com>
Reply-to: sourceware-bugzilla at sourceware dot org

------- Additional Comments From jistone at redhat dot com  2009-07-22 18:04 -------
(In reply to comment #2)
> O, I think I see what happens. Probably while we are doing the probe,
> printing the message and calling exit() some other tracepoint is hit, hits
> is already > 100 and so the probe gets triggered again. So this is probably
> just a testsuite bug. Although a slightly confusing one.

Yeah, there's an exit race condition.  The probe flow looks something like this:

1. If session_state != RUNNING, skip the probe
2. Lock the hits global
3. Increment hits
4. If hits >= 100, call exit() (which sets session_state = STOPPING)
5. Unlock hits

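Steps 2-5 run under the lock, so they are atomic with respect to each other,
but other CPUs may already have gotten past #1 before exit() is called.  Those
in-flight probes still take the lock, increment hits, see hits >= 100, and
take the exit() branch again.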
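
To make the interleaving concrete, here is a minimal sketch of the flow above
as a deterministic Python model.  It is only an illustration: Session, probe,
and run_interleaved are hypothetical names, not SystemTap runtime code, and
the generator pause stands in for a second CPU that has already passed step 1.

```python
# Hypothetical model of the probe flow; names are illustrative only and
# do not correspond to real SystemTap runtime symbols.
RUNNING, STOPPING = "RUNNING", "STOPPING"


class Session:
    def __init__(self, hits):
        self.state = RUNNING
        self.hits = hits
        self.locked = False
        self.oks = 0          # how many times the "OK" branch ran


def probe(s):
    """One probe hit, written as a generator so the caller can pause it
    between the unlocked state check and taking the lock."""
    # Step 1: unlocked check of session_state.
    if s.state != RUNNING:
        return
    yield "passed-check"
    # Step 2: lock the hits global (this model never contends on it).
    assert not s.locked
    s.locked = True
    # Step 3: increment hits.
    s.hits += 1
    # Step 4: at the threshold, report OK and "call exit()".
    if s.hits >= 100:
        s.oks += 1
        s.state = STOPPING
    # Step 5: unlock.
    s.locked = False


def run_interleaved(s):
    cpu_a, cpu_b = probe(s), probe(s)
    next(cpu_a)               # CPU A passes step 1
    next(cpu_b)               # CPU B passes step 1 before anyone exits
    for cpu in (cpu_a, cpu_b):
        for _ in cpu:         # each CPU now runs steps 2-5 to completion
            pass
    return s


s = run_interleaved(Session(hits=99))
print(s.hits, s.oks, s.state)   # prints: 101 2 STOPPING
```

Both modelled CPUs take the threshold branch, which matches the duplicate OK
in the report.  A probe that starts after exit() is skipped at step 1 as
expected.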

Reversing #1 and #2 (checking session_state only after taking the lock) would
solve the race for this particular script, but in general probe handlers need
not share any locks.  That's why we don't guarantee that exit() will force
in-flight probes to skip.
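
As a rough sketch of that reordering, the Python model below checks the state
only after taking the lock.  Session, probe_checked_under_lock, and hammer are
hypothetical names for illustration, and a single shared threading.Lock is
assumed, which is exactly the assumption that does not hold for arbitrary
probe handlers.

```python
import threading

# Hypothetical model; not SystemTap runtime code.
RUNNING, STOPPING = "RUNNING", "STOPPING"


class Session:
    def __init__(self):
        self.state = RUNNING
        self.hits = 0
        self.oks = 0
        self.lock = threading.Lock()   # the one lock every handler shares


def probe_checked_under_lock(s):
    with s.lock:                       # old step 2 now comes first
        if s.state != RUNNING:         # old step 1, now inside the lock
            return
        s.hits += 1
        if s.hits >= 100:
            s.oks += 1
            s.state = STOPPING         # stand-in for exit()


def hammer(s, n):
    for _ in range(n):
        probe_checked_under_lock(s)


session = Session()
threads = [threading.Thread(target=hammer, args=(session, 500))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(session.hits, session.oks)       # prints: 100 1
```

Because the state check and the exit() stand-in are serialized by the same
lock, exactly one handler can reach the threshold branch.  Two handlers that
touch no common global would have no such lock to order them.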

Your fix looks fine.

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=10427


References:
- [Bug runtime/10427] New: Multiple tracepoint OKs!
  - From: mjw at redhat dot com
