This is the mail archive of the cygwin@cygwin.com mailing list for the Cygwin project.



RE: 1.1.3 and upwards: apparent bug with pthread_cond_wait() and/or signal()




> -----Original Message-----
> From: Michael Beach [mailto:michaelb@ieee.org] 
> Sent: Thursday, May 02, 2002 2:16 AM
> >
> > thread - lock
> > thread - state=run
> > thread - signal
> > main - lock
> > main - test state (passes)
> 
> calls pthread_cond_wait().

Doh. I need some real serious sleep. 

> "Linux made me do it".

:].
 
> However if you're not expecting high bandwidth, if you could point me
> at a document or whatnot that explains how to set up a development
> environment I'd be willing to have a go.

There are very few developers contributing to the pthreads code. Right
now I'm swamped, and a new contributor has offered some high quality
patches. http://www.cygwin.com/cvs.html explains how to grab the current
source. You could also just tick the 'src' checkbox beside the cygwin
package in setup.exe to have it download a source snapshot.
 
> Sure. The quick'n'dirty pthreads calls were only so I didn't have to
> post half of our source tree in order to illustrate the problem with
> an example that actually compiles. If you're serious about wanting to
> run it, give me a shout and I'll give you a version with error
> handling.

I can duplicate the hang. What appears to be happening is that signals
sent from one thread while another thread is entering (or exiting?) the
wait routine get dropped.
The main signal() routine finds 0 waiting threads (see thread.cc:495)
when it is called, so it does nothing.

A - main thread
B - new thread
L - lock
W - wait
S - signal
J - join
U - unlock

Fails
A B
L
W
  L
  S (1)
  W
S <-- is dropped
U
  U
J
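
For reference, the sequence above can be written as a small C program. This is only a sketch: it is not the original test case (which is not shown in this message), and the names turn, thread_b and run_handshake are made up for illustration. It assumes both threads share one mutex and one condition variable, and it wraps each wait in a predicate loop.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared state; none of these names come from the original test. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int turn = 0;   /* 0: A waiting for B, 1: B has signalled, 2: A has replied */

/* Thread B: L, S (1), W, U */
static void *thread_b(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);             /* L: succeeds once A is in W */
    turn = 1;
    pthread_cond_signal(&cond);            /* S (1): wakes A */
    while (turn != 2)                      /* W: wait for A's reply */
        pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);           /* U */
    return NULL;
}

/* Thread A (the caller): L, W, S, U, J. Returns 0 on success, -1 on error. */
int run_handshake(void)
{
    pthread_t b;

    turn = 0;
    pthread_mutex_lock(&lock);             /* L: taken before B exists */
    if (pthread_create(&b, NULL, thread_b, NULL) != 0) {
        pthread_mutex_unlock(&lock);
        return -1;
    }
    while (turn != 1)                      /* W: releases the mutex for B */
        pthread_cond_wait(&cond, &lock);
    turn = 2;
    pthread_cond_signal(&cond);            /* S: the signal reported as dropped */
    pthread_mutex_unlock(&lock);           /* U */
    if (pthread_join(b, NULL) != 0)        /* J */
        return -1;
    return turn == 2 ? 0 : -1;
}
```

On a conforming pthreads implementation run_handshake() should return 0. With the bug described here, B would be expected to stay blocked in its wait and the join would never return.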


OK, in detail, S (1) does this:
 locks the cond variable
 signals A
 waits for A to wake, to prevent dropped signals
 unlocks the cond struct
then B's W:
 locks the cond variable
 increments the waiting count
 waits, releasing the mutex and unlocking the cond variable

A, on waking, does this:
  decrements the waiting count (now 0)
  tells the S (1) routine that it has woken up
  locks the mutex that it was waiting on
  (*) clears the cond structure's cached mutex entry if it's the last
      waking thread
  locks the cond structure
  decrements the mutex's wait reference
  unlocks the cond structure

Step (*) was buggy. What is happening is that B's W released the mutex
only AFTER A had tested whether it was the last waking thread, so A's
test was flawed. I have a fix ready, I just need to find some time to
test it, which I will do tonight. If you want to test it, here it is:

Index: thread.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/thread.cc,v
retrieving revision 1.65
diff -u -p -r1.65 thread.cc
--- thread.cc	28 Feb 2002 13:50:41 -0000	1.65
+++ thread.cc	2 May 2002 08:42:21 -0000
@@ -1791,20 +1791,22 @@ __pthread_cond_dowait (pthread_cond_t *c
   InterlockedIncrement (&((*themutex)->condwaits));
   if (pthread_mutex_unlock (&(*cond)->cond_access))
     system_printf ("Failed to unlock condition variable access mutex, this %p", *cond);
+  /* At this point calls to Signal will progress even if we aren't yet waiting
+   * However, the loop there should allow us to get scheduled and call wait,
+   * and have them call PulseEvent again if we don't respond.
+   */
   rv = (*cond)->TimedWait (waitlength);
   /* this may allow a race on the mutex acquisition and waits..
    * But doing this within the cond access mutex creates a different race
    */
-  bool last = false;
-  if (InterlockedDecrement (&((*cond)->waiting)) == 0)
-    last = true;
+  InterlockedDecrement (&((*cond)->waiting));
   /* Tell Signal that we have been released */
   InterlockedDecrement (&((*cond)->ExitingWait));
   (*themutex)->Lock ();
-  if (last == true)
-    (*cond)->mutex = NULL;
   if (pthread_mutex_lock (&(*cond)->cond_access))
     system_printf ("Failed to lock condition variable access mutex, this %p", *cond);
+  if ((*cond)->waiting == 0)
+    (*cond)->mutex = NULL;
   InterlockedDecrement (&((*themutex)->condwaits));
   if (pthread_mutex_unlock (&(*cond)->cond_access))
     system_printf ("Failed to unlock condition variable access mutex, this %p", *cond);
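
To see why moving the "last waiter" test matters, here is a single-threaded C model that replays the interleaving step by step. It is a sketch only, not Cygwin code: struct cond_model and cached_mutex_survives are hypothetical names, and it assumes that a thread entering wait records the mutex in the cached entry, which the patch itself does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the condition variable's bookkeeping. */
struct cond_model {
    int waiting;         /* number of threads currently in wait */
    const void *mutex;   /* cached mutex entry, NULL when nobody waits */
};

/* Replays: A wakes, B enters wait, A finishes waking.
 * fixed_order == 0 models the old code (test sampled at decrement time),
 * fixed_order != 0 models the patched code (test done afterwards, against
 * the live count). Returns 1 if the cached mutex is still set at the end,
 * which it should be because B is still waiting. */
int cached_mutex_survives(int fixed_order)
{
    static const int m = 0;             /* stands in for the user's mutex */
    struct cond_model c = { 1, &m };    /* start: A is waiting */
    int last;

    /* A wakes and decrements the waiting count. */
    c.waiting--;
    last = (c.waiting == 0);            /* old code remembered this result */

    /* A now blocks on the user mutex, which B still holds. Meanwhile B
     * enters wait: it bumps the count, (assumed) caches the mutex, and
     * only then releases the mutex. */
    c.waiting++;
    c.mutex = &m;

    /* A finally acquires the mutex and decides whether to clear the cache. */
    if (fixed_order) {
        if (c.waiting == 0)             /* patched: look at the current count */
            c.mutex = NULL;
    } else {
        if (last)                       /* old: act on the stale answer */
            c.mutex = NULL;
    }

    assert(c.waiting == 1);             /* B is still waiting in both cases */
    return c.mutex != NULL;
}
```

In this model the old ordering clears the cached entry while B is still counted as a waiter, and the patched ordering leaves it alone. The real code also has to cope with true concurrency, which this replay deliberately ignores.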


Rob


