This is the mail archive of the cygwin mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: cygwin.dll: bug with select on Windows console


Hi John,

On Dec 27 20:49, john hood wrote:
> Hi all,
> 
> I'm one of the Mosh maintainers.  Recently a user reported a problem
> where Mosh exits suddenly soon after startup while he is typing at it,
> see <https://github.com/mobile-shell/mosh/issues/705>.
> 
> The problem turns out to be that occasionally, select() times out,
> returns 0 as it should, but does *not* clear the fd_sets() passed in--
> they are left unaltered.  Mosh doesn't pay any attention to the count of
> ready fds and relies on the returned fd_sets being accurate.  Mosh sets
> the read fd_set and the error fd_set, and when it encounters this
> situation, it exits quietly when an error is found on one of the files
> involved (rather poor error handling on our part).
> 
> This only seems to happen on Windows Console.  select() seems to operate
> reliably when used on a pty, whether from mintty, xterm, or sshd.
> 
> My read of the POSIX standard is that select() should always set the
> fd_sets on a successful return (rv >= 0).  There is a bit of ambiguity
> around this point, but given that Cygwin is inconsistent with itself (on
> ptys) and with every other Unix platform, I think it's a bug.
> 
> I've attached a little demo program for the bug.  Compile, run as
> "socket-t 1000" (the argument is the number of microseconds select()
> should wait), and mash keys on the keyboard for a little while.  It
> should report errors within 100 keystrokes.  I think there might be a
> dependency on the length of the wait passed to select(), I don't see the
> problem happening with the wait set to 100 seconds.

Unfortunately you missed to attach the testcase.  I checked Cygwin's
select and I can see what might be the reason for what you're observing.
I might even have a fix.  But what bugs me is that I don't see how this
condition could be met at all, especially not in case of selecting on
the console.  Your testcase would be greatly appreciated.

For further discussion I attached the patch I'm proposing.  AFAICS,
copying the content of r,w,e over to readfds,writefds,exceptfds doesn't
really make sense, unless r,w,e are still unchanged (i.e. zeroed out).
However, *if* r,w,e have set bits, sel.wait should have returned
select_ok, not select_set_zero.  I don't see how this might occur.
That's what your testcase hopefully helps to uncover.

Another hiccup with copying r,w,e to readfds,writefds,exceptfds and then
calling sel.poll is this:  The number of set bits is *only* determined
by sel.poll.  If there are any other bits set in r,w,e, the result of
sel.poll might not reflect this and the return value is not equal to the
number of bits set in the records.  Obviously this is not a problem in
your case since you don't check the return value, but other applications
might.


Thanks,
Corinna


diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
index 524e578..647c758 100644
--- a/winsup/cygwin/select.cc
+++ b/winsup/cygwin/select.cc
@@ -77,8 +77,6 @@ details. */
   (fd_set *) __res; \
 })
 
-#define copyfd_set(to, from, n) memcpy (to, from, sizeof_fd_set (n));
-
 #define set_handle_or_return_if_not_open(h, s) \
   h = (s)->fh->get_io_handle_cyg (); \
   if (cygheap->fdtab.not_open ((s)->fd)) \
@@ -132,6 +130,7 @@ select (int maxfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
 	DWORD ms)
 {
   int res = select_stuff::select_loop;
+  int ret = 0;
 
   /* Record the current time for later use. */
   LONGLONG start_time = gtod.msecs ();
@@ -187,14 +186,15 @@ select (int maxfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
       select_printf ("res %d", res);
       if (res >= 0)
 	{
-	  copyfd_set (readfds, r, maxfds);
-	  copyfd_set (writefds, w, maxfds);
-	  copyfd_set (exceptfds, e, maxfds);
-	  if (res == select_stuff::select_set_zero)
-	    res = 0;
-	  else
-	    /* Set the bit mask from sel records */
-	    res = sel.poll (readfds, writefds, exceptfds) ?: select_stuff::select_loop;
+	  UNIX_FD_ZERO (readfds, maxfds);
+	  UNIX_FD_ZERO (writefds, maxfds);
+	  UNIX_FD_ZERO (writefds, maxfds);
+	  /* Set bit mask from sel records, even in case of a timeout.  This
+	     also sets ret to the right value >= 0, matching the number of
+	     bits set in the fds records. */
+	  ret = sel.poll (readfds, writefds, exceptfds);
+	  if (!ret && res != select_stuff::select_set_zero)
+	    res = select_stuff::select_loop;
 	}
       /* Always clean up everything here.  If we're looping then build it
 	 all up again.  */
@@ -219,9 +219,9 @@ select (int maxfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
 	}
     }
 
-  if (res < -1)
-    res = -1;
-  return res;
+  if (res < 0)
+    ret = -1;
+  return ret;
 }
 
 extern "C" int

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

Attachment: signature.asc
Description: PGP signature


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]