This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.



Re: still can't duplicate rxvt problem -- new information


On Wed, Aug 29, 2001 at 12:14:19PM -0400, Jason Tishler wrote:
>I was able to reproduce the problem with 1.3.2:
>
>    C:\home\jt>bash --login
>    CYGWIN_NT-5.0 ALTHEA 1.3.2(0.39/3/2) 2001-05-20 23:28 i686 unknown
>    ...
>    TERM (rxvt) =
>    althea[~]
>    $ id
>    althea[~]
>    $ uid=1000(jt) gid=513(None) groups=0(Everyone),513(None),544(Administrators),545(Users),1001(Daemons)
>    logout
>
>Note the following:
>
>1. My prompt is two lines.
>2. The prompt appears before the output of id.
>
>Also, I have *not* been able to reproduce the bash --login problem via
>rxvt under 1.3.2 (yet).  Go figure...
>
>> >I'm also willing to post (compressed) strace logs of a good and bad test
>> >run for perusal.  The compressed logs are about 128 KB each so I can put
>> >them on my web site for download if that is preferred.
>> 
>> The URL is preferable.
>
>The strace logs are available at:
>
>    http://www.tishler.net/jason/misc/bash.log.bz2
>    http://www.tishler.net/jason/misc/bash2.log.bz2
>    http://www.tishler.net/jason/misc/bash3.log.bz2
>
>Note the following:
>
>1. bash.log.bz2 and bash2.log.bz2 are traces when bash logs out
>2. bash3.log.bz2 is a trace when bash does *not* log out

Ok.  I think I've finally figured this out from inspecting the enclosed strace
logs and looking at bash sources.  I haven't tried to duplicate this myself
yet but I may get to doing that sometime today.  In the meantime...

This is, IMO, a bash bug.  I've reported it before and worked around it in
cygwin.  The problem is that Windows can reuse pid numbers very quickly and
this confuses bash.  I work around this by keeping a "cache" of process handles
in cygwin.  Keeping an open handle to an exited process means that Windows can't
reuse the pid right away.
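
The handle-cache idea can be sketched with a toy model. This is not the cygwin code: the fake OS below hands out the lowest pid that is neither running nor pinned by an open handle, which is only an assumption made to keep the example small, and every name in it (fake_os, handle_cache, CACHE_SIZE, and so on) is hypothetical. A counter stands in for a real process handle.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PIDS   8  /* hypothetical, tiny pid space */
#define CACHE_SIZE 3  /* hypothetical number of handles kept open */

/* Toy OS: a pid may be reused once it is not running and not pinned. */
struct fake_os {
    bool running[MAX_PIDS];
    int  open_handles[MAX_PIDS];
};

/* Ring buffer of pids whose handles are held open; 0 means empty slot. */
struct handle_cache {
    int pids[CACHE_SIZE];
    int next;
};

/* Hand out the lowest free, unpinned pid, or -1 if none is available. */
static int os_spawn(struct fake_os *os)
{
    for (int pid = 1; pid < MAX_PIDS; pid++) {
        if (!os->running[pid] && os->open_handles[pid] == 0) {
            os->running[pid] = true;
            return pid;
        }
    }
    return -1;
}

static void os_exit(struct fake_os *os, int pid)
{
    assert(pid > 0 && pid < MAX_PIDS);
    os->running[pid] = false;
}

/* Keep a handle to pid open, closing the oldest cached handle if needed. */
static void cache_remember(struct fake_os *os, struct handle_cache *c, int pid)
{
    int oldest = c->pids[c->next];

    assert(pid > 0 && pid < MAX_PIDS);
    if (oldest != 0)
        os->open_handles[oldest]--;   /* "close" the evicted handle */
    os->open_handles[pid]++;          /* "open" a handle to the new child */
    c->pids[c->next] = pid;
    c->next = (c->next + 1) % CACHE_SIZE;
}
```

In this model a pid comes back immediately when no handle is held, and only after CACHE_SIZE further children when the cache is used. The cache delays reuse; it does not prevent it.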

In Jason's case, that doesn't really matter since it isn't just the fact
that the pid was used recently that is a problem.  It is the fact that
it was used "relatively" recently, in a specific fashion.  Apparently
that is what is confusing bash.  This causes bash not to wait for the
completion of the 'id' command (in Jason's example) which means that it
will try to read from the terminal while 'id' has control of the
terminal.  This results in a 'background read' operation in what should
be the foreground version of bash.  This ends up terminating the
foreground bash.

So, I'd like to propose two potential tests:

Jason: Please build a version of cygwin1.dll with the attached fork.cc
patch.  See if that "solves" your problem.  I don't believe that this is
a real solution since keeping a huge number of open process handles in
cygwin1.dll does not seem like a very clean way to handle this problem.

There is a possible solution in bash for this, though.  If you could
build a version of bash which defines RECYCLES_PIDS when compiling
execute_cmd.c, it would be interesting to find out if that fixes the
problem.  From looking at the bash code, I'm not sure how it works, but
maybe I'll indulge in magical thinking just this once.
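
One way a recycled pid could cause a skipped wait, and how a RECYCLES_PIDS-style reset would avoid it, is sketched below. This is an assumption about the mechanism, not a copy of execute_cmd.c: the toy shell decides whether it forked a child by comparing the most recent child pid before and after running a command, and toy_shell, run_foreground and the recycles_pids flag are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_PID (-1)

struct toy_shell {
    int  last_made_pid;  /* pid of the most recently created child */
    bool recycles_pids;  /* stands in for building with RECYCLES_PIDS */
};

/*
 * Run one foreground command whose child was given child_pid by the OS.
 * Returns true if the shell concludes it must wait for a child.
 */
static bool run_foreground(struct toy_shell *sh, int child_pid)
{
    int pid_before = sh->last_made_pid;
    bool forked;

    assert(child_pid != NO_PID);
    sh->last_made_pid = child_pid;    /* the "fork" */

    /* An unchanged pid is taken to mean that no child was created. */
    forked = (sh->last_made_pid != pid_before);

    /* Forget the pid after waiting so a recycled value looks new. */
    if (forked && sh->recycles_pids)
        sh->last_made_pid = NO_PID;

    return forked;
}
```

Under these assumptions, two consecutive commands that both receive pid 1234 cause the plain shell to skip the second wait, while the shell with recycles_pids set waits for both.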

I've included a configure.in/configure patch which will force this to be
the default for cygwin.

So, the steps are:

1) Patch fork.cc and rebuild cygwin.

2) See if your problem has magically vanished.

3) If so, revert the change to fork.cc and rebuild cygwin1.dll.

4) Apply the patch to the current bash sources, and build bash.

5) See if your problem has magically vanished.

6) Report back here.

If 4 + 5 actually solves the bash problem, then we can ask Corinna to make
a new release of bash with this change, and try to get the patch included
into the main bash sources.

If 1 + 2 actually solves the problem but 4 + 5 doesn't, then it would be
helpful if you could inspect the code in execute_cmd.c which uses
RECYCLES_PIDS and see if you can find another workaround.

If 1 + 2 doesn't cause the problem to go away then I guess it's back to
the drawing board for me.

cgf

fork-patch.gz

bash-configure-patch.gz

