This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.
[bug found] Re: cygwin hang problem
OK, I think I see what the problem may be. In the dll_func_load
code (assembly language), the dll linkage code is patched (rewritten)
once the address of the loaded dll function is known. The problem
is that there is a race -- the new opcode and its argument
are written separately. What happens is this:
1. the opcode byte of a mov instruction is overwritten with 0xe9, turning it into a jmp
2. another thread executes the jmp before step 3 has happened
3. the newly written jmp instruction gets its proper offset written
Since the mov instruction takes an absolute address (an offset from the beginning
of the segment) and the jmp takes an EIP-relative offset, the half-patched jmp
goes off in the weeds. The mov operand points at linkage data just a few bytes
further down from the instruction itself, so adding it to EIP gives roughly twice
that address -- which is the observed behavior of a jump to twice the value of
the linkage data.
In the core that I am looking at, here is what is at win32_CopySid@12:
0x610f00b8: 0xa1 0xbf 0x00 0x0f 0x61 # mov 0x610f00bf,%eax
This becomes -- at just the wrong moment:
0x610f00b8: 0xe9 0xbf 0x00 0x0f 0x61 # jmp %eip+0x610f00bf
So the locking needs some changing in the dll linkage code. There is in fact
a comment above dll_func_load that the code may not be thread safe!
Joe Buehler