This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.



Re: Figured out how to reproduce vfork/rsync bug!


On Thu, Sep 06, 2001 at 02:47:47PM -0400, Jonathan Kamens wrote:
> >  Date: Thu, 6 Sep 2001 20:39:47 +0200
> >  From: Corinna Vinschen <vinschen@redhat.com>
> >  
> >  Nothing special. It just said BOOOOM!
> 
> Ah.  Did I forget to mention that sometimes this bug kills *all* your
> cygwin processes, not just the processes that fail.  Sorry about that
> :-).

You dare! ;-)

> Incidentally, I've managed to isolate it to something that was checked
> in between July 16 and July 17 (i.e., between "cvs update -D7/16/2001"
> and "cvs update -D7/17/2001"; I'm not sure whether that uses midnight
> local time, time on the CVS server or GMT).  I'll let you know as I

It uses your local time while the timestamps in cygwin-cvs are
always PST (PDT, currently).  I for one have to add 9 hours to
get localtime.
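
To make the offset concrete, here is a minimal sketch of the conversion.
It assumes the local zone is CEST (UTC+2), which is what gives the 9 hour
difference to PDT (UTC-7); the sample timestamp and the helper name
`to_local` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets keep the sketch self-contained (no tz database needed).
PDT = timezone(timedelta(hours=-7), "PDT")    # zone of the cygwin-cvs timestamps
LOCAL = timezone(timedelta(hours=2), "CEST")  # assumed local zone, 9 hours ahead


def to_local(stamp, fmt="%Y-%m-%d %H:%M"):
    """Interpret a PDT timestamp string and convert it to the local zone."""
    return datetime.strptime(stamp, fmt).replace(tzinfo=PDT).astimezone(LOCAL)


# A hypothetical check-in stamped 18:00 PDT on July 16 already counts as
# July 17 in local time.
print(to_local("2001-07-16 18:00").strftime("%Y-%m-%d %H:%M"))
```

So a change that the mailing list dates July 16 can fall on either side of
a local-time `-D7/17/2001' boundary.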

> find out more.
> 
> >  I still have to examine the stackdump...
> 
> You mean it's actually possible to derive useful information from
> those stackdumps?  I've not been able to find anybody here who knows
> how to do that.  Do tell.

Ok, here we go.  The stackdump I got was created by the application
which raised the exception, here `rsync', so I got a new file
`rsync.exe.stackdump'.  Let's look at its contents:

$ cat rsync.exe.stackdump
Exception: STATUS_ACCESS_VIOLATION at eip=61024931
eax=FFFFFFFF ebx=614F020C ecx=0242FF08 edx=02436720 esi=610AD578 edi=610248D4
ebp=02420000 esp=0242F9EC program=C:\cygwin\bin\rsync.exe
cs=001B ds=0023 es=0023 fs=003B gs=0000 ss=0023
Stack trace:
Frame     Function  Args
  56182 [main] rsync 2384 handle_exceptions: Error while dumping state (probably corrupted stack)


Hmm, that's not that good.  The stack is corrupted so the backtrace
didn't work.  Ok, let's try with another stackdump from another
crash:

Exception: STATUS_ACCESS_VIOLATION at eip=6109DD7D
eax=00000001 ebx=0A017908 ecx=0000C008 edx=0000C009 esi=0A023908 edi=0A017900
ebp=0022F644 esp=0022F61C program=C:\cygwin\bin\sh.exe
cs=001B ds=0023 es=0023 fs=003B gs=0000 ss=0023
Stack trace:
Frame     Function  Args
0022F644  6109DD7D  (610AB020, 0A017908, 614C3CC8, 6102CAD3)
0022F674  61036632  (0A017908, 00000124, 0022FF18, 77E9DCBE)
0022F6A4  610364B6  (0A017908, 00000000, 0022F6F4, 6102E54A)
0022F6C4  61065705  (00412984, 004129D4, 00230178, 00230178)
0022F6F4  6102E552  (0022F864, 0022F868, 0022F86C, 00000103)
0022F874  6102F5BB  (00412984, 0022F8CC, 00000003, 61074E73)
0022F894  00406E84  (0A0178A0, 004129D4, 00000000, FFFFFFFF)
0022F8D4  004022B3  (00412970, 00000000, 0022F904, 61093102)
0022F904  00401D7D  (00412970, 00000001, 0022FAC4, 0040B905)
0022F954  0040254A  (00412970, 0022FA44, 0022FA54, 0040489B)
End of stack trace (more stack frames may be present)

What you see is the call stack of the crashing application at the
moment of the crash.  The first few lines print the contents of the
CPU registers, the rest is the list of functions currently on the
stack.  The uppermost function is the one called most recently (and
is the one in which the crash happened), the next is the function
which called it, and so forth.  `Frame' is the position on the stack
at which that function's local vars are stored, `Function' is the
address inside the function at which the next function was called.
For the uppermost function it's the crash address.  `Args' are just
the first 16 bytes of arguments to the function, printed as four
32-bit words.
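
If you want to pull these columns out of a stackdump mechanically, the
following is a rough sketch rather than an official tool.  It assumes
the line layout shown above (8 hex digits for `Frame', 8 for `Function',
then the argument words in parentheses); the names `FRAME_RE' and
`parse_frames' are made up for this example.

```python
import re

# One frame line: frame address, function address, argument words in parens.
FRAME_RE = re.compile(
    r"^([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})\s+"
    r"\(([0-9A-Fa-f]{8}(?:,\s*[0-9A-Fa-f]{8})*)\)$"
)


def parse_frames(dump_text):
    """Return (frame, function, args) tuples, most recent call first."""
    frames = []
    for line in dump_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:  # header, register and trailer lines simply do not match
            frame = int(m.group(1), 16)
            function = int(m.group(2), 16)
            args = [int(word, 16) for word in m.group(3).split(",")]
            frames.append((frame, function, args))
    return frames


SAMPLE = """\
Stack trace:
Frame     Function  Args
0022F644  6109DD7D  (610AB020, 0A017908, 614C3CC8, 6102CAD3)
0022F674  61036632  (0A017908, 00000124, 0022FF18, 77E9DCBE)
End of stack trace (more stack frames may be present)
"""

for frame, function, args in parse_frames(SAMPLE):
    print(f"frame=0x{frame:08X} function=0x{function:08X} args={len(args)} words")
```

The first tuple returned is the crashing function, matching the order in
the dump.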

To examine the addresses you can simply start gdb:

$ gdb -nw /bin/sh
GNU gdb 5.0 (20010428-1)
Copyright 2001 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-cygwin"...
(gdb) dll cygwin1.dll
(gdb) disas 0x6109DD7D
Dump of assembler code for function _free_r:
0x6109dd0c <_free_r>:   Cannot access memory at address 0x6109dd0c

Ok, now we at least know that this example crashed in _free_r()...

(gdb) disas 0x61036632
Dump of assembler code for function export_free:
0x610365e0 <export_free>:       Cannot access memory at address 0x610365e0

...which has been called from export_free() etc.
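
Typing one address after the other gets tedious for long traces.  As a
sketch only: the snippet below turns a list of `Function' addresses into
a list of gdb commands you can paste into the session.  It assumes your
gdb has the `info symbol' command, which prints the symbol an address
belongs to; `gdb_commands' is a made-up helper name, and the `dll'
command is the one used in the session above.

```python
def gdb_commands(addresses, dll="cygwin1.dll"):
    """Build gdb commands that look up each address once, in stack order."""
    lines = [f"dll {dll}"]  # load the DLL symbols first, as in the session above
    seen = set()
    for addr in addresses:
        if addr not in seen:  # recursive calls may repeat an address
            seen.add(addr)
            lines.append(f"info symbol 0x{addr:08X}")
    return "\n".join(lines)


# The four uppermost `Function' addresses from the sh.exe stackdump.
ADDRESSES = [0x6109DD7D, 0x61036632, 0x610364B6, 0x61065705]
print(gdb_commands(ADDRESSES))
```

Whether gdb can resolve every address still depends on the symbols
being available for the module the address lives in.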

For the above `rsync' crash we unfortunately have only minimal
information in the stackdump.  However, did you notice in the `sh'
stackdump that the content of the `eip' register is identical to
the `Function' address of the uppermost (crashing) function?  So we
can at least find out in which function `rsync' crashed by asking
gdb which function the address in `eip' belongs to:

(gdb) disas 0x61024931
Dump of assembler code for function fixup_after_fork__15fhandler_socketPv:
0x61024914 <fixup_after_fork__15fhandler_socketPv>:     Cannot access memory at address 0x61024914

Ok, so it's fhandler_socket::fixup_after_fork().
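
The same trick works for any stackdump whose backtrace is missing, since
the register lines are still there.  A minimal sketch, assuming the
`name=value' register layout shown in the dumps above; `parse_registers'
is a hypothetical helper name.

```python
import re


def parse_registers(dump_text):
    """Collect register values (eip, eax, cs, ...) from a stackdump header."""
    regs = {}
    # Register names are 2-3 lowercase letters followed by 4-8 hex digits;
    # "program=C:\..." does not match because its value is not a hex word.
    for name, value in re.findall(r"\b([a-z]{2,3})=([0-9A-Fa-f]{4,8})\b", dump_text):
        regs[name] = int(value, 16)
    return regs


# Header of the rsync stackdump; a raw string keeps the backslashes intact.
SAMPLE = r"""Exception: STATUS_ACCESS_VIOLATION at eip=61024931
eax=FFFFFFFF ebx=614F020C ecx=0242FF08 edx=02436720 esi=610AD578 edi=610248D4
ebp=02420000 esp=0242F9EC program=C:\cygwin\bin\rsync.exe
cs=001B ds=0023 es=0023 fs=003B gs=0000 ss=0023
"""

regs = parse_registers(SAMPLE)
# Print the gdb command used above for the crash address.
print(f"disas 0x{regs['eip']:08X}")
```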

This should just give an idea.  Examining the stackdump is
not the same as live debugging with gdb.  You'll get way
more information _if_ the problem persists under debugger
control... which isn't a matter of course :-(

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Developer                                mailto:cygwin@cygwin.com
Red Hat, Inc.

