This is the mail archive of the gdb-prs@sourceware.org mailing list for the GDB project.



[Bug remote/17028] GDB+GDBserver hangs on Windows waiting for stop event since target-async on by default


https://sourceware.org/bugzilla/show_bug.cgi?id=17028

--- Comment #16 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gdb and binutils".

The branch, master has been updated
       via  364fe1f72265eb54bce08511233d06ed48e9c41a (commit)
      from  7ed689ad61de0cbfe4e5a6f18f097776128202e4 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=364fe1f72265eb54bce08511233d06ed48e9c41a

commit 364fe1f72265eb54bce08511233d06ed48e9c41a
Author: Pedro Alves <palves@redhat.com>
Date:   Wed Jun 11 11:04:31 2014 +0100

    PR remote/17028: GDB+GDBserver hangs on Windows

    Since target-async was turned on by default, debugging on Windows
    using GDB+GDBserver sometimes hangs while waiting for an RSP reply.

    The problem is a race in the gdb_select machinery.

    This is what we see for a faulty next on the GDB side:

        (gdb) n
        infrun: clear_proceed_status_thread (Thread 4424)
        infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT, step=1)
        (...)
        infrun: resume (step=1, signal=GDB_SIGNAL_0), ...
        Sending packet: $vCont;s:1148;c#5e...
        *hang*

    At this point, attaching a debugger to the hanging GDB confirms that
    it is blocked, waiting for a socket event:

        #6  0x757841d8 in WaitForMultipleObjects ()
           from C:\Windows\syswow64\kernel32.dll
        #7  0x004708e7 in gdb_select (n=469,
            readfds=0x88ca50 <gdb_notifier+784>,
            writefds=0x88cb54 <gdb_notifier+1044>,
            exceptfds=0x88cc58 <gdb_notifier+1304>, timeout=0x0)
            at /[...]/gdb/mingw-hdep.c:172
        #8  0x00527926 in gdb_wait_for_event (block=1)
            at /[...]/gdb/event-loop.c:831
        #9  0x00526ff1 in gdb_do_one_event ()
            at /[...]/gdb/event-loop.c:403

    However, on the GDBserver side, we see that GDBserver already sent a
    T05 packet reply:

        gdbserver: kernel event EXCEPTION_DEBUG_EVENT for pid=4968 tid=1148
        EXCEPTION_SINGLE_STEP
        Child Stopped with signal = 5
        Writing resume reply for LWP 4968.4424:1
        DEBUG: write_prim ($T0505:c8fe2800;04:a0fe2800;08:38164000;thread:1148;#f0)
               -> 55

    To recap, on Windows, 'select' only works with sockets, so we have a
    wrapper, gdb_select, that uses the GDB serial abstraction to handle
    sockets, consoles, pipes, and serial ports.  Each serial descriptor
    has a thread associated (we call those the select threads), and those
    threads communicate with the main thread by means of standard Windows
    events.

    It basically goes like this: gdb_select first loops through all fds of
    interest, calling their wait_handle hooks, each of which returns an event
    that WaitForMultipleObjects can wait on.  gdb_select then blocks in
    WaitForMultipleObjects with all those event handles.  The wait_handle
    hook is responsible for arranging for the returned event to become set
    once data is available.  This is done by setting the descriptor's
    helper thread running, which itself knows how to wait for data from
    the type of handle it manages (sockets, pipes, consoles, files, etc.).
    Once data arrives, the select thread sets the corresponding event
    which unblocks WaitForMultipleObjects within gdb_select.  However, the
    wait_handle hook can also apply an optimization: if data is already
    pending, then there's no need to set the thread running, and the
    descriptor's event can be set immediately.  The bug/race lies in this
    latter aspect.
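
    The flow just described can be sketched as a small portable model.
    This is only an illustration under stated assumptions: the struct and
    the model_* functions are hypothetical stand-ins, plain booleans
    replace Windows event handles, and nothing here is taken from
    ser-mingw.c or mingw-hdep.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one serial descriptor; not GDB's
   struct serial.  */
struct model_serial
{
  size_t pending_bytes;  /* Data already readable on the descriptor.  */
  bool read_event;       /* Models the event gdb_select waits on.  */
  bool helper_running;   /* Models the select thread being set running.  */
};

/* Models a wait_handle hook: reset the event, then either signal it
   right away (data already pending) or start the helper thread.  */
static void
model_wait_handle (struct model_serial *s)
{
  assert (s != NULL);
  s->read_event = false;
  if (s->pending_bytes > 0)
    s->read_event = true;      /* Fast path: no thread needed.  */
  else
    s->helper_running = true;  /* Slow path: helper waits for data.  */
}

/* Models the helper thread noticing newly arrived data and setting
   the event that unblocks the main thread.  */
static void
model_data_arrives (struct model_serial *s, size_t n)
{
  assert (s != NULL);
  s->pending_bytes += n;
  if (s->helper_running)
    s->read_event = true;
}

/* Models one non-blocking pass of gdb_select: arm every descriptor,
   then count those whose event is already set.  */
static int
model_select_poll (struct model_serial *fds, size_t n)
{
  int ready = 0;

  for (size_t i = 0; i < n; i++)
    {
      model_wait_handle (&fds[i]);
      if (fds[i].read_event)
        ready++;
    }
  return ready;
}
```

    In this model, a descriptor that already has data is reported ready
    without its helper ever starting, while an idle descriptor only
    becomes ready once model_data_arrives is called on it.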

    Adding some ad hoc debug logs to ser-mingw.c and mingw-hdep.c, we see
    the following sequence of events, right after sending
    "$vCont;s:1148;c#5e".  Thread 1 is the main thread, and thread 2 is
    the socket's helper/select thread.  gdb_select was only passed one
    descriptor to wait on, the remote target's socket.
    net_windows_select_thread is the entry point of the select threads for
    sockets.

     #1 - thread 1: gdb_select: enter
     #2 - thread 2: net_windows_select_thread: WaitForMultipleObjects blocking

    gdb_select walked over the wait_handle hooks, and woke up the socket's
    helper thread.  The helper thread is now blocked waiting for socket
    events.

     #3 - thread 1: gdb_select: WaitForMultipleObjects polling (timeout=0ms)
     #4 - thread 1: gdb_select: WaitForMultipleObjects returned 102 (WAIT_TIMEOUT)

    There was no pending data available yet, and gdb_select was passed
    timeout==0ms, and so WaitForMultipleObjects times out immediately.

     #5 - thread 2: net_windows_select_thread: WaitForMultipleObjects returned 1

    Just afterwards, socket data arrives, and thread 2 wakes up.  Thread 2
    calls WSAEnumNetworkEvents, which clears state->sock_event, and marks
    the serial's read_event event, telling the main thread that data is
    available.

     #6 - thread 1: gdb_select: call serial_done_wait_handle on each serial

    gdb_select stops all the helper/select threads.

     #7 - thread 1: gdb_select: return 0 (WAIT_TIMEOUT)

    gdb_select in the main thread returns to the caller.

    Note that at this point, data is pending on the socket and the serial's
    read_event is set, but the socket's sock_event event is not set, and
    will not be until _further_ data arrives.

    Now GDB does its thing and goes back to the event loop.  That calls
    gdb_select, but with timeout==INFINITE.

    Again, gdb_select calls the socket serial's wait_handle hook.  It
    first clears its events, starting from a clean slate:

      ResetEvent (state->base.read_event);
      ResetEvent (state->base.except_event);
      ResetEvent (state->base.stop_select);

    That cleared read_event, which was previously set in #5 above.  And
    then it checks for pending events, in the sock_event event:

      /* Check any pending events.  This both avoids starting the thread
         unnecessarily, and handles stray FD_READ events (see below).  */
      if (WaitForSingleObject (state->sock_event, 0) == WAIT_OBJECT_0)
        {

    That also fails because state->sock_event was cleared in #5 too...

    So the wait_handle hook erroneously decides that it needs to start the
    helper thread to wait for input:

     #8 - thread 2: net_windows_select_thread: WaitForMultipleObjects blocking
     #9 - thread 1: gdb_select: WaitForMultipleObjects blocking (INFINITE)

    But GDBserver already sent all it had to send, so both threads wait
    forever...
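
    The sequence #1 to #9 can be replayed deterministically in a small
    single-threaded model.  This is a sketch, not the real code: the
    race_* names are hypothetical, booleans stand in for the Windows
    events, and the interleaving is fixed by hand to match the log above.
    The 55-byte figure is borrowed from the GDBserver log purely for
    illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the socket serial's state; the field names
   echo the description above, but this is not GDB's
   net_windows_state.  */
struct race_model
{
  int bytes_on_socket;  /* Unread data sitting in the socket.  */
  bool sock_event;      /* Set when socket data arrives.  */
  bool read_event;      /* Set by the helper for the main thread.  */
  bool helper_blocked;  /* Helper thread is waiting on sock_event.  */
};

/* N bytes arrive on the socket, which signals sock_event.  */
static void
race_socket_receives (struct race_model *m, int n)
{
  m->bytes_on_socket += n;
  m->sock_event = true;
}

/* Step #5: the helper wakes, consumes sock_event (as the text says
   WSAEnumNetworkEvents does), and sets read_event.  */
static void
race_helper_wakes (struct race_model *m)
{
  assert (m->helper_blocked && m->sock_event);
  m->sock_event = false;
  m->read_event = true;
  m->helper_blocked = false;
}

/* The pre-fix wait_handle: reset events, then test sock_event only.
   Returns true when it reports data as already pending.  */
static bool
race_buggy_wait_handle (struct race_model *m)
{
  m->read_event = false;
  if (m->sock_event)
    {
      m->read_event = true;
      return true;
    }
  m->helper_blocked = true;  /* Start the helper thread.  */
  return false;
}

/* Replays the steps; returns true if the model ends up hung, that
   is, data is unread but nothing will ever signal read_event.  */
static bool
race_replay (struct race_model *m)
{
  race_buggy_wait_handle (m);       /* #1, #2: helper starts blocking.  */
  bool ready = m->read_event;       /* #3, #4: zero-timeout poll.  */
  race_socket_receives (m, 55);     /* The stop reply arrives.  */
  race_helper_wakes (m);            /* #5: sock_event is consumed.  */
                                    /* #6, #7: gdb_select returns 0.  */
  bool pending = race_buggy_wait_handle (m);  /* #8, #9: INFINITE.  */

  return (!ready && !pending && m->bytes_on_socket > 0
          && m->helper_blocked && !m->read_event);
}
```

    Starting from an all-zero race_model, race_replay reports the hung
    state: the bytes are still on the socket, yet the second
    wait_handle call has cleared read_event and found sock_event
    already consumed.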

    At first I thought that net_windows_wait_handle shouldn't be resetting
    state->base.read_event or state->base.except_event, but looking
    deeper, the pipe and console wait_handle hooks reset all events too.
    It actually makes sense that way -- consuming an event from different
    threads is bad practice, and we should always be able to query
    pending state without looking at state->sock_event from within
    net_windows_wait_handle.  The end result is much simpler, and makes
    net_windows_select_thread look a lot like console_select_thread,
    actually.
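
    The idea of asking the socket itself how much unread data it holds,
    rather than inferring it from an event another thread may already
    have consumed, can be sketched as follows.  This is a POSIX analogue
    written for illustration, not the code of this commit: it uses
    ioctl with FIONREAD on a local socket pair, whereas the ChangeLog
    below names the Winsock call ioctlsocket.  The function names
    socket_check_pending and pending_after_write are hypothetical.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if FD has unread bytes, 0 if it has
   none, and -1 if the query itself failed.  */
static int
socket_check_pending (int fd)
{
  int available = 0;

  if (ioctl (fd, FIONREAD, &available) != 0)
    return -1;
  return available > 0 ? 1 : 0;
}

/* Demo: write NBYTES to one end of a local socket pair without
   reading them, and report what the query says about the other end.
   NBYTES must not exceed the 64-byte scratch buffer.  */
static int
pending_after_write (size_t nbytes)
{
  int sv[2];
  char buf[64] = { 0 };
  int result;

  assert (nbytes <= sizeof buf);
  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) != 0)
    return -1;

  if (nbytes > 0 && write (sv[0], buf, nbytes) != (ssize_t) nbytes)
    result = -1;
  else
    result = socket_check_pending (sv[1]);

  close (sv[0]);
  close (sv[1]);
  return result;
}
```

    Because the answer comes from the socket's own state, it stays
    correct no matter which thread last observed or reset an event,
    which is the property the race above was missing.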

    gdb/
    2014-06-11  Pedro Alves  <palves@redhat.com>

        PR remote/17028
        * ser-mingw.c (net_windows_socket_check_pending): New function.
        (net_windows_select_thread): Ignore spurious wakeups.  Use
        net_windows_socket_check_pending.
        (net_windows_wait_handle): Check for pending events with
        ioctlsocket, through net_windows_socket_check_pending, instead of
        checking the socket's event.

-----------------------------------------------------------------------

Summary of changes:
 gdb/ChangeLog   |   10 ++++
 gdb/ser-mingw.c |  141 +++++++++++++++++++++++++++----------------------------
 2 files changed, 79 insertions(+), 72 deletions(-)


