This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



Non-stop multi-threaded debugging


Hi all,

Jim Blandy prepared this, but is on vacation this week. So, I'm announcing it in his absence. Pretend I wrote 'sudo jimb ...'

A client of CodeSourcery's has contracted with us to implement a
number of new features in GDB, some of which have been on the
frequently requested list for quite some time:

- We're to implement non-stop multi-threaded debugging in GDB.

  At present, if you are debugging a multi-threaded program, when one
  thread stops (for a breakpoint, watchpoint, exception, or the like),
  GDB stops all other threads in the program while you interact with
  the thread of interest.  When you continue or step a thread, you can
  allow the other threads to run, or have them remain stopped, but
  while you inspect any thread's state, all threads stop.

  In non-stop mode, when one thread stops, other threads can continue
  to run freely.  You'll be able to treat each thread independently,
  leaving it stopped or free to run as needed.

  Non-stop mode will be selectable; the old all-stop behavior will
  still be available.

- We're to implement asynchronous interaction with GDB.

  GDB will be responsive to commands while the program is running.
  This is mostly a consequence of supporting non-stop multi-threaded
  debugging: it's the degenerate case where no threads happen to be
  stopped.

- We're to implement a limited form of multi-process debugging.

  Full multi-process debugging would entail changes to
  1) process management code,
  2) target interfaces, and
  3) symbol tables.

  For our client, however, the case where processes have different
  memory maps is not (yet) of interest, so they have sponsored us to
  do 1) and 2), but not 3).  This will yield a GDB that can (for
  example) follow both parent and child after a fork, but not follow
  processes across exec or dlopen/dlclose operations.  If a process
  carries out one of these operations, GDB will ask the user whether
  to follow that process only, or detach from it and stick with the
  others.

  So our goal here is to carry out steps 1) and 2) in such a way that
  anyone can easily pick up 3) and complete the feature.  In other
  words, we want the restrictions simply to be a matter of leaving
  work undone, and not of embedding simplifying assumptions into the
  code that would make full support difficult.

Our client would very much like for this work to be incorporated into
the public GDB sources (although they understand that the decision is
in the public project's hands), so we'll be posting our design
thoughts for general discussion.  In particular, I believe the
multi-process work may overlap with some of the work IBM has done to
support the Cell processor; we'd very much like to work with IBM to
ensure that the final model is appropriate for both our client and for
Cell developers.

Our client is only interested in the MI interface; they intend to use
all these facilities via Eclipse.  So we will not be implementing
command-line support any more than is helpful to us in development.
But again, we want to do this work in a way that leaves CLI support
for these features a simple matter of coding, so that our work is
still forward progress, which anyone can complete.

Our client is interested in non-stop, multi-process debugging via the
remote protocol.  However, we will be implementing these for native
debugging first, in order to break the work into manageable steps.

The following is taken from a more detailed document we put together
proposing the work.  It is in two sections:

- The "Architectural Challenges" section explains limitations of GDB's
  current architecture that make it difficult to implement non-stop
  and multi-process debugging at present.

- The "Projects" section presents a series of well-defined engineering
  projects which remove limitations or add features to meet one or
  more of our client's requirements.

Our intention is to help the list understand why each piece of work is
needed and what it would accomplish.


Architectural Challenges


GDB's present architecture imposes a number of barriers to
implementing non-stop and multi-process debugging:

C1) While the user inspects the state of a stopped thread, GDB stops
    all other threads.  This approach simplifies GDB's user interface,
    as there is no need to report events taking place in other threads
    while the user inspects one thread.  However, these
    simplifications are no longer valid in non-stop debugging

C2) Stopping all threads also simplifies GDB's execution management
    code, as GDB can pause all threads, manage interesting events, and
    then assume the system is quiet.  As above, these simplifications
    are no longer valid in non-stop debugging.

C3) Stopping all threads further allows GDB to remove all breakpoints
    from the program's memory while the program is stopped, and
    re-insert them only when resuming one or more threads, making it
    less likely that an abrupt disconnection will abandon a debuggee
    with breakpoint instructions patched into its code.  However, this
    behavior is clearly unsuitable if the user wants other threads to
    continue to execute while she stops one for inspection.

C4) Finally, stopping all threads simplifies GDB's remote protocol.
    At present, GDB's remote protocol notifies GDB of exactly one
    thread's state in response to each 'continue' or 'step' operation,
    permitting no further packets from the stub until GDB resumes
    some thread.

C5) GDB breakpoints are currently per-thread or global.  To satisfy
    our client's requirements, we must adapt these structures to
    distinguish per-process and global breakpoints, where 'global'
    breakpoints are set in all attached processes.

C6) [Our client elected not to address this issue yet.]

C7) GDB currently operates on a single process at a time: the list of
    known threads is global, and the ID of the process being debugged
    is global.  This conflicts with the needs of multi-process
    debugging.

C8) GDB currently maintains a single global map of the address space.
    It cannot represent multiple processes with code and data
    appearing at different addresses in different processes.  This is
    not a problem for our client, because code and variables appear at
    the same addresses in all processes on their system.  However, it
    is a requirement for multi-process debugging on Linux.

C9) GDB will not currently relocate different segments of an
    executable or shared library by different offsets from the
    addresses they are assigned in the ELF file.  The client's
    operating system may relocate each section of a load module by a
    different amount.


Projects


This section breaks down the work necessary into well-defined
engineering tasks.  For each proposed project, we explain the work
entailed, the benefits provided, and how it depends on other projects,
if at all.


P1) Non-stop multi-threaded native debugging


    This project allows GDB to stop one thread for inspection on a
    native system while allowing others to run.

    To prepare GDB to debug one process while other processes continue
    to run freely (the feature our client is interested in), we will
    first implement the ability to debug one thread while other
    threads in that process continue to run freely.

    As described in C1, C2, and C3, GDB assumes in its user interface
    and code that no execution occurs while the user is inspecting a
    thread's state.  This project removes that simplifying assumption.

    At the user interface level, GDB's Machine Interface ('MI', the
    command set used by Eclipse) shall behave as follows:

    - MI shall provide a command to allow the user to choose between
      the older 'all-stop' and the new 'non-stop' multi-threaded
      debugging behaviors.  In all-stop mode, GDB shall behave as it
      does now.  The following points describe non-stop debugging
      mode.

    - GDB shall always prompt for and respond to MI commands,
      regardless of whether any threads are running or not.

    - When a thread finishes a command like '-exec-next' or
      '-exec-finish', hits a breakpoint, or encounters a fault, GDB
      shall stop that thread, without affecting the other threads in
      the process.

    - Execution commands like '-exec-continue' and '-exec-step' shall
      resume only the selected thread, without affecting the other
      threads in the process.

    - The MI '-exec-interrupt' command shall stop all threads.  This
      will always generate an 'EXEC-ASYNC-OUTPUT' record, even if all
      threads were already stopped.  (This helps users handle the case
      where the thread stops of its own accord just as the user sends
      it an '-exec-interrupt' command.)

    - The MI '-thread-select' command shall stop the thread selected,
      if it is running.  The previously selected thread is left in its
      former state, either stopped or running.  A '-thread-select'
      command shall always generate an 'EXEC-ASYNC-OUTPUT' record,
      even if the thread was already stopped.

    - MI shall provide a command to continue all stopped threads.

    - GDB shall send 'EXEC-ASYNC-OUTPUT' MI records to notify the user
      of events that have occurred in threads, even while GDB is
      waiting for an MI command.  Every thread GDB stops shall be
      mentioned in some 'EXEC-ASYNC-OUTPUT' record; when GDB stops all
      threads, the EXEC-ASYNC-OUTPUT record shall include a
      'thread-id="all"' result.

    - The MI '-thread-info' and '-thread-list-all-threads' commands
      shall be implemented.  Their output shall indicate whether each
      thread listed is currently stopped by GDB, or whether it is
      allowed to run.

    - GDB shall use 'EXEC-ASYNC-OUTPUT' MI records to report thread
      creation and termination.  These records shall include the GDB
      thread number as a result.  After sending a thread termination
      record, GDB shall not include the thread in the output of
      '-thread-list-ids' or '-thread-list-all-threads'.
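
    The non-stop rules above can be illustrated with a small toy
    model.  This is only a sketch, not GDB code: the class name
    'NonStopModel', its method names, and the text of the records it
    emits are all assumptions made for illustration; the proposal does
    not fix the layout of the 'EXEC-ASYNC-OUTPUT' records.

```python
from dataclasses import dataclass, field


@dataclass
class NonStopModel:
    """Toy model of the proposed non-stop MI semantics (not GDB code)."""

    running: dict = field(default_factory=dict)  # thread id -> True if running
    selected: int = 0                            # currently selected thread
    records: list = field(default_factory=list)  # async records emitted so far

    def _emit(self, thread_id):
        # Hypothetical record text; the real record layout is not specified.
        self.records.append(f'*stopped,thread-id="{thread_id}"')

    def exec_continue(self):
        # Resume only the selected thread; other threads are untouched.
        self.running[self.selected] = True

    def exec_interrupt(self):
        # Stop every thread and always report, even if all were stopped.
        for tid in self.running:
            self.running[tid] = False
        self._emit("all")

    def thread_select(self, tid):
        # Stop the newly selected thread; the old selection keeps its state.
        self.running[tid] = False
        self.selected = tid
        self._emit(tid)

    def continue_all(self):
        # Resume all stopped threads.
        for tid in self.running:
            self.running[tid] = True
```

    For instance, starting from threads 1 and 2 running and thread 3
    stopped, 'thread_select(1)' stops only thread 1, and two
    'exec_interrupt()' calls in a row each append a record, which is
    the property that lets a front end cope with a thread stopping on
    its own just as the interrupt is sent.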

    (Adapting GDB's command-line interface to non-stop debugging is
    more involved; whereas MI need only be accurate and sufficient,
    the command-line interface must also respect human interface
    issues.  Since GDB's command-line interface is of limited interest
    to our client, we have not included it here.)

    To implement the behavior described above, a number of areas
    within GDB will need modification:

    - GDB's event loop must be responsive to user input and thread
      events from the debuggee simultaneously.

    - GDB's execution control code must avoid stopping all threads
      when one reports an event, and must make the processing of
      thread stops independent of resumption: it must no longer assume
      that events only arrive after resumptions, and resumptions only
      happen after events.

    - GDB must insert breakpoints into code being executed by live
      threads in a manner supported by the target architecture.

    - GDB's breakpoint support code must leave breakpoints inserted at
      all times.  Even while GDB steps a thread past a breakpoint,
      the breakpoint must remain in effect for all other threads.
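
    As a rough illustration of the first two points, the sketch below
    waits on a command channel and a target-event channel at once and
    handles whichever becomes readable, so that events are not tied to
    a preceding resumption.  It is written in Python for brevity and is
    not how GDB's event loop is implemented; the function names, the
    channel labels, and the message texts are assumptions.

```python
import selectors
import socket


def run_event_loop(command_sock, target_sock, max_events):
    """Dispatch user commands and target events from a single loop."""
    sel = selectors.DefaultSelector()
    sel.register(command_sock, selectors.EVENT_READ, "command")
    sel.register(target_sock, selectors.EVENT_READ, "target-event")
    log = []
    while len(log) < max_events:
        ready = sel.select(timeout=1.0)
        if not ready:
            break  # nothing arrived in time; end the demo loop
        for key, _mask in ready:
            data = key.fileobj.recv(4096)
            if not data:
                sel.unregister(key.fileobj)  # peer closed this channel
                continue
            # Handle the input according to its source, in arrival order.
            log.append((key.data, data.decode().strip()))
    sel.close()
    return log


def demo():
    """Feed one command and one thread event into the loop."""
    cmd_in, cmd_out = socket.socketpair()
    tgt_in, tgt_out = socket.socketpair()
    try:
        cmd_in.sendall(b"-exec-continue\n")
        tgt_in.sendall(b"thread 2 stopped\n")
        return run_event_loop(cmd_out, tgt_out, max_events=2)
    finally:
        for sock in (cmd_in, cmd_out, tgt_in, tgt_out):
            sock.close()
```

    The order in which the two inputs are logged depends on the
    selector, which is the point: neither source is allowed to block
    the other.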

    These are each reasonably substantial pieces of work, the design
    of which should be discussed on the public GDB list to ensure that
    the work will be acceptable for inclusion in the public sources
    when it is complete.


P4) Stub for client's OS


    This project will mostly be non-GDB work.  However, there are some
    changes to the remote protocol we would like to introduce at this
    point:

    The remote protocol presently leaves the process to be debugged
    implicit; users generally specify it when they start the stub.
    However, to satisfy our client's requirements, we must be able to
    connect to a system, list the processes present, and attach to one
    of them.  This entails making some straightforward extensions to
    the GDB remote protocol, and thus to GDB as well.

    The stub for our client should use the 'library' stop reply
    packets and the 'qXfer:libraries:read' packet to report load
    module events.  However, because the client's OS may bring each
    section of a load module into memory at a different offset from
    the VMA given in the ELF file, we will need to extend the format
    of the library list the latter packet returns, as it currently
    assumes that each library needs only one offset, and extend GDB to
    allow each segment to appear at a different offset (C9).
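
    To make the per-section relocation concrete, here is a sketch of
    how a consumer might read a library list in which each section
    carries its own load address.  The XML layout shown (a 'section'
    element with an 'address' attribute per section), the library
    name, and all addresses are assumptions for illustration; the
    extended format itself is still to be designed.

```python
import xml.etree.ElementTree as ET

# Hypothetical extended library list: one load address per section.
SAMPLE = """\
<library-list>
  <library name="libfoo.so">
    <section address="0x10000000"/>
    <section address="0x20004000"/>
  </library>
</library-list>
"""


def parse_library_list(xml_text):
    """Return {library name: [load address of each section, in order]}."""
    root = ET.fromstring(xml_text)
    result = {}
    for lib in root.findall("library"):
        result[lib.get("name")] = [
            int(sec.get("address"), 16) for sec in lib.findall("section")
        ]
    return result


def section_offsets(load_addresses, elf_vmas):
    """Offset of each section from the VMA recorded in the ELF file."""
    return [load - vma for load, vma in zip(load_addresses, elf_vmas)]
```

    With ELF VMAs of 0x1000 and 0x5000 for the two sections, the
    offsets come out as 0x0FFFF000 and 0x1FFFF000: two different
    values for one load module, which a single per-library offset
    cannot express.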


P6) Multi-threaded limited-multi-process native debugging


    This project provides multi-threaded debugging of multiple
    processes simultaneously.  The debugger stops all threads in all
    attached processes while the user inspects the state of any
    thread.  This work is independent of P1; we combine P1 and P6 in
    the next project, P7.

    At the user interface level:

    - MI shall provide new commands to attach and detach a process;
      unlike GDB's existing 'attach' and 'detach' commands, the new
      'attach' command will not require GDB to detach from any
      currently attached processes.

    - MI shall provide a command to list all currently attached
      processes.

    - MI shall provide a command to list all the threads in a given
      attached process.

    - The output of the MI '-thread-info' and
      '-thread-list-all-threads' commands shall include the process ID
      of each thread listed.  The process ID shall be a separate MI
      'result' from the string provided by the
      'target_extra_thread_info' function, so that Eclipse can access
      it reliably.

    - GDB shall stop all threads in all attached processes while
      interacting with the user.  Attaching to a process shall stop
      all threads in that process.  Detaching from a process shall
      allow its threads to run again.

    - MI shall report faults encountered by threads in any attached
      process.

    - MI shall report the termination of any attached process.  After
      such a report, GDB will no longer be attached to the process.

    - MI's '-thread-select' command shall be able to select any thread
      in any attached process.

    - MI's existing breakpoint commands shall set breakpoints global
      to all attached processes.

    To support those facilities, we need the following changes:

    - GDB shall maintain a table of attached processes.  The remote
      protocol shall provide packets directing the stub to attach to a
      new process, and to detach from a currently attached process.

    - The remote protocol shall carry process IDs as well as thread
      IDs in stop reply packets, thread selection packets, thread
      enumeration packets, and wherever else is appropriate.

    - The stub shall use the current general thread (as given by the
      'Hg' packet) to determine which process's memory to access, as
      it does now to determine which thread's registers to access.
      GDB shall send 'Hg' packets as necessary before memory accesses,
      as it does now for register accesses.
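
    The bookkeeping these changes imply can be sketched as a table
    keyed by process ID, with the current general thread deciding
    which process a memory access refers to.  This is a toy model
    only: the class and method names are hypothetical, and no real
    packets are encoded or sent.

```python
class ProcessTable:
    """Toy table of attached processes and their threads (not GDB code)."""

    def __init__(self):
        self.threads_by_pid = {}    # pid -> set of thread ids
        self.general_thread = None  # (pid, tid) chosen by the last 'Hg'

    def attach(self, pid, tids):
        # Attaching does not disturb processes that are already attached.
        self.threads_by_pid[pid] = set(tids)

    def detach(self, pid):
        del self.threads_by_pid[pid]
        if self.general_thread and self.general_thread[0] == pid:
            self.general_thread = None

    def select_general_thread(self, pid, tid):
        # Models the effect of an 'Hg' packet carrying a process and thread.
        if tid not in self.threads_by_pid.get(pid, ()):
            raise KeyError(f"no thread {tid} in attached process {pid}")
        self.general_thread = (pid, tid)

    def memory_target(self):
        # The process whose memory a following access would touch.
        if self.general_thread is None:
            raise RuntimeError("no general thread selected")
        return self.general_thread[0]
```

    Selecting a thread in one process and then a thread in another
    changes the result of 'memory_target()', mirroring how GDB would
    send 'Hg' before a memory access when the process of interest
    changes.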


P7) Non-stop multi-threaded multi-process native debugging


    This project allows GDB to attach to multiple processes
    simultaneously.  This builds on P1 and P6, and addresses C4 and
    C7.

    At the user interface level:

    - MI shall provide a command to stop all the threads in a given
      process, and a command to resume all stopped threads in a given
      process.

    Internally:

    - The remote protocol shall provide a way to tell the stub to
      leave other threads running after reporting an event in one
      thread (non-stop behavior), and a way to tell the stub to stop
      all threads when reporting an event in one thread (all-stop
      behavior).

    - The remote protocol shall allow the stub to respond to
      commands while threads are running, and to report further thread
      events after a thread has stopped.  (This addresses C4.)

    - The remote protocol shall provide ways to stop and start a
      particular thread, and ways to start and stop all the threads in
      a given process.  The mechanisms for stopping threads and
      processes shall allow GDB to behave correctly when a thread
      stops or a process exits simultaneously with GDB sending the
      command.

nathan
--
Nathan Sidwell    ::   http://www.codesourcery.com   ::         CodeSourcery

