This is the mail archive of the
gdb@sourceware.org
mailing list for the GDB project.
Non-stop multi-threaded debugging
- From: Nathan Sidwell <nathan at codesourcery dot com>
- To: gdb at sourceware dot org
- Cc: Jim Blandy <jimb at codesourcery dot com>
- Date: Tue, 20 Nov 2007 17:21:14 +0000
- Subject: Non-stop multi-threaded debugging
Hi all,
Jim Blandy prepared this, but is on vacation this week. So, I'm announcing it
in his absence. Pretend I wrote 'sudo jimb ...'
A client of CodeSourcery's has contracted with us to implement a
number of new features in GDB, some of which have been on the
frequently requested list for quite some time:
- We're to implement non-stop multi-threaded debugging in GDB.
At present, if you are debugging a multi-threaded program, when one
thread stops (for a breakpoint, watchpoint, exception, or the like),
GDB stops all other threads in the program while you interact with
the thread of interest. When you continue or step a thread, you can
allow the other threads to run, or have them remain stopped, but
while you inspect any thread's state, all threads stop.
In non-stop mode, when one thread stops, other threads can continue
to run freely. You'll be able to treat each thread independently,
leaving it stopped or free to run as needed.
Non-stop mode will be selectable; the old all-stop behavior will
still be available.
- We're to implement asynchronous interaction with GDB.
GDB will be responsive to commands while the program is running.
This is mostly a consequence of supporting non-stop multi-threaded
debugging: it's the degenerate case where no threads happen to be
stopped.
- We're to implement a limited form of multi-process debugging.
Full multi-process debugging would entail changes to
1) process management code,
2) target interfaces, and
3) symbol tables.
For our client, however, the case where processes have different
memory maps is not (yet) of interest, so they have sponsored us to
do 1) and 2), but not 3). This will yield a GDB that can (for
example) follow both parent and child after a fork, but not follow
processes across exec or dlopen/dlclose operations. If a process
carries out one of these operations, GDB will ask the user whether
to follow that process only, or detach from it and stick with the
others.
So our goal here is to carry out steps 1) and 2) in such a way that
anyone can easily pick up 3) and complete the feature. In other
words, we want the restrictions simply to be a matter of leaving
work undone, and not of embedding simplifying assumptions into the
code that would make full support difficult.
Our client would very much like for this work to be incorporated into
the public GDB sources (although they understand that the decision is
in the public project's hands), so we'll be posting our design
thoughts for general discussion. In particular, I believe the
multi-process work may overlap with some of the work IBM has done to
support the Cell processor; we'd very much like to work with IBM to
ensure that the final model is appropriate for both our client and for
Cell developers.
Our client is only interested in the MI interface; they intend to use
all these facilities via Eclipse. So we will not be implementing
command-line support any more than is helpful to us in development.
But again, we want to do this work in a way that leaves CLI support
for these features a simple matter of coding, so that our work is
still forward progress, which anyone can complete.
Our client is interested in non-stop, multi-process debugging via the
remote protocol. However, we will be implementing these for native
debugging first, in order to break the work into manageable steps.
The text below is taken from a more detailed document we put together
proposing the work. It is in two sections:
- The "Architectural Challenges" section explains limitations of GDB's
current architecture that make it difficult to implement non-stop
and multi-process debugging at present.
- The "Projects" section presents a series of well-defined engineering
projects which remove limitations or add features to meet one or
more of our client's requirements.
Our intention is to help the list understand why each piece of work is
needed and what it would accomplish.
Architectural Challenges
GDB's present architecture imposes a number of barriers to
implementing non-stop and multi-process debugging:
C1) While the user inspects the state of a stopped thread, GDB stops
all other threads. This approach simplifies GDB's user interface,
as there is no need to report events taking place in other threads
while the user inspects one thread. However, these
simplifications are no longer valid in non-stop debugging.
C2) Stopping all threads also simplifies GDB's execution management
code, as GDB can pause all threads, manage interesting events, and
then assume the system is quiet. As above, these simplifications
are no longer valid in non-stop debugging.
C3) Stopping all threads further allows GDB to remove all breakpoints
from the program's memory while the program is stopped, and
re-insert them only when resuming one or more threads, making it
less likely that an abrupt disconnection will abandon a debuggee
with breakpoint instructions patched into its code. However, this
behavior is clearly unsuitable if the user wants other threads to
continue to execute while she stops one for inspection.
C4) Finally, stopping all threads simplifies GDB's remote protocol.
At present, GDB's remote protocol notifies GDB of exactly one
thread's state in response to each 'continue' or 'step' operation,
permitting no further packets from the stub until GDB resumes
some thread.
C5) GDB breakpoints are currently per-thread or global. To satisfy
our client's requirements, we must adapt these structures to
distinguish per-process and global breakpoints, where 'global'
breakpoints are set in all attached processes.
C6) [Our client elected not to address this issue yet.]
C7) GDB currently operates on a single process at a time: the list of
known threads is global, and the ID of the process being debugged
is global. This conflicts with the needs of multi-process
debugging.
C8) GDB currently maintains a single global map of the address space.
It cannot represent multiple processes with code and data
appearing at different addresses in different processes. This is
not a problem for our client, because code and variables appear at
the same addresses in all processes on their system. However, it
is a requirement for multi-process debugging on Linux.
C9) GDB will not currently relocate different segments of an
executable or shared library by different offsets from the
addresses they are assigned in the ELF file. The client's
operating system may relocate each section of a load module by a
different amount.
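To make C3 concrete, here is a minimal sketch in Python (not GDB code) of the all-stop policy of patching breakpoints into memory only while threads run. The `Memory` class, the helper names, and the choice of 0xCC (the one-byte x86 trap opcode) and 0x90 (x86 NOP) are assumptions made purely for illustration.

```python
BREAK = 0xCC  # assumption: a one-byte trap opcode, as on x86


class Memory:
    """Toy model of a debuggee's code with breakpoints patched in."""

    def __init__(self, code):
        self.code = bytearray(code)
        self.saved = {}  # address -> original byte hidden by a breakpoint

    def insert(self, addr):
        if addr not in self.saved:
            self.saved[addr] = self.code[addr]
            self.code[addr] = BREAK

    def remove(self, addr):
        if addr in self.saved:
            self.code[addr] = self.saved.pop(addr)


def on_resume(mem, breakpoints):
    # All-stop policy: patch breakpoints in just before threads run.
    for addr in breakpoints:
        mem.insert(addr)


def on_stop(mem, breakpoints):
    # All-stop policy: restore the original code once everything is stopped.
    for addr in breakpoints:
        mem.remove(addr)


original = bytes([0x90, 0x90, 0x90, 0x90])  # four x86 NOPs
mem = Memory(original)
bps = [1, 3]

on_resume(mem, bps)
while_running = bytes(mem.code)

on_stop(mem, bps)
while_stopped = bytes(mem.code)
```

In non-stop mode there is no moment at which `on_stop` could safely run, because some thread may always be executing; that is why the breakpoints would have to stay inserted.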
Projects
This section breaks down the work necessary into well-defined
engineering tasks. For each proposed project, we explain the work
entailed, the benefits provided, and how it depends on other projects,
if at all.
P1) Non-stop multi-threaded native debugging
This project allows GDB to stop one thread for inspection on a
native system while allowing others to run.
To prepare GDB to debug one process while other processes continue
to run freely (the feature our client is interested in), we will
first implement the ability to debug one thread while other
threads in that process continue to run freely.
As described in C1, C2, and C3, GDB assumes in its user interface
and code that no execution occurs while the user is inspecting a
thread's state. This project removes that simplifying assumption.
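The difference between the two modes can be pictured with a small Python model. This is only an illustrative sketch; `State` and `report_stop` are hypothetical names, not part of GDB.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    STOPPED = "stopped"


def report_stop(threads, tid, non_stop):
    """Return the thread states after thread `tid` reports a stop event."""
    if non_stop:
        # Non-stop: only the reporting thread stops; the rest keep their state.
        return {t: (State.STOPPED if t == tid else s) for t, s in threads.items()}
    # All-stop: every thread is stopped while the user inspects one of them.
    return {t: State.STOPPED for t in threads}


threads = {1: State.RUNNING, 2: State.RUNNING, 3: State.RUNNING}
all_stop = report_stop(threads, 2, non_stop=False)
non_stop = report_stop(threads, 2, non_stop=True)
```

Here `all_stop` has every thread stopped, while `non_stop` leaves threads 1 and 3 running.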
At the user interface level, GDB's Machine Interface ('MI', the
command set used by Eclipse) shall behave as follows:
- MI shall provide a command to allow the user to choose between
the older 'all-stop' and the new 'non-stop' multi-threaded
debugging behaviors. In all-stop mode, GDB shall behave as it
does now. The following points describe non-stop debugging
mode.
- GDB shall always prompt for and respond to MI commands,
regardless of whether any threads are running or not.
- When a thread finishes a command like '-exec-next' or
'-exec-finish', hits a breakpoint, or encounters a fault, GDB
shall stop that thread, without affecting the other threads in
the process.
- Execution commands like '-exec-continue' and '-exec-step' shall
resume only the selected thread, without affecting the other
threads in the process.
- The MI '-exec-interrupt' command shall stop all threads. This
will always generate an 'EXEC-ASYNC-OUTPUT' record, even if all
threads were already stopped. (This helps users handle the case
where the thread stops of its own accord just as the user sends
it an '-exec-interrupt' command.)
- The MI '-thread-select' command shall stop the thread selected,
if it is running. The previously selected thread is left in its
former state, either stopped or running. A '-thread-select'
command shall always generate an 'EXEC-ASYNC-OUTPUT' record,
even if the thread was already stopped.
- MI shall provide a command to continue all stopped threads.
- GDB shall send 'EXEC-ASYNC-OUTPUT' MI records to notify the user
of events that have occurred in threads, even while GDB is
waiting for an MI command. Every thread GDB stops shall be
mentioned in some 'EXEC-ASYNC-OUTPUT' record; when GDB stops all
threads, the EXEC-ASYNC-OUTPUT record shall include a
'thread-id="all"' result.
- The MI '-thread-info' and '-thread-list-all-threads' commands
shall be implemented. Their output shall indicate whether each
thread listed is currently stopped by GDB, or whether it is
allowed to run.
- GDB shall use 'EXEC-ASYNC-OUTPUT' MI records to report thread
creation and termination. These records shall include the GDB
thread number as a result. After sending a thread termination
record, GDB shall not include the thread in the output of
'-thread-list-ids' or '-thread-list-all-threads'.
(Adapting GDB's command-line interface to non-stop debugging is
more involved; whereas MI need only be accurate and sufficient,
the command-line interface must also respect human interface
issues. Since GDB's command-line interface is of limited interest
to our client, we have not included it here.)
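A few of the MI requirements listed above can be modelled in a handful of lines of Python. This is a sketch under stated assumptions, not a description of GDB's implementation: the class, its method names, and the dictionary layout of the records are hypothetical stand-ins for the MI commands and 'EXEC-ASYNC-OUTPUT' records.

```python
class NonStopModel:
    """Toy model of the proposed non-stop MI behavior."""

    def __init__(self, thread_ids):
        self.running = {tid: True for tid in thread_ids}
        self.selected = None
        self.records = []  # stand-in for EXEC-ASYNC-OUTPUT records

    def exec_interrupt(self):
        # Stops every thread and always emits a record, even if all
        # threads were already stopped.
        for tid in self.running:
            self.running[tid] = False
        self.records.append({"event": "stopped", "thread-id": "all"})

    def thread_select(self, tid):
        # Stops the selected thread if it is running; the previously
        # selected thread keeps its state. A record is always emitted.
        self.running[tid] = False
        self.selected = tid
        self.records.append({"event": "stopped", "thread-id": str(tid)})

    def exec_continue(self):
        # Resumes only the selected thread.
        if self.selected is not None:
            self.running[self.selected] = True


gdb = NonStopModel([1, 2, 3])
gdb.thread_select(2)
after_select = dict(gdb.running)
gdb.thread_select(2)          # already stopped, still reported
gdb.exec_continue()
after_continue = dict(gdb.running)
gdb.exec_interrupt()
gdb.exec_interrupt()          # already stopped, still reported
```

Emitting a record unconditionally means a front end never has to guess whether its request raced with a thread stopping on its own.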
To implement the behavior described above, a number of areas
within GDB will need modification:
- GDB's event loop must be responsive to user input and thread
events from the debuggee simultaneously.
- GDB's execution control code must avoid stopping all threads
when one reports an event, and must make the processing of
thread stops independent of resumption: it must no longer assume
that events only arrive after resumptions, and resumptions only
happen after events.
- GDB must insert breakpoints into code being executed by live
threads in a manner supported by the target architecture.
- GDB's breakpoint support code must leave breakpoints inserted at
all times. Even while GDB steps a thread past a breakpoint,
the breakpoint must remain in effect for all other threads.
These are each reasonably substantial pieces of work, the design
of which should be discussed on the public GDB list to ensure that
the work will be acceptable for inclusion in the public sources
when it is complete.
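As a rough picture of the first item in the list above (an event loop that waits on user input and debuggee events at once), here is a minimal Python sketch for a POSIX system. Two pipes stand in for the MI command stream and the target's event channel; the function name, the labels, and the event text are assumptions and do not come from GDB.

```python
import os
import selectors


def run_event_loop(sources, max_events):
    """Collect events until `max_events` are handled or the sources go quiet.

    `sources` maps a label (for example "user" or "target") to a readable fd.
    """
    sel = selectors.DefaultSelector()
    for label, fd in sources.items():
        sel.register(fd, selectors.EVENT_READ, data=label)
    handled = []
    while len(handled) < max_events:
        ready = sel.select(timeout=1.0)
        if not ready:
            break  # nothing pending on any source
        for key, _ in ready:
            payload = os.read(key.fd, 4096)
            handled.append((key.data, payload))
    sel.close()
    return handled


user_r, user_w = os.pipe()
target_r, target_w = os.pipe()
os.write(target_w, b"thread 2 hit breakpoint")  # hypothetical event text
os.write(user_w, b"-thread-info")
events = run_event_loop({"user": user_r, "target": target_r}, max_events=2)
for fd in (user_r, user_w, target_r, target_w):
    os.close(fd)
```

Neither source is given priority: whichever descriptor becomes readable is serviced, so a command can be handled while threads are still running.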
P4) Stub for client's OS
This project will mostly be non-GDB work. However, there are some
changes to the remote protocol we would like to introduce at this
point:
The remote protocol presently leaves the process to be debugged
implicit; users generally specify it when they start the stub.
However, to satisfy our client's requirements, we must be able to
connect to a system, list the processes present, and attach to one
of them. This entails making some straightforward extensions to
the GDB remote protocol, and thus to GDB as well.
The stub for our client should use the 'library' stop reply
packets and the 'qXfer:libraries:read' packet to report load
module events. However, because the client's OS may bring each
section of a load module into memory at a different offset from
the VMA given in the ELF file, we will need to extend the format
of the library list the latter packet returns, as it currently
assumes that each library needs only one offset, and extend GDB to
allow each segment to appear at a different offset (C9).
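The per-segment relocation this calls for can be sketched as follows. The `Segment` type, the function name, and all addresses are hypothetical; the sketch only illustrates why a single offset per library is not enough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    vma: int      # start address assigned in the ELF file
    size: int     # length in bytes
    offset: int   # load address minus vma, for this segment only


def runtime_address(segments, file_addr):
    """Map an address from the ELF file to where it was actually loaded."""
    for seg in segments:
        if seg.vma <= file_addr < seg.vma + seg.size:
            return file_addr + seg.offset
    raise ValueError(f"address {file_addr:#x} is not in any segment")


# Hypothetical load module: text and data moved by different amounts.
module = [
    Segment(vma=0x1000, size=0x800, offset=0x40000),
    Segment(vma=0x2000, size=0x400, offset=0x7F000),
]
text_addr = runtime_address(module, 0x1010)
data_addr = runtime_address(module, 0x2008)
```

With one offset per library, at most one of these two addresses could be translated correctly.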
P6) Multi-threaded limited-multi-process native debugging
This project provides multi-threaded debugging of multiple
processes simultaneously. The debugger stops all threads in all
attached processes while the user inspects the state of any
thread. This work is independent of P1; we combine P1 and P6 in
the next project, P7.
At the user interface level:
- MI shall provide new commands to attach and detach a process;
unlike GDB's existing 'attach' and 'detach' commands, the new
'attach' command will not require GDB to detach from any
currently attached processes.
- MI shall provide a command to list all currently attached
processes.
- MI shall provide a command to list all the threads in a given
attached process.
- The output of the MI '-thread-info' and
'-thread-list-all-threads' commands shall include the process ID
of each thread listed. The process ID shall be a separate MI
'result' from the string provided by the
'target_extra_thread_info' function, so that Eclipse can access
it reliably.
- GDB shall stop all threads in all attached processes while
interacting with the user. Attaching to a process shall stop
all threads in that process. Detaching from a process shall
allow its threads to run again.
- MI shall report faults encountered by threads in any attached
process.
- MI shall report the termination of any attached process. After
such a report, GDB will no longer be attached to the process.
- MI's '-thread-select' command shall be able to select any thread
in any attached process.
- MI's existing breakpoint commands shall set breakpoints global
to all attached processes.
To support those facilities, we need the following changes:
- GDB shall maintain a table of attached processes. The remote
protocol shall provide packets directing the stub to attach to a
new process, and to detach from a currently attached process.
- The remote protocol shall carry process IDs as well as thread
IDs in stop reply packets, thread selection packets, thread
enumeration packets, and wherever else is appropriate.
- The stub shall use the current general thread (as given by the
'Hg' packet) to determine which process's memory to access, as
it does now to determine which thread's registers to access.
GDB shall send 'Hg' packets as necessary before memory accesses,
as it does now for register accesses.
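The following Python sketch models the stub-side bookkeeping these changes imply: a table of attached processes, thread IDs qualified by process ID, and memory reads routed through the current general thread. It is an illustration only; the class, its methods, and the PIDs are hypothetical, and no packet syntax is implied.

```python
class StubModel:
    """Toy model of a stub that tracks several attached processes."""

    def __init__(self):
        # pid -> {"threads": set of thread IDs, "memory": bytearray}
        self.processes = {}
        # (pid, tid) chosen by the last 'Hg'-style request, or None
        self.general_thread = None

    def attach(self, pid, tids, image):
        self.processes[pid] = {"threads": set(tids), "memory": bytearray(image)}

    def detach(self, pid):
        del self.processes[pid]
        if self.general_thread is not None and self.general_thread[0] == pid:
            self.general_thread = None

    def set_general_thread(self, pid, tid):
        # Plays the role of the 'Hg' packet, extended with a process ID.
        if tid not in self.processes[pid]["threads"]:
            raise KeyError((pid, tid))
        self.general_thread = (pid, tid)

    def read_memory(self, addr, length):
        # The general thread decides whose address space is read.
        pid, _ = self.general_thread
        return bytes(self.processes[pid]["memory"][addr:addr + length])


stub = StubModel()
stub.attach(100, {1, 2}, b"parent-data")
stub.attach(200, {1}, b"child-data")

stub.set_general_thread(100, 2)
from_parent = stub.read_memory(0, 6)

stub.set_general_thread(200, 1)
from_child = stub.read_memory(0, 5)

stub.detach(200)
```

The same address yields different bytes depending on the selected thread, which is why the thread selection has to be sent before each memory access to a different process.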
P7) Non-stop multi-threaded multi-process native debugging
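This project combines non-stop debugging with multi-process
debugging: GDB can attach to multiple processes simultaneously
while leaving threads it has not stopped free to run. This builds
on P1 and P6, and addresses C4 and C7.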
At the user interface level:
- MI shall provide a command to stop all the threads in a given
process, and a command to resume all stopped threads in a given
process.
Internally:
- The remote protocol shall provide a way to tell the stub to
leave other threads running after reporting an event in one
thread (non-stop behavior), and a way to tell the stub to stop
all threads when reporting an event in one thread (all-stop
behavior).
- The remote protocol shall allow the stub to respond to
commands while threads are running, and to report further thread
events after a thread has stopped. (This addresses C4.)
- The remote protocol shall provide ways to stop and start a
particular thread, and ways to start and stop all the threads in
a given process. The mechanisms for stopping threads and
processes shall allow GDB to behave correctly when a thread
stops or a process exits simultaneously with GDB sending the
command.
nathan
--
Nathan Sidwell :: http://www.codesourcery.com :: CodeSourcery