This is the mail archive of the
gdb@sourceware.org
mailing list for the GDB project.
Re: how to fix internal errors on connection to remote stub?
- From: Sandra Loosemore <sandra at codesourcery dot com>
- To: Paul Koning <paul_koning at dell dot com>
- Cc: "gdb at sourceware dot org" <gdb at sourceware dot org>, Luis Machado <lgustavo at codesourcery dot com>
- Date: Fri, 23 Jan 2015 12:37:03 -0700
- Subject: Re: how to fix internal errors on connection to remote stub?
- References: <54C2893B dot 5000900 at codesourcery dot com> <EBF95162-046E-4C79-8FA0-48D190939B38 at dell dot com>
On 01/23/2015 11:07 AM, Paul Koning wrote:
> If gdbserver is sending something that confuses gdb, the default
> answer is that this is a gdb bug (it should not fall over) and
> possibly in addition a gdbserver bug (it should obey the protocol
> spec). The reason I say “default answer” is because of the standard
> distributed systems rule that it’s always your bug if a received
> packet causes you to malfunction; the fact that the packet was
> invalid is not an excuse.
>
> You said that the stub is in an “inconsistent state”. I’m not sure
> about that. The target is stopped by the initial connection, and at
> that point you have a target thread, it’s stopped, it has registers,
> so it’s in some state that can be reported. Yes, that state has no
> connection to the program GDB knows about, because it’s not in the
> target yet. So the target might be in some boot loader or other bit
> of skeleton code, but it’s obviously executing something. So I don’t
> think “inconsistent” applies from the gdbserver point of view.

Hmmm, I'm not so sure about this. In the situations where we have been
hitting this problem, a more exact description of what is going on in
the stub is this: it previously completed normal execution of some
other program in a different gdb instance and sent a 'W' packet. When a
new gdb instance reconnects to the stub, the target is still sitting
stopped at the semihosting breakpoint that triggered the 'W' packet.
That's why I'm wondering whether the response it should be giving to the
initial '?' packet on the new connection should be 'W' ("the program has
exited and has no meaningful state any more") instead of 'S' ("the
program is stopped"). But GDB only accepts a 'W' reply to '?' in
extended-remote mode, which isn't supported by this stub.
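
To make the 'S' versus 'W' distinction concrete, here is a minimal sketch of how a stub might choose its reply to '?'. It is not taken from any real stub: the names frame and reply_to_status_query and the program_exited flag are hypothetical. The only protocol facts it relies on are the packet framing ($payload#checksum, with the checksum being the sum of the payload bytes modulo 256 as two hex digits) and the 'S AA' / 'W AA' stop-reply forms.

```python
def frame(payload: str) -> str:
    """Wrap an RSP payload as $<payload>#<two-hex-digit checksum>."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def reply_to_status_query(program_exited: bool, exit_code: int = 0,
                          signal: int = 5) -> str:
    """Hypothetical stub policy for answering the '?' packet."""
    if program_exited:
        # The previous program already ran to completion, so report an
        # exit status instead of pretending there is live program state.
        return frame(f"W{exit_code & 0xff:02x}")
    # Otherwise report the stop signal (5 is SIGTRAP).
    return frame(f"S{signal & 0xff:02x}")
```

As noted above, GDB only accepts the 'W' form in reply to '?' in extended-remote mode, so this shows what the stub would like to say rather than something that works with this stub today.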
> Instead, it seems that gdb, when it queries gdbserver for the stopped
> inferior state, gets back stuff that doesn’t fit in the program it’s
> been told about. But so what? That can happen in other places for
> other reasons, and gdb usually handles that just fine. Consider the
> “heuristic fencepost” machinery that protects from wild backtraces.
> So it seems that we just have some gaps in gdb’s robustness, and
> those are bugs that should be fixed.
>
> New commands or new protocol mechanisms don’t seem like the right
> answer; it’s not the user’s job to work around gdb bugs, nor is it
> gdbserver’s job to know that it is out of sync with gdb.

It does seem like GDB could do a better job here of checking that the
code in target memory (e.g., the instruction at the reported PC) matches
what's expected from the program it's trying to debug. While fixing
that might make these errors less likely, it wouldn't be as foolproof as
the stub simply telling GDB that it definitely has no useful program
state to report yet.
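
A rough sketch of the kind of check I mean, written as standalone Python rather than GDB code. Everything here is an assumption for illustration: pc_matches_image, the read_target callback, and the image dictionary (section load address mapped to section bytes) are hypothetical names, not GDB internals. The only protocol detail used is the 'm addr,length' memory-read packet, with both fields in hex.

```python
from typing import Callable, Dict


def read_memory_packet(addr: int, length: int) -> str:
    """Payload of the RSP 'm' packet: m<addr>,<length>, both in hex."""
    return f"m{addr:x},{length:x}"


def pc_matches_image(pc: int, length: int,
                     read_target: Callable[[int, int], bytes],
                     image: Dict[int, bytes]) -> bool:
    """Return True if target memory at pc equals the program image there.

    image maps each section's load address to its contents
    (a hypothetical layout chosen for this sketch).
    """
    for base, data in image.items():
        offset = pc - base
        if offset >= 0 and offset + length <= len(data):
            expected = data[offset:offset + length]
            return read_target(pc, length) == expected
    # The PC lies outside every section of the program GDB was told about.
    return False
```

A False result would only be a hint that the target is not running the expected program; it cannot prove that the reported state is meaningful, which is why it is weaker than the stub reporting that directly.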
-Sandra