This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Re: Racy failures on gdb.base/gdbinit-history.exp (native-extended-gdbserver/-m64)


On 08/17/2015 02:28 PM, Patrick Palka wrote:

> Ah, you already addressed this: a warning is not emitted because
> stdout is closed..

Yeah.

> But because the problem only occurs under extended-gdbserver, I'm
> inclined to think the issue is with the testsuite driver, in
> particular with the gdb_exit implementation in
> lib/gdbserver-support.exp.  One potential issue I notice in this proc
> is that when we send "monitor exit" to GDB, we don't necessarily wait
> for the command to finish (i.e. for the gdb prompt to get printed).
> As soon as the server is observed to get killed, we continue with
> exiting.  Dunno if that's substantial..

That's very plausible, at least.

Maybe that prompt got stuck in the expect buffer, and it confused
something else later on?
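
To make the stale-prompt theory concrete, here is a toy model in Python. It is not DejaGnu or expect code: ExpectBuffer and its methods are hypothetical names, and the only behaviour assumed is that an expect-style match consumes the buffer up to and including the first match.

```python
import re


class ExpectBuffer:
    """Toy stand-in for an expect-style match buffer (hypothetical, not DejaGnu code)."""

    def __init__(self):
        self.data = ""

    def feed(self, text):
        # Output arriving from the spawned process is appended to the buffer.
        self.data += text

    def expect(self, pattern):
        # Consume everything up to and including the first match and return
        # the consumed text; return None if the pattern has not arrived yet.
        m = re.search(pattern, self.data)
        if m is None:
            return None
        consumed = self.data[:m.end()]
        self.data = self.data[m.end():]
        return consumed


PROMPT = r"\(gdb\) "

buf = ExpectBuffer()
# "monitor exit" was sent, but nobody waited for the prompt that followed it.
buf.feed("(gdb) ")
# Later, another command is sent and its output arrives.
buf.feed("history line 1\n(gdb) ")

first = buf.expect(PROMPT)   # matches the leftover prompt, not the new output
second = buf.expect(PROMPT)  # only now is the real output consumed
print(repr(first))
print(repr(second))
```

In this model the first match returns immediately with no command output in it, which is the kind of off-by-one-prompt confusion that could make a later test step look at the wrong text.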

Another related theory: the new GDB started just as the previous gdb
was saving history and had momentarily renamed the history file to
gdbinit-history.gdb_history-gdb-$PID~.
But AFAICS, that shouldn't happen because that gdb_exit calls
gdbserver_orig_gdb_exit at the end, which only returns after
the previous gdb exits...
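
For reference, the window in that theory can be sketched like this. This is a simplified Python model of the rename/append/rename-back sequence as described in this message, not GDB's actual code; the helper names and the PID are made up, and the temporary name just follows the pattern quoted above.

```python
import os
import tempfile


def try_read_history(path):
    # Stand-in for a second gdb trying to load the history file at startup.
    try:
        with open(path) as f:
            return f.read().splitlines()
    except FileNotFoundError:
        return None


def save_history_step_by_step(history_path, pid, new_lines, observer):
    # Model of the save sequence: rename away, append, rename back.
    # `observer` runs while the file sits under its temporary name, to
    # simulate another process starting at exactly that moment.
    temp_path = f"{history_path}-gdb-{pid}~"
    os.rename(history_path, temp_path)
    seen = observer(history_path)
    with open(temp_path, "a") as f:
        f.writelines(new_lines)
    os.rename(temp_path, history_path)
    return seen


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gdbinit-history.gdb_history")
    with open(path, "w") as f:
        f.write("print 1\n")
    during = save_history_step_by_step(path, 1234, ["print 2\n"], try_read_history)
    after = try_read_history(path)

print(during)
print(after)
```

A reader that arrives inside the window finds no file at the expected name, while one that arrives after the save sees both entries. Whether a new gdb can actually start inside that window is exactly what is in doubt here.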

Did anyone ever manage to reproduce this?

One thing I'd try is making dejagnu's local_exec (close_wait_program in master)
print the result of the "wait -i".  That will show whether gdb exited
due to a normal exit, or whether it was killed by SIGTERM or SIGKILL.
And then I'd try hacking gdb_safe_append_history to output debug logs
to a file instead of stdout (e.g., /tmp/gdb-log).
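
As an illustration of the distinction that wait result would show, here is a rough Python analogue (POSIX only). It uses the subprocess module rather than expect's wait, and describe_exit is a hypothetical helper; the point is only how a normal exit differs from death by signal.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    # On POSIX, subprocess reports death by signal N as returncode -N.
    if returncode < 0:
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited normally with status {returncode}"


# A child that exits on its own.
normal = subprocess.run([sys.executable, "-c", "raise SystemExit(0)"])

# A child that is killed from outside while it is still running.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
child.terminate()  # sends SIGTERM on POSIX
child.wait()

print(describe_exit(normal.returncode))
print(describe_exit(child.returncode))
```

If the gdb in the failing runs turned out to be in the "killed by" category, that would point at the testsuite killing it before it finished writing history.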

Another thing to try would be adding a "show history filename" command
to the test, to make sure that the gdb that fails to load the previous
history actually tried to read the file we expect it to be reading.
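
The check itself amounts to pulling the path out of the command's output and comparing it with the expected one. A minimal Python sketch, assuming the output quotes the path in double quotes; the sample line and the function names are assumptions for illustration, not captured from a real run:

```python
import re


def history_filename_from_output(output):
    # Take the first double-quoted string in the output as the filename.
    # This avoids depending on the exact wording around it.
    m = re.search(r'"([^"]*)"', output)
    return m.group(1) if m else None


def reads_expected_file(output, expected_path):
    return history_filename_from_output(output) == expected_path


# Assumed sample output; the real wording may differ between gdb versions.
sample = ('The filename in which to record the command history is '
          '"/tmp/test/gdbinit-history.gdb_history".')

print(history_filename_from_output(sample))
print(reads_expected_file(sample, "/tmp/test/gdbinit-history.gdb_history"))
```

In the testsuite this would be a pattern match on the command's output rather than Python, but the comparison is the same.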

Also, I think it's time to try to get all the buildslaves to use
dejagnu master, to pick up http://lists.gnu.org/archive/html/dejagnu/2015-07/msg00005.html.
Who knows, maybe that race/rogue kill could also explain this problem.
The x86_64 Fedora slaves have been running with that fix for a while, and
we no longer see attach-many-short-lived-threads.exp failures there, while
we keep seeing them on the other slaves (which don't have that fix).

Thanks,
Pedro Alves

