This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: Racy failures on gdb.base/gdbinit-history.exp (native-extended-gdbserver/-m64)
- From: Sergio Durigan Junior <sergiodj at redhat dot com>
- To: Pedro Alves <palves at redhat dot com>
- Cc: Patrick Palka <patrick at parcs dot ath dot cx>, "gdb-patches\ at sourceware dot org" <gdb-patches at sourceware dot org>
- Date: Thu, 13 Aug 2015 20:46:06 -0400
- Subject: Re: Racy failures on gdb.base/gdbinit-history.exp (native-extended-gdbserver/-m64)
- Authentication-results: sourceware.org; auth=none
- References: <1433878062-23560-1-git-send-email-patrick at parcs dot ath dot cx> <1434466413-28892-1-git-send-email-patrick at parcs dot ath dot cx> <87mvym67i5 dot fsf_-_ at redhat dot com> <CA+C-WL-Ujsn1ccGNS-wD=RUdbJF1DcGxVYSLU6ubcGThdCQXYg at mail dot gmail dot com> <87zj1un7ro dot fsf at redhat dot com> <55CD27AE dot 2090002 at redhat dot com>
On Thursday, August 13 2015, Pedro Alves wrote:
> Here's a theory:
>
> (gdb) server show commands
> 1 set height 0
> 2 set width 0
> 3 target extended-remote localhost:2348
> 4 print 1
> 5 monitor exit
> 6 set height 0
> 7 set width 0
> 8 target extended-remote localhost:2349
> (gdb) PASS: gdb.base/gdbinit-history.exp: truncation: creating: server show commands
> Remote debugging from host 127.0.0.1
> monitor exit
> (gdb) spawn /home/gdb-buildbot/fedora-ppc64le-1/fedora-ppc64le-native-extended-gdbserver/build/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory /home/gdb-buildbot/fedora
> -ppc64le-1/fedora-ppc64le-native-extended-gdbserver/build/gdb/testsuite/../data-directory -x /home/gdb-buildbot/fedora-ppc64le-1/fedora-ppc64le-native-extended-gdbserver/bu
> ild/gdb/testsuite/outputs/gdb.base/gdbinit-history/gdbinit-history.gdbinit -ex set auto-connect-native-target off
> GNU gdb (GDB) 7.10.50.20150812-cvs
> ...
> (gdb) set height 0
> (gdb) set width 0
> (gdb) spawn ../gdbserver/gdbserver --once --multi :2350
> Listening on port 2350
> target extended-remote localhost:2350
> Remote debugging using localhost:2350
> (gdb) server show commands
> 1 set height 0
> 2 set width 0
> 3 target extended-remote localhost:2350
> (gdb) FAIL: gdb.base/gdbinit-history.exp: truncation: appending: server show commands
>
> Observations:
>
> #1 - In the FAILing case, the history only shows commands invoked in that
> gdb session. That means that either that GDB invocation failed to
> load the history file, or the previous invocation failed to write it.
Yeah, it seems to me it's the second option: GDB failed to write the
history file.
> #2 - dejagnu's standard_close, what is ultimately used to exit gdb,
> does this:
>
> 2.1 - send gdb a SIGINT. This might help by interrupting any command
> that gdb might be stuck in, getting it back to
> the event loop processing input.
>
> 2.2 - close gdb's ptty, in effect closing gdb's stdin/stdout.
>
> 2.3 - if gdb doesn't exit after 5 seconds, kills it
> with SIGTERM. If it still doesn't exit after another 5
> seconds, dejagnu kill it with SIGKILL.
>
> Unless you're running dejagnu git master, 2.1 and 2.2 happen
> in parallel.
>
> I'm assuming that 2.3 is not happening. It's 2.2 that normally
> causes gdb to exit. See event-top.c:stdin_event_handler.
>
> - Because, due to 2.2, stdin/stdout are already closed when gdb is
> running the teardown code (quit_force, what saves history), any
> warning/output that gdb might output goes nowhere, it won't
> appear in gdb.log.
>
> #3 - If you'll recall, not long ago we made the code that appends
> the commands to the history file roughly:
>
> - mv old-history-file old-history-file.tmp
> - append to old-history-file.tmp
> - mv old-history-file.tmp old-history-file
>
> That is top.c:gdb_safe_append_history.
>
> AFAICS, SIGINT is not set to SA_RESTART. Now imagine what happens
> if the SIGINT lands exactly just when gdb is about to call
> rename here, in gdb_safe_append_history:
>
> ret = rename (local_history_filename, history_filename);
> saved_errno = errno;
> if (ret < 0 && saved_errno != EEXIST)
> warning (_("Could not rename %s to %s: %s"),
> local_history_filename, history_filename,
> safe_strerror (saved_errno));
> }
>
> and rename fails with EINTR. Then the following gdb invocation
> would not find any previous history file to load, which would
> perfectly explain that FAIL.
>
> If we still have the build/test directory of that test run,
> if the theory is right, I think we'd find a
> leftover gdbinit-history.gdb_history-gdb-$PID~ file.
Unfortunately we don't have the build directory available anymore,
because BuildBot deletes it in the next build. Even if we did, it would
require permission/help from the buildslave admin to access it...
> What doesn't look right in the theory is that the rename man
> does not document rename as failing with EINTR... But
> maybe it does?
Yeah, well... I've made a few tests here, and I could not trigger EINTR
with rename. I have also looked at the Linux kernel source code and I
don't see (at least directly) a way for rename to error with EINTR:
<http://lxr.free-electrons.com/source/fs/namei.c#L4398>
It really is an atomic operation. Of course, there might be some quirks
on PPC GNU/Linux that I'm not aware of...
--
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF 31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/