This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Racy failures on gdb.base/gdbinit-history.exp (native-extended-gdbserver/-m64)


On Thursday, August 13 2015, Pedro Alves wrote:

> Here's a theory:
>
> (gdb) server show commands
>     1  set height 0
>     2  set width 0
>     3  target extended-remote localhost:2348
>     4  print 1
>     5  monitor exit
>     6  set height 0
>     7  set width 0
>     8  target extended-remote localhost:2349
> (gdb) PASS: gdb.base/gdbinit-history.exp: truncation: creating: server show commands
> Remote debugging from host 127.0.0.1
> monitor exit
> (gdb) spawn /home/gdb-buildbot/fedora-ppc64le-1/fedora-ppc64le-native-extended-gdbserver/build/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory /home/gdb-buildbot/fedora
> -ppc64le-1/fedora-ppc64le-native-extended-gdbserver/build/gdb/testsuite/../data-directory -x /home/gdb-buildbot/fedora-ppc64le-1/fedora-ppc64le-native-extended-gdbserver/bu
> ild/gdb/testsuite/outputs/gdb.base/gdbinit-history/gdbinit-history.gdbinit -ex set auto-connect-native-target off
> GNU gdb (GDB) 7.10.50.20150812-cvs
> ...
> (gdb) set height 0
> (gdb) set width 0
> (gdb) spawn ../gdbserver/gdbserver --once --multi :2350
> Listening on port 2350
> target extended-remote localhost:2350
> Remote debugging using localhost:2350
> (gdb) server show commands
>     1  set height 0
>     2  set width 0
>     3  target extended-remote localhost:2350
> (gdb) FAIL: gdb.base/gdbinit-history.exp: truncation: appending: server show commands
>
> Observations:
>
> #1 - In the FAILing case, the history only shows commands invoked in that
>   gdb session.  That means that either that GDB invocation failed to
>   load the history file, or the previous invocation failed to write it.

Yeah, it seems to me it's the second option: GDB failed to write the
history file.

> #2 - dejagnu's standard_close, what is ultimately used to exit gdb,
>   does this:
>
>  2.1 - send gdb a SIGINT.  This might help by interrupting any command
>    that gdb might be stuck in, getting it back to
>    the event loop processing input.
>
>  2.2 - close gdb's ptty, in effect closing gdb's stdin/stdout.
>
>  2.3 - if gdb doesn't exit after 5 seconds, kills it
>      with SIGTERM.  If it still doesn't exit after another 5
>      seconds, dejagnu kill it with SIGKILL.
>
>  Unless you're running dejagnu git master, 2.1 and 2.2 happen
>  in parallel.
>
>  I'm assuming that 2.3 is not happening.  It's 2.2 that normally
>  causes gdb to exit.  See event-top.c:stdin_event_handler.
>
> - Because, due to 2.2, stdin/stdout are already closed when gdb is
>   running the teardown code (quit_force, what saves history), any
>   warning/output that gdb might output goes nowhere, it won't
>   appear in gdb.log.
>
> #3 - If you'll recall, not long ago we made the code that appends
>      the commands to the history file roughly:
>
>    - mv old-history-file old-history-file.tmp
>    - append to old-history-file.tmp
>    - mv old-history-file.tmp old-history-file
>
> That is top.c:gdb_safe_append_history.
>
> AFAICS, SIGINT is not set to SA_RESTART.  Now imagine what happens
> if the SIGINT lands exactly just when gdb is about to call
> rename here, in gdb_safe_append_history:
>
>       ret = rename (local_history_filename, history_filename);
>       saved_errno = errno;
>       if (ret < 0 && saved_errno != EEXIST)
>         warning (_("Could not rename %s to %s: %s"),
> 		 local_history_filename, history_filename,
> 		 safe_strerror (saved_errno));
>     }
>
> and rename fails with EINTR.  Then the following gdb invocation
> would not find any previous history file to load, which would
> perfectly explain that FAIL.
>
> If we still have the build/test directory of that test run,
> if the theory is right, I think we'd find a
> leftover gdbinit-history.gdb_history-gdb-$PID~ file.

Unfortunately we don't have the build directory available anymore,
because BuildBot deletes it in the next build.  Even if we did, it would
require permission/help from the buildslave admin to access it...

> What doesn't look right in the theory is that the rename man
> does not document rename as failing with EINTR...  But
> maybe it does?

Yeah, well...  I've made a few tests here, and I could not trigger EINTR
with rename.  I have also looked at the Linux kernel source code and I
don't see (at least directly) a way for rename to error with EINTR:

  <http://lxr.free-electrons.com/source/fs/namei.c#L4398>

It really is an atomic operation.  Of course, there might be some quirks
on PPC GNU/Linux that I'm not aware of...

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]