This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Re: Racy failures on gdb.base/gdbinit-history.exp (native-extended-gdbserver/-m64) (was: Re: [PATCH] Don't truncate the history file when history size is unlimited)


On Thu, Jul 23, 2015 at 3:33 PM, Patrick Palka <patrick@parcs.ath.cx> wrote:
> On Thu, Jul 23, 2015 at 2:42 PM, Sergio Durigan Junior
> <sergiodj@redhat.com> wrote:
>> On Tuesday, June 16 2015, Patrick Palka wrote:
>>
>>> We still do not handle "set history size unlimited" correctly.  In
>>> particular, after writing to the history file, we truncate the history
>>> even if it is unlimited.
>>>
>>> This patch makes sure that we do not call history_truncate_file() if the
>>> history is not stifled (i.e. if it's unlimited).  This bug causes the
>>> history file to be truncated to zero on exit when one has "set history
>>> size unlimited" in their gdbinit file.  Although this code exists in GDB
>>> 7.8, the bug is masked by a pre-existing bug that's been only fixed in
>>> GDB 7.9 (PR gdb/17820).
>>
>> Hey Patrick,
>>
>> Looking at the BuildBot logs today, I found that this new test is
>> failing occasionally on native-extended-gdbserver testing.  Take a look
>> at the following build:
>>
>>   <http://gdb-build.sergiodj.net/builders/Debian-x86_64-native-extended-gdbserver-m64/builds/1429>
>>
>> You can see that gdb.base/gdbinit-history.exp failed:
>>
>>   PASS -> FAIL: gdb.base/gdbinit-history.exp: truncation: appending: server show commands
>>   PASS -> FAIL: gdb.base/gdbinit-history.exp: truncation: creating: server show commands
>>
>> The gdb.log is here:
>>
>>   <http://gdb-build.sergiodj.net/cgit/Debian-x86_64-native-extended-gdbserver-m64/.git/plain/gdb.log?id=2abe37b834f73838c68e1f843bdd612cef4a2ae3>
>>
>> I haven't really investigated to determine what's going on here, but let
>> me know if you need any help with this.
>
> Thanks for the heads up.
>
> When doing gdb_exit followed by gdb_start, in the output log sometimes
> we have (this is printed shortly before the first FAIL)
>
> (gdb) ...
> Remote debugging from host 127.0.0.1
> monitor exit
> spawn ...
> GNU gdb (GDB) 7.10.50.20150723-cvs
> ...
>
> Other times we have (this is printed shortly before the second FAIL)
>
> (gdb) ...
> Remote debugging from host 127.0.0.1
> monitor exit
> (gdb) spawn ...
> GNU gdb (GDB) 7.10.50.20150723-cvs
> ...
>
> The literal difference being the "(gdb) " prompt printed before the
> "spawn" message.  In the first case (where the "(gdb) " prefix is not
> there) the history file does not seem to be written/appended to.  In
> the second case (when the "(gdb) " prefix is there) the history file
> is properly written/appended to (but it still FAILs because we're
> missing the command history from before the first case).  So the race,
> if there is one, may have something to do with whether or not the
> "(gdb) " prompt gets printed after doing "monitor exit".  Or maybe
> not.  I'll do more analysis later.

After further analysis I don't think there is any correlation between
whether "(gdb) spawn ..." or "spawn ..." gets printed and whether a
race between gdb_exit and gdb_start occurs.  In fact, I don't think
there even is a race between gdb_exit and gdb_start.  If I add
"after 100" (i.e. Tcl's way of sleeping for 100ms) between the two
calls (i.e. gdb_exit; after 100; gdb_start) in gdbinit-history.exp, I
can still reproduce the intermittent FAILs.

So it may be the case that sometimes gdb_exit does not kill the GDB
process properly, and when that happens the process doesn't get a
chance to save to the history file.  And it only happens with
extended-gdbserver, not with gdbserver or non-gdbserver testing.  A
code path taken only by the extended-gdbserver target is the code in
gdbserver-support.exp:gdb_exit guarded by "if {[info exists
gdb_spawn_id] && [info exists server_spawn_id]}", and then the
close_gdbserver proc that follows.  However, if I just outright delete
that code (which should make the exit logic nearly identical to that
of the gdbserver target) I can still trigger the intermittent FAILs...
Maybe it's an issue in dejagnu?  Or could it be an obscure bug in GDB?
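
To make the hypothesis above concrete (a process that is killed instead
of exiting cleanly never gets to save its history), here is a minimal
Python sketch.  It is not GDB or testsuite code: the child script, the
"history" file name and the history_written helper are all made up for
illustration, and it only models the general mechanism, not what
gdb_exit actually does.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for GDB: it saves its "history" only from an
# exit handler, so the file appears only if the process exits cleanly.
CHILD_SOURCE = """
import atexit, sys, time

path = sys.argv[1]

def save():
    with open(path, "w") as f:
        f.write("show version\\n")

atexit.register(save)
print("ready", flush=True)   # tell the parent the handler is installed
if sys.argv[2] == "wait":
    time.sleep(60)           # stay alive until the parent kills us
"""


def history_written(kill_hard: bool) -> bool:
    """Run the stand-in child and report whether it saved its history."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "history")
        mode = "wait" if kill_hard else "exit"
        with subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE, path, mode],
            stdout=subprocess.PIPE,
        ) as child:
            child.stdout.readline()  # wait for "ready"
            if kill_hard:
                child.kill()         # forced termination: exit handlers do not run
            child.wait()
        return os.path.exists(path)


if __name__ == "__main__":
    print("clean exit saved history:", history_written(kill_hard=False))
    print("hard kill saved history: ", history_written(kill_hard=True))
```

If something similar happens to GDB in the failing runs, the history
file would be missing exactly the commands from the session that was
killed, which would match the symptom described earlier in the thread.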

