This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.

Re: "finish" command leads to SIGTRAP

From: John Baldwin <jhb at FreeBSD dot org>
To: Pedro Alves <palves at redhat dot com>, David Griffiths <dgriffiths at undo dot io>
Cc: gdb at sourceware dot org
Date: Fri, 22 Feb 2019 08:41:54 -0800
Subject: Re: "finish" command leads to SIGTRAP
References: <CA++j6c7bhP7=AWWdzBajC7KUah1gwj25-Bpoqd1-Xs_67z5kVw@mail.gmail.com> <b437759f-5223-fcd2-1e68-f77051a2f910@redhat.com> <CA++j6c4co2Uz=3q952pf3skq8xMdsmQ_4+H6eMBkAdvyXMzK8A@mail.gmail.com> <CA++j6c6A0DMRwOGBJ4XGGWnC49D1Wi9J9kz6kzSmaj+kiJXo-w@mail.gmail.com> <78e1f522-f6f5-d38d-0644-d083c1e4ab5d@redhat.com> <CA++j6c7Vazk-A3e8T5xzMNJzhorB9DW-y8Hye7v-rwUKVnEzgA@mail.gmail.com> <743edbbc-9812-c8e7-0f47-7b4842199b48@redhat.com> <863e96ac-83c4-feb3-e412-95b647d18201@FreeBSD.org> <9e24e676-ff37-bf4b-3fd0-a9fda0798abb@redhat.com> <6c1aae34-5bce-a524-11d6-0e12b53b9ac2@FreeBSD.org> <257d6eda-21b5-970f-0fe4-f96fefe56a44@redhat.com>

On 2/22/19 7:09 AM, Pedro Alves wrote:
> On 02/21/2019 08:49 PM, John Baldwin wrote:
>> On 2/21/19 11:34 AM, Pedro Alves wrote:
>>> #3 - have gdb always clear TF after a single-step.  This is the
>>>    easiest, even if the "less technically cool" solution.  This
>>>    would mean that it'd be impossible to debug a program that
>>>    sets the trace flag manually.  I actually once co-wrote
>>>    an in-process x86 debug stub, and in that use case
>>>    preserving TF mattered, made it possible to debug that
>>>    stub...  Quite a niche use case, though, and it'd have been
>>>    trivial for me to hack gdb for that special use case, of course.
>>>
>>> In order for GDB to know whether it is stepping a pushf instruction,
>>> it needs to read the memory at PC, which has a cost, but maybe it's
>>> negligible if we already end up reading memory anyway (because of the
>>> code cache), but I'm not sure we already do.  This can have a more
>>> noticeable effect with remote debugging (which should weigh on whether
>>> to do the workaround at the infrun.c level, or in the target backend (thus
>>> in gdbserver when remote)).
>>>
>>> Solution #3 would require extra ptrace commands anyway (read-modify-write
>>> the flags), so it may end up being less performant, if #1 and #2 already
>>> hit the code cache.
>>>
>>> There are some extra complications around #1 and #2 for gdbserver,
>>> because we need to consider the cases when gdbserver handles 
>>> single-stepping without roundtripping to gdb:
>>>
>>>   - range-stepping
>>>   - stepping over breakpoints/tracepoints
>>
>> Hmmm, I will probably try to fix (or get someone else to fix) FreeBSD's
>> kernel regardless probably using the approach in #1.  For GDB itself, I
>> probably have a slight preference for #2 over #1, but I haven't yet worked
>> with gdbserver, so I'd defer to you on if #3 is the best solution when
>> taking gdbserver into account.  If the edge case of #3 matters, (which might
>> matter for some other things like some language runtimes that set TF and use
>> SIGTRAP handlers that motivated FreeBSD's kernel changes last year), we
>> could perhaps provide a way for targets to override #3 if they know they
>> don't need it (e.g. a native target under a kernel known to work).  Not
>> sure how that would work over remote (e.g. if you would want gdbserver to
>> internalize this behavior so that only it deals with it and hides it from
>> the remote debugger).
> 
> I'd prefer #1 or #2 over #3.  As for gdbserver, the thing is that whatever
> solution we implement in gdb isn't going to fix gdbserver, gdbserver
> needs fixing as well.  gdbserver has its own run control loop that does
> single-stepping behind gdb's back.  The most common case nowadays is
> range-stepping.  When you do "next", or "step", as an optimization, gdb
> tells gdbserver to single-step as long as the PC is within an address range
> (the continuous address range that corresponds to the current line
> that includes PC).  gdbserver then continually single-steps, and only
> reports back a stop to GDB once the PC leaves the range.  This avoids
> many roundtrips between gdb and gdbserver.  This means that gdbserver
> must have some workaround too.  For this case alone, we could just
> make gdbserver punt and report a stop to gdb if the next instruction is
> a pushf (gdb continues stepping itself, which would trigger the workaround).
> BUT, that wouldn't address the less frequent case -- tracepoints:
> gdbserver needs to step over them without gdb involvement, and needs to
> implement while-stepping actions.  So here we can't punt to gdb, there
> may not even be one connected!  So we need a full workaround
> in gdbserver.
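
To make the "punt on pushf" idea for range-stepping concrete, here is a
rough sketch of the per-step decision.  It is not gdbserver code: the names
(is_pushf, next_range_step, range_step_decision) are made up for
illustration, the range is assumed to be [start, end), and the decoder only
recognizes the plain pushf opcode (0x9c) with an optional 0x66
operand-size prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: does the instruction at PC look like pushf?
// Only handles opcode 0x9c with an optional 0x66 operand-size prefix.
inline bool is_pushf(const std::uint8_t *insn, std::size_t len)
{
  std::size_t i = 0;
  if (i < len && insn[i] == 0x66)
    ++i;                       // skip operand-size prefix
  return i < len && insn[i] == 0x9c;
}

// Hypothetical outcome of one iteration of a range-stepping loop.
enum class range_step_decision { keep_stepping, report_stop };

// Decide what to do after a single-step landed at PC.
// Assumption: the stepping range is [start, end), end exclusive.
inline range_step_decision
next_range_step(std::uint64_t pc, std::uint64_t start, std::uint64_t end,
                const std::uint8_t *insn, std::size_t len)
{
  if (pc < start || pc >= end)
    return range_step_decision::report_stop;   // left the line's range
  if (is_pushf(insn, len))
    return range_step_decision::report_stop;   // punt: let gdb step it
  return range_step_decision::keep_stepping;
}

// Small usage check of the decision logic.
inline void example_usage()
{
  const std::uint8_t pushf[] = {0x9c};
  const std::uint8_t nop[] = {0x90};
  assert(next_range_step(0x1004, 0x1000, 0x1010, pushf, 1)
         == range_step_decision::report_stop);
  assert(next_range_step(0x1004, 0x1000, 0x1010, nop, 1)
         == range_step_decision::keep_stepping);
}
```

As noted above, this only covers range-stepping.  It does nothing for
tracepoints or while-stepping, where there is no gdb to punt to.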

I thought of one more issue with #3: clearing TF after each step isn't
necessarily enough.  The way I reproduced this with the test program was to
si over the pushf and then do a continue.  That meant we weren't stepping
when the popf executed, so the instruction after the popf raised a spurious
SIGTRAP.  At that point, the thread's current state isn't stepping.

One way to handle this might be to determine specifically that a SIGTRAP was
a step trap and, if you get an unexpected one, resume the thread anyway
(possibly clearing TF as part of the resume).  That wouldn't be hard to do in
individual native targets, where you have the siginfo for the SIGTRAP.  I
think it's harder to do at a higher layer.

One thing I've wondered about while adding the siginfo parsing for the
FreeBSD native target is whether it would be nicer if a target could return
more fine-grained waitkinds, something like TARGET_WAITKIND_STEPPED,
TARGET_WAITKIND_SW_BREAKPOINT, etc.  Today we require the various methods
like 'supports_stopped_by_sw_breakpoint' and 'stopped_by_sw_breakpoint', and
assume that a SIGTRAP is a step if the current thread is stepping and none of
the other 'stopped_by_foo' methods return true.  You could perhaps still have
a fallback for TARGET_WAITKIND_STOPPED that uses the same heuristics, for
targets that don't parse siginfo to infer the more detailed stop type.
Having that detail at a higher level would make it easier to recognize
spurious step traps in the core, I think.  That's probably too big a change
just to work around this issue, but it's a thought I've had for a while.
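
Very roughly, and with entirely made-up names (stop_kind, thread_state,
stop_action, handle_stop and clear_tf are not existing GDB interfaces), the
core-side check I have in mind would look something like the sketch below.
The only hard fact in it is that TF is bit 8 (0x100) of EFLAGS.

```cpp
#include <cassert>

// Hypothetical fine-grained stop kinds a target could report
// instead of a bare "stopped with SIGTRAP".
enum class stop_kind { stepped, sw_breakpoint, hw_breakpoint,
                       watchpoint, other_signal };

// Hypothetical slice of per-thread state: did we ask for a step?
struct thread_state { bool stepping; };

// Hypothetical outcome for the core's run control.
enum class stop_action { report_stop, resume_clearing_tf };

// x86 trap flag: bit 8 of EFLAGS.
constexpr unsigned long X86_EFLAGS_TF = 0x100;

// A step trap on a thread that was not stepping is treated as
// spurious: resume it, clearing TF on the way.
inline stop_action handle_stop(stop_kind kind, const thread_state &t)
{
  if (kind == stop_kind::stepped && !t.stepping)
    return stop_action::resume_clearing_tf;
  return stop_action::report_stop;
}

// Read-modify-write helper for the flags register value.
inline unsigned long clear_tf(unsigned long eflags)
{
  return eflags & ~X86_EFLAGS_TF;
}

// Small usage check: "si" over pushf, then "continue" past popf.
inline void example_usage()
{
  thread_state continuing{false};
  assert(handle_stop(stop_kind::stepped, continuing)
         == stop_action::resume_clearing_tf);
  assert(clear_tf(0x346) == 0x246);
}
```

Note this has the same downside as #3: a program that sets TF on purpose
would have its trap swallowed, so it would probably need the same kind of
per-target override.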

-- 
John Baldwin
