This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFA] (x86) Fix watchpoint using hardware breakpoint for some distro


Hi Xavier,

On 03/16/2018 02:06 PM, Xavier Roirand wrote:
> Running watch*.exp tests in gdb.base shows this:
> 
> on x86_64/Ubuntu 16.04:
> 
>  # of expected passes            2631
>  # of unexpected failures        0
> 
> on x86_64/Ubuntu CentOS 5.11:
> 
>  # of expected passes            2535
>  # of unexpected failures        96
> 
> The problem can be easily shown in a debug session:
> 
> (gdb) watch val
> Hardware watchpoint 2: val
> (gdb) c
> Continuing.
> Program received signal SIGTRAP, Trace/breakpoint trap.
> ...
> 
> Whereas it should be:
> 
> (gdb) watch val
> Hardware watchpoint 2: val
> (gdb) c
> Continuing.
> val before change = 0
> 
> Hardware watchpoint 2: val
> 
> Old value = ...
> New value = ...
> 
> The Linux target and gdbserver now check the siginfo si_code
> reported on a SIGTRAP to detect whether the trap indicates
> a hardware breakpoint was hit.
> 
> Unfortunately, on some distro (CentOS 5, Suse 11) the returned
> si_code value is not equal to TRAP_HWBKPT when a hardware breakpoint
> is hit thus the hardware breakpoint is not handled as it should
> be, which is also the case for watchpoint when based on hardware
> breakpoint.

A few things are missing here:

#1 - kernel versions where this was observed.

#2 - If it's not equal to TRAP_HWBKPT, then what's it equal to?
     I assume zero?

Take a look at the big comment and table in nat/linux-ptrace.h -- is
this is the only case that is different on these kernels?  

I think that we should update the table a bit here, at least
something like:

 -   | hardware breakpoints/watchpoints         | TRAP_HWBKPT |
 +   | hardware breakpoints/watchpoints         | TRAP_HWBKPT (*) |

 (*) - Kernels x.y.z (CentOS 5, Suse 11) leave this as zero.

If other cases are different, then that might affect how to best
address this.

> 
> This patch adds an additional check when the inferior reported
> a SIGTRAP in order to detect this case.
> 
> No test have been created since all the existing ones are enough
> to validate the fix. BTW, with this fix, the tests results for
> the watchpoint tests are (for CentOS 5.11):
> 
>  # of expected passes            2630
>  # of unexpected failures        1
> 
> The remaining failure is located in watch-vfork test which explicitly
> disable the use of hardware breakpoint which is out of scope here.
> 
> gdb/ChangeLog:
> 
>         * linux-nat.c (save_stop_reason): Add an additional check
>         to detect hardware watchpoint.
> 
> gdbserver/ChangeLog:
> 
>         * linux-low.c (save_stop_reason): Add an additional check
>         to detect hardware watchpoint.
> 
> For R309-004
> 
> Test: x86_64/gdb      /ubuntu 16.04
>       x86_64/gdbserver/ubuntu 16.04
>       x86   /gdbs     /ubuntu 16.04
>       x86   /gdbserver/ubuntu 16.04
>       x86_64/gdb      /centos 5.11
>       x86_64/gdbserver/centos 5.11
>       x86   /gdb      /centos 5.11
>       x86   /gdbserver/centos 5.11
> 
> Change-Id: I2546aca9827d9ae12ab86deb7aa4acc60c82b4b4
> ---
>  gdb/gdbserver/linux-low.c | 7 +++++++
>  gdb/linux-nat.c           | 7 +++++++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
> index 2e5e19d..fe61026 100644
> --- a/gdb/gdbserver/linux-low.c
> +++ b/gdb/gdbserver/linux-low.c
> @@ -866,6 +866,13 @@ save_stop_reason (struct lwp_info *lwp)
>  	      if (!check_stopped_by_watchpoint (lwp))
>  		lwp->stop_reason = TARGET_STOPPED_BY_SINGLE_STEP;
>  	    }
> +	  else
> +	    {
> +	      /* The si_code is not TRAP_HWBKPT whereas it should
> +	         on some distro (CentOS 5, Suse 11) so still check
> +	         if stopped due to watchpoint.  */
> +	      check_stopped_by_watchpoint (lwp);
> +	    }
>  	}
>      }
>  #else
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index 1bbad7b..0d97aa1 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -2798,6 +2798,13 @@ save_stop_reason (struct lwp_info *lp)
>  		 the debug registers separately.  */
>  	      check_stopped_by_watchpoint (lp);
>  	    }
> +	  else
> +	    {
> +	      /* The si_code is not TRAP_HWBKPT whereas it should
> +	         on some distro (CentOS 5, Suse 11) so still check
> +	         if stopped due to watchpoint.  */

This comment only makes complete sense with the context in the
git log in mind:

- This code is run by all architectures, so the comment should mention x86.
- The comment reads a bit backwards to me -- talks about what it should
  be before talking about watchpoints.

I'd suggest something like this:

   /* On some kernels (such as x86-64 x.y.z/CentOS 5, x.y.z/Suse 11),
      when we continue into a watchpoint, si_code indicates 0 instead of
      TRAP_HWBKPT so we need to check debug registers separately.  */

Does the step-into-watchpoint case result in TRAP_TRACE, or does
that result in 0 too?  That affects the "continue" in the comment above.

Thanks,
Pedro Alves


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]