This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Re: [PATCH] Displaced stepping (non-stop debugging) support for ARM Linux

From: Daniel Jacobowitz <drow at false dot org>
To: Julian Brown <julian at codesourcery dot com>
Cc: gdb-patches at sourceware dot org, pedro at codesourcery dot com
Date: Tue, 9 Jun 2009 13:37:09 -0400
Subject: Re: [PATCH] Displaced stepping (non-stop debugging) support for ARM Linux
References: <20090120221355.46ac23e6@rex.config> <20090202200107.GA26459@caradoc.them.org> <20090516191910.14820805@rex.config>
On Sat, May 16, 2009 at 07:19:10PM +0100, Julian Brown wrote:
> Pedro Alves wrote:
> 
> > Right, you may end up with a temporary breakpoint over another
> > breakpoint, though.  It would be better to use the standard software
> > single-stepping (set temp break at next pc, continue, remove break)
> > for standard stepping requests, and use displaced stepping only for
> > stepping over breakpoints.  Unfortunately, you don't get that for
> > free --- infrun.c and friends don't know how to handle multiple
> > simultaneous software single-stepping requests, and that is required
> > in non-stop mode.
> 
> I'm not sure what the status is here now. For testing purposes, I've
> (still) been using a local patch which uses displaced stepping for all
> single-step operations.

We still can't use software single-stepping simultaneously in multiple
threads.  Pedro, should we fix that or always use displaced stepping
for now?

> Daniel Jacobowitz <drow@false.org> wrote:
> 
> > * What's the point of executing mov<cond> on the target for BL<cond>?
> > At that point it seems like we ought to skip the target step entirely;
> > just simulate the instruction.  We've already got a function to check
> > conditions (condition_true).
> 
> I'm now using NOP instructions and condition_true, because the current
> displaced stepping support wants to execute "something" rather than
> nothing.

>From infrun.c:

   One way to work around [software single stepping]...
   would be to have gdbarch_displaced_step_copy_insn fully
   simulate the effect of PC-relative instructions (and return NULL)
   on architectures that use software single-stepping.

So the interface you need is there; it's just not implemented yet:

  /* We don't support the fully-simulated case at present.  */
  gdb_assert (closure);

I think the implementation strategy will look like:

  * Add another non-zero return value from displaced_step_prepare.

  * Update should_resume after the call, in resume (currently unused).

  * Ask Pedro how to pretend that the inferior resumed and stopped,
  for higher levels.  I think this will entail a new queue.  Bonus
  points if prepare_to_wait and wait_for_inferior do not invalidate
  the perfectly good register cache at this point.

Pedro, thoughts - easy or should we stick with the NOP workaround for
now?

I know that some debug interfaces simulate instructions aggressively,
to save the target round trip.

> > Yes, we just can't emulate loads or stores.  Anything that could cause
> > an exception that won't be delayed till the next instruction, I think.
> 
> LDM and STM are handled substantially differently now: STM instructions
> are let through unmodified, and when PC is in the register list the
> cleanup routine reads back the stored value and calculates the proper
> offset for PC writes. The true (non-displaced) PC value (plus offset) is
> then written to the appropriate memory location.

I am not happy about creating additional memory writes from GDB, but I
agree that for the special case of the PC this is good enough.

> LDM instructions shuffle registers downwards into a contiguous list (to
> avoid loading PC directly), then fix up register contents afterwards in
> the cleanup routine. The case with a fully-populated register list is
> still emulated, for now.

This seems OK too, by the same logic.

> > > +static int
> > > +copy_svc (unsigned long insn, CORE_ADDR to, struct regcache *regs,
> > > +	  struct displaced_step_closure *dsc)
> > > +{
> > > +  CORE_ADDR from = dsc->insn_addr;
> > > +
> > > +  if (debug_displaced)
> > > +    fprintf_unfiltered (gdb_stdlog, "displaced: copying svc insn
> > > %.8lx\n",
> > > +			insn);
> > > +
> > > +  /* Preparation: tmp[0] <- to.
> > > +     Insn: unmodified svc.
> > > +     Cleanup: if (pc == <scratch>+4) pc <- insn_addr + 4;
> > > +	      else leave PC alone.  */
> > 
> > What about the saved PC?  Don't really want the OS service routine to
> > return to the scratchpad.
> > 
> > > +  /* FIXME: What can we do about signal trampolines?  */
> > 
> > Maybe this is referring to the same question I asked above?
> > 
> > If so, I think you get to unwind and if you find the scratchpad,
> > update the saved PC.
> 
> I've tried to figure this out, and have totally drawn a blank so far.
> AFAICT, the problem we're trying to solve runs as follows: sometimes, a
> signal may be delivered to a process whilst it is executing a system
> call. In that case, the kernel writes a signal trampoline to the user
> program's stack space, and rewrites the state so that the trampoline is
> executed when the system call returns.

No, not quite.  That may be how things work on other platforms - this
varies quite a bit, and I'm only familiar with the Linux
implementation - but Linux does it differently.  There are no signal
*call* trampolines, only signal *return* trampolines.

First we step over a system call instruction.  An arbitrary amount of
work happens.  A signal may be received; if so, the process should
be stopped for GDB to inspect.  Here's where the first weird
single-stepping thing happens; we don't know if a handler is installed
for this signal, and we can't figure it out without races, so we don't
know where a single step would send us.  The kernel handles this on
platforms with PTRACE_SINGLESTEP.  Several GDB tests fail because of
this problem.  Anyway, that's a digression.

If a signal was received and that signal has a handler, then when the
system call finishes (either early, or at the normal time) then the
kernel sets up a trampoline.  The program state is saved on the stack,
the PC is changed to point at the handler, and the LR is changed to
point at the restorer.  The restorer can be some instructions on the
stack or a function in the C library with SA_RESTORER; I believe ARM
GLIBC uses SA_RESTORER exclusively.  The PC in the saved state
may point at the system call, before it, or after it - this depends
whether the system call completed or needs to be restarted.
Then the program resumes, runs the handler, and eventually calls
sigreturn.

Sigreturn is another special case since it changes sp/pc and does not
return to the following instruction.

I don't know if you can end up needing to restart a system call even
if no signal was received, but I don't think so.  Every architecture
has a restart convention.  In some cases it backs up one instruction,
to the system call; this is usually used if the system call
instruction encodes the syscall number, or if it's in a different
register than error codes.  Other architectures back up two
instructions and mandate a two-instruction system call sequence.  ARM
backs up one instruction; MIPS usually backs up two (not always, see
kernel/signal.c).

If we step over a system call, and reach the breakpoint at the next
instruction, we're golden.  If we receive a signal instead we have to
determine if the system call completed or not, probably based on the
apparent PC address.  We should be able to unclobber the PC here,
by a constant relative distance from the original instruction to the
scratchpad instruction.

I'll make this more concrete.  I took your test program and did this:

(gdb) b func2
Breakpoint 2 at 0x8644: file signal-step.c, line 48.
(gdb) b main
Breakpoint 3 at 0x84d8: file signal-step.c, line 14.
(gdb) r
Starting program: /home/dan/signal-step

Breakpoint 3, main (argc=1, argv=0xbeb798b4) at signal-step.c:14
14        printf ("Starting execution\n");
(gdb) x/i 0x400b5d78
0x400b5d78 <pause+24>:  svc     0x00000000
(gdb) b *0x400b5d78
Breakpoint 5 at 0x400b5d78
(gdb) c
Continuing.
Starting execution
sigaction() successful. Now sleeping

[Send signal 1 from other terminal]

Program received signal SIGHUP, Hangup.
0x400b5dfc in nanosleep () from /lib/libc.so.6
(gdb) x/2i $pc - 4
0x400b5df8 <nanosleep+24>:      svc     0x00000000
0x400b5dfc <nanosleep+28>:      mov     r7, r12
(gdb) p (int) $r0
$3 = -516

[Notice, we're apparently after the system call here... but $r0 is
ERESTART_RESTARTBLOCK.  So apparently at this point we are going
to restart the system call but have not adjusted the PC to do so
yet.  You can see from the kernel that because there is a handler,
ERESTART_RESTARTBLOCK and ERESTARTNOHAND will end up changing r0
to EINTR before running the signal handler.  ERESTARTSYS and
ERESTARTNOINTR could potentially adjust the PC to restart the
system call.  That PC adjustment has not happened yet.  If you're
looking at arch/arm/kernel/signal.c, the point where the debugger gets
control is inside get_signal_to_deliver; the PC is adjusted later in
do_signal if there was no handler, and in handle_signal if there was.

Anyway, the important point is that at this moment we don't know
whether the system call instruction will be executed again.  If
we adjust the PC back to the non-scratchpad location, the next
instruction could be any of:

* (Unknown) signal handler
* System call
* Instruction after system call

Complicated.]

(gdb) c
Continuing.
Signal Handler: sig=1 scp=0xbef45190
siginfo.si_signo=1
siginfo.si_errno=0
siginfo.si_code=0

Breakpoint 5, 0x400b5d78 in pause () from /lib/libc.so.6
(gdb) si

[No prompt.  We're inside pause now.  Send signal 2 from another
window.]

Program received signal SIGINT, Interrupt.
0x400b5d7c in pause () from /lib/libc.so.6
(gdb) p (int) $r0
$4 = -514

[We're after the syscall instruction now.  This is ERESTARTNOHAND, for
the semantics of pause - only return after a signal handler is run.]

(gdb) handle SIGINT pass
SIGINT is used by the debugger.
Are you sure you want to change it? (y or n) y
Signal        Stop      Print   Pass to program Description
SIGINT        Yes       Yes     Yes             Interrupt
(gdb) set $pc = 0x0
(gdb) c
Continuing.

Breakpoint 2, func2 (sig=2, sinf=0xbe989e00, foo=0xbe989e80) at
signal-step.c:48
48        printf ("Signal Handler: sig=%d scp=%p\n", sig, sinf);
(gdb) bt
#0  func2 (sig=2, sinf=0xbe989e00, foo=0xbe989e80) at signal-step.c:48
#1  <signal handler called>
#2  0x00000000 in ?? ()
#3  0x000085f8 in func (sig=1, sinf=0xbe98a190, foo=0xbe98a210) at signal-step.c:40
#4  <signal handler called>
#5  0x400b5dfc in nanosleep () from /lib/libc.so.6
#6  0x400b5ba8 in sleep () from /lib/libc.so.6
#7  0x00008568 in main (argc=1, argv=0xbe98a8b4) at signal-step.c:25

See what's happened?  My adjustment to the PC, at the time GDB sees
the signal, has ended up in the saved signal context.  So we need to
take advantage of that to remove the scratchpad from the signal
context.

Let me know if that session log straightened things out, or if my
explanation just confused the issue hopelessly.

> I've hit some problems testing this patch, mainly because I can't seem
> to get a reliable baseline run with my current test setup. AFAICT, there
> should be no affect on behaviour unless displaced stepping is in use
> (differences in passes/failures with my patch only seem to be in
> "unreliable" tests, after running baseline testing three times), and of
> course displaced stepping isn't present for ARM without this patch
> anyway.

What sort of tests are fluctuating?  Our internal tree has at least
one testsuite reliability fix that I hadn't gotten round to posting
for the FSF tree yet, I'll do it now.

> +static struct displaced_step_closure *
> +arm_catch_kernel_helper_return (CORE_ADDR from, CORE_ADDR to,
> +				struct regcache *regs)
> +{
> +  struct displaced_step_closure *dsc
> +    = xmalloc (sizeof (struct displaced_step_closure));
> +
> +  dsc->numinsns = 1;
> +  dsc->insn_addr = from;
> +  dsc->cleanup = &cleanup_kernel_helper_return;
> +  /* Say we wrote to the PC, else cleanup will set PC to the next
> +     instruction in the helper, which isn't helpful.  */
> +  dsc->wrote_to_pc = 1;
> +
> +  /* Preparation: tmp[0] <- r14
> +                  r14 <- <scratch space>+4
> +		  *(<scratch space>+8) <- from
> +     Insn: ldr pc, [r14, #4]
> +     Cleanup: r14 <- tmp[0], pc <- tmp[0].  */
> +
> +  dsc->tmp[0] = displaced_read_reg (regs, from, ARM_LR_REGNUM);
> +  displaced_write_reg (regs, dsc, ARM_LR_REGNUM, (ULONGEST) to + 4,
> +		       CANNOT_WRITE_PC);
> +  write_memory_unsigned_integer (to + 8, 4, from);
> +
> +  dsc->modinsn[0] = 0xe59ef004;  /* ldr pc, [lr, #4].  */
> +
> +  return dsc;
> +}

You're pointing lr at scratch+4, which will be a breakpoint.  Why do
we need the load?  Is it because common code wants to resume execution
at the scratch space, rather than at the helper?

> +   2. The instruction is single-stepped.

This misses the complicated part, IMO :-)  Execution is resumed at the
scratch space address.  Thus the breakpoint.

> +/* Write to the PC as from a branch-exchange instruction.  */
> +
> +static void
> +bx_write_pc (struct regcache *regs, ULONGEST val)
> +{
> +  ULONGEST ps;
> +
> +  regcache_cooked_read_unsigned (regs, ARM_PS_REGNUM, &ps);
> +
> +  if ((val & 1) == 1)
> +    {
> +      regcache_cooked_write_unsigned (regs, ARM_PS_REGNUM, ps | CPSR_T);
> +      regcache_cooked_write_unsigned (regs, ARM_PC_REGNUM, val & 0xfffffffe);
> +    }
> +  else if ((val & 2) == 0)
> +    {
> +      regcache_cooked_write_unsigned (regs, ARM_PS_REGNUM,
> +				      ps & ~(ULONGEST) CPSR_T);
> +      regcache_cooked_write_unsigned (regs, ARM_PC_REGNUM, val);
> +    }
> +  else
> +    /* Unpredictable behaviour.  */
> +    warning (_("Single-stepping BX to non-word-aligned ARM instruction."));
> +}

Let's either make this an error, or else do our best - the current
version will fall through to the instruction after the BX.

> +/* This function is used to concisely determine if an instruction INSN
> +   references PC.  Register fields of interest in INSN should have the
> +   corresponding fields of BITMASK set to 0b1111.  The function returns return 1
> +   if any of these fields in INSN reference the PC (also 0b1111, r15), else it
> +   returns 0.  */
> +
> +static int
> +insn_references_pc (unsigned long insn, unsigned long bitmask)
> +{
> +  unsigned long lowbit = 1;

Should these be uint32_t instead of unsigned long?

Otherwise, looks OK.

-- 
Daniel Jacobowitz
CodeSourcery
Follow-Ups:
- Re: [PATCH] Displaced stepping (non-stop debugging) support for ARM Linux
  - From: Pedro Alves
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]