This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [PATCH] Displaced stepping (non-stop debugging) support for ARM Linux


Hi,

This is a new version of the patch to support displaced stepping on
ARM. Many things are fixed since the previous version (posted
January 20th), though we're probably not 100% of the way there yet.

Pedro Alves wrote:

> Right, you may end up with a temporary breakpoint over another
> breakpoint, though.  It would be better to use the standard software
> single-stepping (set temp break at next pc, continue, remove break)
> for standard stepping requests, and use displaced stepping only for
> stepping over breakpoints.  Unfortunately, you don't get that for
> free --- infrun.c and friends don't know how to handle multiple
> simultaneous software single-stepping requests, and that is required
> in non-stop mode.

I'm not sure what the status is here now. For testing purposes, I've
(still) been using a local patch which uses displaced stepping for all
single-step operations.

Daniel Jacobowitz <drow@false.org> wrote:

> * What's the point of executing mov<cond> on the target for BL<cond>?
> At that point it seems like we ought to skip the target step entirely;
> just simulate the instruction.  We've already got a function to check
> conditions (condition_true).

I'm now using NOP instructions and condition_true, because the current
displaced stepping support wants to execute "something" rather than
nothing.
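
For illustration, the NOP-plus-simulation approach could look roughly
like the standalone sketch below. It is not the code from the patch:
the sketch_* names and the sketch_cpu structure are hypothetical, and
sketch_condition_true only stands in for GDB's condition_true. The
flag positions and condition encodings are the architectural ones.

```c
#include <assert.h>
#include <stdint.h>

/* CPSR condition flag bits.  */
#define SKETCH_FLAG_N 0x80000000u
#define SKETCH_FLAG_Z 0x40000000u
#define SKETCH_FLAG_C 0x20000000u
#define SKETCH_FLAG_V 0x10000000u

/* Hypothetical register state, just enough for this sketch.  */
struct sketch_cpu
{
  uint32_t pc, lr, cpsr;
};

/* Hypothetical stand-in for condition_true: return 1 if the 4-bit
   condition field COND passes for the flags in CPSR, else 0.  */
static int
sketch_condition_true (unsigned int cond, uint32_t cpsr)
{
  int n = (cpsr & SKETCH_FLAG_N) != 0;
  int z = (cpsr & SKETCH_FLAG_Z) != 0;
  int c = (cpsr & SKETCH_FLAG_C) != 0;
  int v = (cpsr & SKETCH_FLAG_V) != 0;

  assert (cond <= 0xf);

  switch (cond)
    {
    case 0x0: return z;              /* EQ */
    case 0x1: return !z;             /* NE */
    case 0x2: return c;              /* CS */
    case 0x3: return !c;             /* CC */
    case 0x4: return n;              /* MI */
    case 0x5: return !n;             /* PL */
    case 0x6: return v;              /* VS */
    case 0x7: return !v;             /* VC */
    case 0x8: return c && !z;        /* HI */
    case 0x9: return !c || z;        /* LS */
    case 0xa: return n == v;         /* GE */
    case 0xb: return n != v;         /* LT */
    case 0xc: return !z && n == v;   /* GT */
    case 0xd: return z || n != v;    /* LE */
    default:  return 1;              /* AL and the unconditional space.  */
    }
}

/* Simulate "BL<cond> dest" located at INSN_ADDR, after a NOP has been
   stepped in its place.  Returns 1 if the branch was taken.  */
static int
sketch_simulate_bl (struct sketch_cpu *cpu, unsigned int cond,
                    uint32_t insn_addr, uint32_t dest)
{
  if (!sketch_condition_true (cond, cpu->cpsr))
    {
      cpu->pc = insn_addr + 4;   /* Fall through to the next insn.  */
      return 0;
    }

  cpu->lr = insn_addr + 4;       /* Return address for BL.  */
  cpu->pc = dest;
  return 1;
}
```

Only the NOP runs on the target; the condition check and the register
updates happen on the debugger side afterwards.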

> * Using arm_write_pc is a bit dodgy here; I don't think it's what we
> want.  That function updates the CPSR based on a number of things
> including symbol tables.  We know exactly what is supposed to happen
> to CPSR for a given instruction and should honor it.  An example of
> why this matters: people regularly get a blx in Cortex-M3 code by use
> of bad libraries, untyped ELF symbols, or other such circumstances.
> That blx had better update the CPSR even when we step over it.

Fixed, I think.
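
As a rough standalone sketch of what "honouring the CPSR" means for a
branch-exchange style write (this is not the patch's code; the
sketch_regs structure and function name are hypothetical, and only
the Thumb bit position is taken from the architecture):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CPSR_T 0x20u   /* Thumb state bit in the CPSR.  */

/* Hypothetical register state, just enough for this sketch.  */
struct sketch_regs
{
  uint32_t pc;
  uint32_t cpsr;
};

/* Write DEST to the PC as a BX/BLX would: bit 0 of DEST selects the
   instruction set and is not itself part of the new PC.  Returns 0 on
   success, -1 if the destination is an unpredictable one (ARM state
   requested with a non-word-aligned address), in which case the
   registers are left alone.  */
static int
sketch_bx_write_pc (struct sketch_regs *regs, uint32_t dest)
{
  assert (regs != NULL);

  if (dest & 1)
    {
      regs->cpsr |= SKETCH_CPSR_T;      /* Switch to Thumb state.  */
      regs->pc = dest & ~(uint32_t) 1;
      return 0;
    }

  if ((dest & 2) == 0)
    {
      regs->cpsr &= ~(uint32_t) SKETCH_CPSR_T;   /* Switch to ARM state.  */
      regs->pc = dest;
      return 0;
    }

  return -1;
}
```

The point is that the T bit is derived from the destination value
itself, never from symbol information.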

> > +/* FIXME: This should depend on the arch version.  */
> > +
> > +static ULONGEST
> > +modify_store_pc (ULONGEST pc)
> > +{
> > +  return pc + 4;
> > +}
> 
> This one we might not be able to fix in current GDB but we can at
> least expand the comment... if I remember right the +4 is correct for
> everything since ARMv5 and most ARMv4?

I've removed this function. Stores of PC now read back the stored value
to determine the offset, so they should be architecture-version
independent (the strategy is slightly different for STR vs. STM: see
below).

> Yes, we just can't emulate loads or stores.  Anything that could cause
> an exception that won't be delayed till the next instruction, I think.

LDM and STM are handled substantially differently now: STM instructions
are let through unmodified, and when PC is in the register list the
cleanup routine reads back the stored value and calculates the proper
offset for PC writes. The true (non-displaced) PC value (plus offset) is
then written to the appropriate memory location.
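
The read-back arithmetic can be sketched as follows. This is a
standalone illustration under assumptions, not the cleanup routine
from the patch: the function name and parameters are hypothetical,
and it assumes the copied instruction ran at the start of the scratch
area. The 8-or-12 offset is the architecture's implementation-defined
behaviour for stores of PC.

```c
#include <assert.h>
#include <stdint.h>

/* STORED is the word the out-of-line store wrote for PC, SCRATCH_ADDR
   is the address the copied instruction executed at, and INSN_ADDR is
   the instruction's original address.  Returns the value that should
   be written back over STORED in memory.  */
static uint32_t
sketch_fixup_stored_pc (uint32_t stored, uint32_t scratch_addr,
                        uint32_t insn_addr)
{
  /* Whatever offset this core adds when storing PC shows up here.  */
  uint32_t offset = stored - scratch_addr;

  /* A store of PC writes the instruction's address plus 8 or plus 12,
     depending on the implementation.  */
  assert (offset == 8 || offset == 12);

  return insn_addr + offset;
}
```

Because the offset is measured rather than assumed, nothing here
depends on the architecture version.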

LDM instructions shuffle registers downwards into a contiguous list (to
avoid loading PC directly), then fix up register contents afterwards in
the cleanup routine. The case with a fully-populated register list is
still emulated, for now.
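
One way the shuffle and fix-up could work is sketched below. This is
an assumption about the general approach, not the patch's
copy_block_xfer/cleanup_block_load_pc code; all sketch_* names are
hypothetical, and a real PC write would go through the load-style PC
write rather than a plain register assignment.

```c
#include <assert.h>
#include <stdint.h>

/* Count the registers named in a 16-bit LDM register list.  */
static unsigned int
sketch_count_regs (uint32_t regmask)
{
  unsigned int n = 0;

  for (regmask &= 0xffffu; regmask != 0; regmask >>= 1)
    n += regmask & 1;

  return n;
}

/* Register list for the rewritten LDM: the same number of registers,
   but packed downwards as r0..r(n-1), so PC is never loaded directly
   unless the list is fully populated.  */
static uint32_t
sketch_ldm_contiguous_mask (uint32_t regmask)
{
  return (1u << sketch_count_regs (regmask)) - 1;
}

/* Fix-up after the rewritten LDM has run.  REGS holds the register
   values after the step, SAVED holds the values r0..r(n-1) had before
   it, and REGMASK is the original register list.  */
static void
sketch_ldm_cleanup (uint32_t regs[16], const uint32_t saved[16],
                    uint32_t regmask)
{
  unsigned int n = sketch_count_regs (regmask);
  uint32_t loaded[16];
  unsigned int i, k = 0;

  assert (n <= 16);

  /* Pick up the loaded words and restore the clobbered low registers.  */
  for (i = 0; i < n; i++)
    {
      loaded[i] = regs[i];
      regs[i] = saved[i];
    }

  /* The k-th loaded word belongs to the k-th register in the list.  */
  for (i = 0; i < 16; i++)
    if (regmask & (1u << i))
      regs[i] = loaded[k++];
}
```

For example, "ldm rX, {r1, r4, pc}" would execute as
"ldm rX, {r0, r1, r2}", and the fix-up then moves the three loaded
words to r1, r4 and PC while restoring r0 and r2.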

> > +static int
> > +copy_svc (unsigned long insn, CORE_ADDR to, struct regcache *regs,
> > +	  struct displaced_step_closure *dsc)
> > +{
> > +  CORE_ADDR from = dsc->insn_addr;
> > +
> > +  if (debug_displaced)
> > +    fprintf_unfiltered (gdb_stdlog, "displaced: copying svc insn %.8lx\n",
> > +			insn);
> > +
> > +  /* Preparation: tmp[0] <- to.
> > +     Insn: unmodified svc.
> > +     Cleanup: if (pc == <scratch>+4) pc <- insn_addr + 4;
> > +	      else leave PC alone.  */
> 
> What about the saved PC?  Don't really want the OS service routine to
> return to the scratchpad.
> 
> > +  /* FIXME: What can we do about signal trampolines?  */
> 
> Maybe this is referring to the same question I asked above?
> 
> If so, I think you get to unwind and if you find the scratchpad,
> update the saved PC.

I've tried to figure this out, and have totally drawn a blank so far.
AFAICT, the problem we're trying to solve runs as follows: sometimes, a
signal may be delivered to a process whilst it is executing a system
call. In that case, the kernel writes a signal trampoline to the user
program's stack space, and rewrites the state so that the trampoline is
executed when the system call returns.

Now: if we single-step that signal trampoline, we will see a system
call ("sigreturn") which does not return to the caller: rather, it
returns to a handler (in the user program) for the signal in question.
So, the expected result at present is that if displaced stepping is
used to single-step the sigreturn call, the debugger will lose control
of the debugged program.

Unfortunately I've been unable to figure out if the above is true, and
I can't quite figure out the mechanism in enough detail to know if
there's really anything we can do about it if so. My test program
(stolen from the internet and tweaked) runs as follows:

/*
 * signal.c - A signal-catching test program
 */
#include <stdio.h>
#include <unistd.h>
#include <signal.h>

void func (int, siginfo_t *, void *);
void func2 (int, siginfo_t *, void *);

int main (int argc, char **argv) {
  struct sigaction sa;

  printf ("Starting execution\n");
  sa.sa_sigaction = func;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_SIGINFO | SA_RESETHAND;
  if (sigaction (SIGHUP, &sa, NULL))
   perror ("sigaction() failed");
  sa.sa_sigaction = func2;
  if (sigaction (SIGINT, &sa, NULL))
   perror ("sigaction() failed");
  printf ("sigaction() successful. Now sleeping\n");
  while (1)
   sleep (600);
  printf ("I should not come here\n");
  return 0;
}

void
func (int sig, siginfo_t *sinf, void *foo)
{
  printf ("Signal Handler: sig=%d scp=%p\n", sig, sinf);
  if (sinf)
    {
      printf ("siginfo.si_signo=%d\n", sinf->si_signo);
      printf ("siginfo.si_errno=%d\n", sinf->si_errno);
      printf ("siginfo.si_code=%d\n", sinf->si_code);
    }
  pause ();
  printf ("func() exiting\n");
  sleep (2);
}

void
func2 (int sig, siginfo_t *sinf, void *foo)
{
  printf ("Signal Handler: sig=%d scp=%p\n", sig, sinf);
  if (sinf)
    {
      printf ("siginfo.si_signo=%d\n", sinf->si_signo);
      printf ("siginfo.si_errno=%d\n", sinf->si_errno);
      printf ("siginfo.si_code=%d\n", sinf->si_code);
    }
  printf ("func2() exiting\n");
}

Without the debugger, this can be run, then sent signal 1 (which prints
the messages from func()), and then sent signal 2 (which prints the
messages from func2(), presumably after running a signal trampoline,
though I'm not entirely certain of that), after which it sleeps. But
with the debugger, the program never gets beyond func(), and that's
where I got stuck.

> > +struct displaced_step_closure *
> > +arm_displaced_step_copy_insn (struct gdbarch *gdbarch,
> > +			      CORE_ADDR from, CORE_ADDR to,
> > +			      struct regcache *regs)
> > +{
> > +  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
> > +  const size_t len = 4;
> > +  gdb_byte *buf = xmalloc (len);
> > +  struct displaced_step_closure *dsc;
> > +  unsigned long insn;
> > +  int i;
> > +
> > +  /* A linux-specific hack.  Detect when we've entered (inaccessible by GDB)
> > +     kernel helpers, and stop at the return location.  */
> > +  if (gdbarch_osabi (gdbarch) == GDB_OSABI_LINUX && from > 0xffff0000)
> > +    {
> > +      if (debug_displaced)
> > +        fprintf_unfiltered (gdb_stdlog, "displaced: detected kernel helper "
> > +			    "at %.8lx\n", (unsigned long) from);
> > +
> > +      dsc = arm_catch_kernel_helper_return (from, to, regs);
> > +    }
> > +  else
> > +    {
> > +      insn = read_memory_unsigned_integer (from, len);
> > +
> > +      if (debug_displaced)
> > +	fprintf_unfiltered (gdb_stdlog, "displaced: stepping insn %.8lx "
> > +			    "at %.8lx\n", insn, (unsigned long) from);
> > +
> > +      dsc = arm_process_displaced_insn (insn, from, to, regs);
> > +    }
> 
> Can the Linux-specific hack go in arm-linux-tdep.c?  Shouldn't have to
> make many functions global to do that.

Moved. Other points you (Dan) raised have been dealt with, I think.

I've hit some problems testing this patch, mainly because I can't seem
to get a reliable baseline run with my current test setup. AFAICT, there
should be no effect on behaviour unless displaced stepping is in use
(differences in passes/failures with my patch only seem to be in
"unreliable" tests, after running baseline testing three times), and of
course displaced stepping isn't present for ARM without this patch
anyway.

OK to apply?

Thanks,

Julian

ChangeLog

    gdb/
    * arm-linux-tdep.c (arch-utils.h, inferior.h): Include files.
    (cleanup_kernel_helper_return, arm_catch_kernel_helper_return): New.
    (arm_linux_displaced_step_copy_insn): New.
    (arm_linux_init_abi): Initialise displaced stepping callbacks.
    * arm-tdep.c (DISPLACED_STEPPING_ARCH_VERSION): New macro.
    (ARM_NOP): New.
    (displaced_read_reg, displaced_in_arm_mode, branch_write_pc)
    (bx_write_pc, load_write_pc, alu_write_pc, displaced_write_reg)
    (insn_references_pc, copy_unmodified, cleanup_preload, copy_preload)
    (copy_preload_reg, cleanup_copro_load_store, copy_copro_load_store)
    (cleanup_branch, copy_b_bl_blx, copy_bx_blx_reg, cleanup_alu_imm)
    (copy_alu_imm, cleanup_alu_reg, copy_alu_reg)
    (cleanup_alu_shifted_reg, copy_alu_shifted_reg, cleanup_load)
    (cleanup_store, copy_extra_ld_st, copy_ldr_str_ldrb_strb)
    (cleanup_block_load_all, cleanup_block_store_pc)
    (cleanup_block_load_pc, copy_block_xfer, cleanup_svc, copy_svc)
    (copy_undef, copy_unpred): New.
    (decode_misc_memhint_neon, decode_unconditional)
    (decode_miscellaneous, decode_dp_misc, decode_ld_st_word_ubyte)
    (decode_media, decode_b_bl_ldmstm, decode_ext_reg_ld_st)
    (decode_svc_copro, arm_process_displaced_insn)
    (arm_displaced_init_closure, arm_displaced_step_copy_insn)
    (arm_displaced_step_fixup): New.
    (arm_gdbarch_init): Initialise max insn length field.
    * arm-tdep.h (DISPLACED_TEMPS, DISPLACED_MODIFIED_INSNS): New
    macros.
    (displaced_step_closure, pc_write_style): New.
    (arm_displaced_init_closure, displaced_read_reg)
    (displaced_write_reg, arm_displaced_step_copy_insn)
    (arm_displaced_step_fixup): Add prototypes.
--- .pc/displaced-stepping/gdb/arm-linux-tdep.c	2009-05-15 16:05:07.000000000 -0700
+++ gdb/arm-linux-tdep.c	2009-05-16 10:16:52.000000000 -0700
@@ -38,6 +38,8 @@
 #include "arm-linux-tdep.h"
 #include "linux-tdep.h"
 #include "glibc-tdep.h"
+#include "arch-utils.h"
+#include "inferior.h"
 
 #include "gdb_string.h"
 
@@ -590,6 +592,77 @@ arm_linux_software_single_step (struct f
   return 1;
 }
 
+/* The following two functions implement single-stepping over calls to Linux
+   kernel helper routines, which perform e.g. atomic operations on architecture
+   variants which don't support them natively.  We call the helper out-of-line
+   and place a breakpoint at the return address (in our scratch space).  */
+
+static void
+cleanup_kernel_helper_return (struct regcache *regs,
+			      struct displaced_step_closure *dsc)
+{
+  displaced_write_reg (regs, dsc, ARM_LR_REGNUM, dsc->tmp[0], CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, ARM_PC_REGNUM, dsc->tmp[0], BRANCH_WRITE_PC);
+}
+
+static struct displaced_step_closure *
+arm_catch_kernel_helper_return (CORE_ADDR from, CORE_ADDR to,
+				struct regcache *regs)
+{
+  struct displaced_step_closure *dsc
+    = xmalloc (sizeof (struct displaced_step_closure));
+
+  dsc->numinsns = 1;
+  dsc->insn_addr = from;
+  dsc->cleanup = &cleanup_kernel_helper_return;
+  /* Say we wrote to the PC, else cleanup will set PC to the next
+     instruction in the helper, which isn't helpful.  */
+  dsc->wrote_to_pc = 1;
+
+  /* Preparation: tmp[0] <- r14
+                  r14 <- <scratch space>+4
+		  *(<scratch space>+8) <- from
+     Insn: ldr pc, [r14, #4]
+     Cleanup: r14 <- tmp[0], pc <- tmp[0].  */
+
+  dsc->tmp[0] = displaced_read_reg (regs, from, ARM_LR_REGNUM);
+  displaced_write_reg (regs, dsc, ARM_LR_REGNUM, (ULONGEST) to + 4,
+		       CANNOT_WRITE_PC);
+  write_memory_unsigned_integer (to + 8, 4, from);
+
+  dsc->modinsn[0] = 0xe59ef004;  /* ldr pc, [lr, #4].  */
+
+  return dsc;
+}
+
+/* Linux-specific displaced step instruction copying function.  Detects when
+   the program has stepped into a Linux kernel helper routine (which must be
+   handled as a special case), falling back to arm_displaced_step_copy_insn()
+   if it hasn't.  */
+
+static struct displaced_step_closure *
+arm_linux_displaced_step_copy_insn (struct gdbarch *gdbarch,
+				    CORE_ADDR from, CORE_ADDR to,
+				    struct regcache *regs)
+{
+  /* Detect when we enter an (inaccessible by GDB) Linux kernel helper, and
+     stop at the return location.  */
+  if (from > 0xffff0000)
+    {
+      struct displaced_step_closure *dsc;
+
+      if (debug_displaced)
+        fprintf_unfiltered (gdb_stdlog, "displaced: detected kernel helper "
+			    "at %.8lx\n", (unsigned long) from);
+
+      dsc = arm_catch_kernel_helper_return (from, to, regs);
+
+      return arm_displaced_init_closure (gdbarch, from, to, dsc);
+    }
+  else
+    return arm_displaced_step_copy_insn (gdbarch, from, to, regs);
+}
+
 static void
 arm_linux_init_abi (struct gdbarch_info info,
 		    struct gdbarch *gdbarch)
@@ -650,6 +723,14 @@ arm_linux_init_abi (struct gdbarch_info 
 					arm_linux_regset_from_core_section);
 
   set_gdbarch_get_siginfo_type (gdbarch, linux_get_siginfo_type);
+
+  /* Displaced stepping.  */
+  set_gdbarch_displaced_step_copy_insn (gdbarch,
+					arm_linux_displaced_step_copy_insn);
+  set_gdbarch_displaced_step_fixup (gdbarch, arm_displaced_step_fixup);
+  set_gdbarch_displaced_step_free_closure (gdbarch,
+					   simple_displaced_step_free_closure);
+  set_gdbarch_displaced_step_location (gdbarch, displaced_step_at_entry_point);
 }
 
 /* Provide a prototype to silence -Wmissing-prototypes.  */
--- .pc/displaced-stepping/gdb/arm-tdep.c	2009-05-15 16:05:07.000000000 -0700
+++ gdb/arm-tdep.c	2009-05-16 10:16:52.000000000 -0700
@@ -241,6 +241,11 @@ struct arm_prologue_cache
   struct trad_frame_saved_reg *saved_regs;
 };
 
+/* Architecture version for displaced stepping.  This affects the behaviour of
+   certain instructions, and really should not be hard-wired.  */
+
+#define DISPLACED_STEPPING_ARCH_VERSION		5
+
 /* Addresses for calling Thumb functions have the bit 0 set.
    Here are some macros to test, set, or clear bit 0 of addresses.  */
 #define IS_THUMB_ADDR(addr)	((addr) & 1)
@@ -2175,6 +2180,1828 @@ arm_software_single_step (struct frame_i
   return 1;
 }
 
+/* ARM displaced stepping support.
+
+   Generally ARM displaced stepping works as follows:
+   
+   1. When an instruction is to be single-stepped, it is first decoded by
+      arm_process_displaced_insn (called from arm_displaced_step_copy_insn).
+      Depending on the type of instruction, it is then copied to a scratch
+      location, possibly in a modified form.  The copy_* set of functions
+      performs such modification, as necessary. A breakpoint is placed after
+      the modified instruction in the scratch space to return control to GDB.
+      Note in particular that instructions which modify the PC will no longer
+      do so after modification.
+
+   2. The instruction is single-stepped.
+   
+   3. A cleanup function (cleanup_*) is called corresponding to the copy_*
+      function used for the current instruction.  This function's job is to
+      put the CPU/memory state back to what it would have been if the
+      instruction had been executed unmodified in its original location.  */
+
+/* NOP instruction (mov r0, r0).  */
+#define ARM_NOP				0xe1a00000
+
+/* Helper for register reads for displaced stepping.  In particular, this
+   returns the PC as it would be seen by the instruction at its original
+   location.  */
+
+ULONGEST
+displaced_read_reg (struct regcache *regs, CORE_ADDR from, int regno)
+{
+  ULONGEST ret;
+
+  if (regno == 15)
+    {
+      if (debug_displaced)
+        fprintf_unfiltered (gdb_stdlog, "displaced: read pc value %.8lx\n",
+			    (unsigned long) from + 8);
+      return (ULONGEST) from + 8;  /* Pipeline offset.  */
+    }
+  else
+    {
+      regcache_cooked_read_unsigned (regs, regno, &ret);
+      if (debug_displaced)
+        fprintf_unfiltered (gdb_stdlog, "displaced: read r%d value %.8lx\n",
+			    regno, (unsigned long) ret);
+      return ret;
+    }
+}
+
+static int
+displaced_in_arm_mode (struct regcache *regs)
+{
+  ULONGEST ps;
+
+  regcache_cooked_read_unsigned (regs, ARM_PS_REGNUM, &ps);
+
+  return (ps & CPSR_T) == 0;
+}
+
+/* Write to the PC as from a branch instruction.  */
+
+static void
+branch_write_pc (struct regcache *regs, ULONGEST val)
+{
+  if (displaced_in_arm_mode (regs))
+    /* Note: If bits 0/1 are set, this branch would be unpredictable for
+       architecture versions < 6.  */
+    regcache_cooked_write_unsigned (regs, ARM_PC_REGNUM, val & ~(ULONGEST) 0x3);
+  else
+    regcache_cooked_write_unsigned (regs, ARM_PC_REGNUM, val & ~(ULONGEST) 0x1);
+}
+
+/* Write to the PC as from a branch-exchange instruction.  */
+
+static void
+bx_write_pc (struct regcache *regs, ULONGEST val)
+{
+  ULONGEST ps;
+
+  regcache_cooked_read_unsigned (regs, ARM_PS_REGNUM, &ps);
+
+  if ((val & 1) == 1)
+    {
+      regcache_cooked_write_unsigned (regs, ARM_PS_REGNUM, ps | CPSR_T);
+      regcache_cooked_write_unsigned (regs, ARM_PC_REGNUM, val & 0xfffffffe);
+    }
+  else if ((val & 2) == 0)
+    {
+      regcache_cooked_write_unsigned (regs, ARM_PS_REGNUM,
+				      ps & ~(ULONGEST) CPSR_T);
+      regcache_cooked_write_unsigned (regs, ARM_PC_REGNUM, val);
+    }
+  else
+    /* Unpredictable behaviour.  */
+    warning (_("Single-stepping BX to non-word-aligned ARM instruction."));
+}
+
+/* Write to the PC as if from a load instruction.  */
+
+static void
+load_write_pc (struct regcache *regs, ULONGEST val)
+{
+  if (DISPLACED_STEPPING_ARCH_VERSION >= 5)
+    bx_write_pc (regs, val);
+  else
+    branch_write_pc (regs, val);
+}
+
+/* Write to the PC as if from an ALU instruction.  */
+
+static void
+alu_write_pc (struct regcache *regs, ULONGEST val)
+{
+  if (DISPLACED_STEPPING_ARCH_VERSION >= 7 && displaced_in_arm_mode (regs))
+    bx_write_pc (regs, val);
+  else
+    branch_write_pc (regs, val);
+}
+
+/* Helper for writing to registers for displaced stepping.  Writing to the PC
+   has varying effects depending on the instruction which does the write:
+   this is controlled by the WRITE_PC argument.  */
+
+void
+displaced_write_reg (struct regcache *regs, struct displaced_step_closure *dsc,
+		     int regno, ULONGEST val, enum pc_write_style write_pc)
+{
+  if (regno == 15)
+    {
+      if (debug_displaced)
+        fprintf_unfiltered (gdb_stdlog, "displaced: writing pc %.8lx\n",
+			    (unsigned long) val);
+      switch (write_pc)
+        {
+	case BRANCH_WRITE_PC:
+	  branch_write_pc (regs, val);
+	  break;
+
+	case BX_WRITE_PC:
+	  bx_write_pc (regs, val);
+	  break;
+
+	case LOAD_WRITE_PC:
+	  load_write_pc (regs, val);
+	  break;
+
+	case ALU_WRITE_PC:
+	  alu_write_pc (regs, val);
+	  break;
+
+	case CANNOT_WRITE_PC:
+	  warning (_("Instruction wrote to PC in an unexpected way when "
+		     "single-stepping"));
+	  break;
+
+	default:
+	  abort ();
+	}
+
+      dsc->wrote_to_pc = 1;
+    }
+  else
+    {
+      if (debug_displaced)
+        fprintf_unfiltered (gdb_stdlog, "displaced: writing r%d value %.8lx\n",
+			    regno, (unsigned long) val);
+      regcache_cooked_write_unsigned (regs, regno, val);
+    }
+}
+
+/* This function is used to concisely determine if an instruction INSN
+   references PC.  Register fields of interest in INSN should have the
+   corresponding fields of BITMASK set to 0b1111.  The function returns 1
+   if any of these fields in INSN reference the PC (also 0b1111, r15), else it
+   returns 0.  */
+
+static int
+insn_references_pc (unsigned long insn, unsigned long bitmask)
+{
+  unsigned long lowbit = 1;
+
+  while (bitmask != 0)
+    {
+      unsigned long mask;
+
+      for (; lowbit && (bitmask & lowbit) == 0; lowbit <<= 1)
+        ;
+
+      if (!lowbit)
+        break;
+
+      mask = lowbit * 0xf;
+
+      if ((insn & mask) == mask)
+        return 1;
+
+      bitmask &= ~mask;
+    }
+
+  return 0;
+}
+
+/* The simplest copy function.  Many instructions have the same effect no
+   matter what address they are executed at: in those cases, use this.  */
+
+static int
+copy_unmodified (unsigned long insn, const char *iname,
+		 struct displaced_step_closure *dsc)
+{
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying insn %.8lx, "
+			"opcode/class '%s' unmodified\n", insn, iname);
+
+  dsc->modinsn[0] = insn;
+
+  return 0;
+}
+
+/* Preload instructions with immediate offset.  */
+
+static void
+cleanup_preload (struct regcache *regs, struct displaced_step_closure *dsc)
+{
+  displaced_write_reg (regs, dsc, 0, dsc->tmp[0], CANNOT_WRITE_PC);
+  if (!dsc->u.preload.immed)
+    displaced_write_reg (regs, dsc, 1, dsc->tmp[1], CANNOT_WRITE_PC);
+}
+
+static int
+copy_preload (unsigned long insn, struct regcache *regs,
+	      struct displaced_step_closure *dsc)
+{
+  unsigned int rn = bits (insn, 16, 19);
+  ULONGEST rn_val;
+  CORE_ADDR from = dsc->insn_addr;
+
+  if (!insn_references_pc (insn, 0x000f0000ul))
+    return copy_unmodified (insn, "preload", dsc);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying preload insn %.8lx\n",
+			insn);
+
+  /* Preload instructions:
+
+     {pli/pld} [rn, #+/-imm]
+     ->
+     {pli/pld} [r0, #+/-imm].  */
+
+  dsc->tmp[0] = displaced_read_reg (regs, from, 0);
+  rn_val = displaced_read_reg (regs, from, rn);
+  displaced_write_reg (regs, dsc, 0, rn_val, CANNOT_WRITE_PC);
+
+  dsc->u.preload.immed = 1;
+
+  dsc->modinsn[0] = insn & 0xfff0ffff;
+
+  dsc->cleanup = &cleanup_preload;
+
+  return 0;
+}
+
+/* Preload instructions with register offset.  */
+
+static int
+copy_preload_reg (unsigned long insn, struct regcache *regs,
+		  struct displaced_step_closure *dsc)
+{
+  unsigned int rn = bits (insn, 16, 19);
+  unsigned int rm = bits (insn, 0, 3);
+  ULONGEST rn_val, rm_val;
+  CORE_ADDR from = dsc->insn_addr;
+
+  if (!insn_references_pc (insn, 0x000f000ful))
+    return copy_unmodified (insn, "preload reg", dsc);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying preload insn %.8lx\n",
+			insn);
+
+  /* Preload register-offset instructions:
+
+     {pli/pld} [rn, rm {, shift}]
+     ->
+     {pli/pld} [r0, r1 {, shift}].  */
+
+  dsc->tmp[0] = displaced_read_reg (regs, from, 0);
+  dsc->tmp[1] = displaced_read_reg (regs, from, 1);
+  rn_val = displaced_read_reg (regs, from, rn);
+  rm_val = displaced_read_reg (regs, from, rm);
+  displaced_write_reg (regs, dsc, 0, rn_val, CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 1, rm_val, CANNOT_WRITE_PC);
+
+  dsc->u.preload.immed = 0;
+
+  dsc->modinsn[0] = (insn & 0xfff0fff0) | 0x1;
+
+  dsc->cleanup = &cleanup_preload;
+
+  return 0;
+}
+
+/* Copy/cleanup coprocessor load and store instructions.  */
+
+static void
+cleanup_copro_load_store (struct regcache *regs,
+			  struct displaced_step_closure *dsc)
+{
+  ULONGEST rn_val = displaced_read_reg (regs, dsc->insn_addr, 0);
+
+  displaced_write_reg (regs, dsc, 0, dsc->tmp[0], CANNOT_WRITE_PC);
+
+  if (dsc->u.ldst.writeback)
+    displaced_write_reg (regs, dsc, dsc->u.ldst.rn, rn_val, LOAD_WRITE_PC);
+}
+
+static int
+copy_copro_load_store (unsigned long insn, struct regcache *regs,
+		       struct displaced_step_closure *dsc)
+{
+  unsigned int rn = bits (insn, 16, 19);
+  ULONGEST rn_val;
+  CORE_ADDR from = dsc->insn_addr;
+
+  if (!insn_references_pc (insn, 0x000f0000ul))
+    return copy_unmodified (insn, "copro load/store", dsc);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying coprocessor "
+			"load/store insn %.8lx\n", insn);
+
+  /* Coprocessor load/store instructions:
+
+     {stc/stc2} [<Rn>, #+/-imm]  (and other immediate addressing modes)
+     ->
+     {stc/stc2} [r0, #+/-imm].
+
+     ldc/ldc2 are handled identically.  */
+
+  dsc->tmp[0] = displaced_read_reg (regs, from, 0);
+  rn_val = displaced_read_reg (regs, from, rn);
+  displaced_write_reg (regs, dsc, 0, rn_val, CANNOT_WRITE_PC);
+
+  dsc->u.ldst.writeback = bit (insn, 25);
+  dsc->u.ldst.rn = rn;
+
+  dsc->modinsn[0] = insn & 0xfff0ffff;
+
+  dsc->cleanup = &cleanup_copro_load_store;
+
+  return 0;
+}
+
+/* Clean up branch instructions (actually perform the branch, by setting
+   PC).  */
+
+static void
+cleanup_branch (struct regcache *regs, struct displaced_step_closure *dsc)
+{
+  ULONGEST from = dsc->insn_addr;
+  unsigned long status = displaced_read_reg (regs, from, ARM_PS_REGNUM);
+  int branch_taken = condition_true (dsc->u.branch.cond, status);
+  enum pc_write_style write_pc = dsc->u.branch.exchange
+				 ? BX_WRITE_PC : BRANCH_WRITE_PC;
+
+  if (!branch_taken)
+    return;
+
+  if (dsc->u.branch.link)
+    {
+      ULONGEST pc = displaced_read_reg (regs, from, 15);
+      displaced_write_reg (regs, dsc, 14, pc - 4, CANNOT_WRITE_PC);
+    }
+
+  displaced_write_reg (regs, dsc, 15, dsc->u.branch.dest, write_pc);
+}
+
+/* Copy B/BL/BLX instructions with immediate destinations.  */
+
+static int
+copy_b_bl_blx (unsigned long insn, struct regcache *regs,
+	       struct displaced_step_closure *dsc)
+{
+  unsigned int cond = bits (insn, 28, 31);
+  int exchange = (cond == 0xf);
+  int link = exchange || bit (insn, 24);
+  CORE_ADDR from = dsc->insn_addr;
+  long offset;
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying %s immediate insn "
+			"%.8lx\n", (exchange) ? "blx" : (link) ? "bl" : "b",
+			insn);
+
+  /* Implement "BL<cond> <label>" as:
+
+     Preparation: cond <- instruction condition
+     Insn: mov r0, r0  (nop)
+     Cleanup: if (condition true) { r14 <- pc; pc <- label }.
+
+     B<cond> similar, but don't set r14 in cleanup.  */
+
+  if (exchange)
+    /* For BLX, set bit 0 of the destination.  The cleanup_branch function will
+       then arrange the switch into Thumb mode.  */
+    offset = (bits (insn, 0, 23) << 2) | (bit (insn, 24) << 1) | 1;
+  else
+    offset = bits (insn, 0, 23) << 2;
+
+  if (bit (offset, 25))
+    offset = offset | ~0x3ffffff;
+
+  dsc->u.branch.cond = cond;
+  dsc->u.branch.link = link;
+  dsc->u.branch.exchange = exchange;
+  dsc->u.branch.dest = from + 8 + offset;
+
+  dsc->modinsn[0] = ARM_NOP;
+
+  dsc->cleanup = &cleanup_branch;
+
+  return 0;
+}
+
+/* Copy BX/BLX with register-specified destinations.  */
+
+static int
+copy_bx_blx_reg (unsigned long insn, struct regcache *regs,
+		 struct displaced_step_closure *dsc)
+{
+  unsigned int cond = bits (insn, 28, 31);
+  /* BX:  x12xxx1x
+     BLX: x12xxx3x.  */
+  int link = bit (insn, 5);
+  unsigned int rm = bits (insn, 0, 3);
+  CORE_ADDR from = dsc->insn_addr;
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying %s register insn "
+			"%.8lx\n", (link) ? "blx" : "bx", insn);
+
+  /* Implement "{BX,BLX}<cond> <reg>" as:
+
+     Preparation: cond <- instruction condition
+     Insn: mov r0, r0 (nop)
+     Cleanup: if (condition true) { r14 <- pc; pc <- dest; }.
+
+     Don't set r14 in cleanup for BX.  */
+
+  dsc->u.branch.dest = displaced_read_reg (regs, from, rm);
+
+  dsc->u.branch.cond = cond;
+  dsc->u.branch.link = link;
+  dsc->u.branch.exchange = 1;
+
+  dsc->modinsn[0] = ARM_NOP;
+
+  dsc->cleanup = &cleanup_branch;
+
+  return 0;
+}
+
+/* Copy/cleanup arithmetic/logic instruction with immediate RHS. */
+
+static void
+cleanup_alu_imm (struct regcache *regs, struct displaced_step_closure *dsc)
+{
+  ULONGEST rd_val = displaced_read_reg (regs, dsc->insn_addr, 0);
+  displaced_write_reg (regs, dsc, 0, dsc->tmp[0], CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 1, dsc->tmp[1], CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, dsc->rd, rd_val, ALU_WRITE_PC);
+}
+
+static int
+copy_alu_imm (unsigned long insn, struct regcache *regs,
+	     struct displaced_step_closure *dsc)
+{
+  unsigned int rn = bits (insn, 16, 19);
+  unsigned int rd = bits (insn, 12, 15);
+  unsigned int op = bits (insn, 21, 24);
+  int is_mov = (op == 0xd);
+  ULONGEST rd_val, rn_val;
+  CORE_ADDR from = dsc->insn_addr;
+
+  if (!insn_references_pc (insn, 0x000ff000ul))
+    return copy_unmodified (insn, "ALU immediate", dsc);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying immediate %s insn "
+			"%.8lx\n", is_mov ? "move" : "ALU", insn);
+
+  /* Instruction is of form:
+
+     <op><cond> rd, [rn,] #imm
+
+     Rewrite as:
+
+     Preparation: tmp1, tmp2 <- r0, r1;
+		  r0, r1 <- rd, rn
+     Insn: <op><cond> r0, r1, #imm
+     Cleanup: rd <- r0; r0 <- tmp1; r1 <- tmp2
+  */
+
+  dsc->tmp[0] = displaced_read_reg (regs, from, 0);
+  dsc->tmp[1] = displaced_read_reg (regs, from, 1);
+  rn_val = displaced_read_reg (regs, from, rn);
+  rd_val = displaced_read_reg (regs, from, rd);
+  displaced_write_reg (regs, dsc, 0, rd_val, CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 1, rn_val, CANNOT_WRITE_PC);
+  dsc->rd = rd;
+
+  if (is_mov)
+    dsc->modinsn[0] = insn & 0xfff00fff;
+  else
+    dsc->modinsn[0] = (insn & 0xfff00fff) | 0x10000;
+
+  dsc->cleanup = &cleanup_alu_imm;
+
+  return 0;
+}
+
+/* Copy/cleanup arithmetic/logic insns with register RHS.  */
+
+static void
+cleanup_alu_reg (struct regcache *regs, struct displaced_step_closure *dsc)
+{
+  ULONGEST rd_val;
+  int i;
+
+  rd_val = displaced_read_reg (regs, dsc->insn_addr, 0);
+
+  for (i = 0; i < 3; i++)
+    displaced_write_reg (regs, dsc, i, dsc->tmp[i], CANNOT_WRITE_PC);
+
+  displaced_write_reg (regs, dsc, dsc->rd, rd_val, ALU_WRITE_PC);
+}
+
+static int
+copy_alu_reg (unsigned long insn, struct regcache *regs,
+	     struct displaced_step_closure *dsc)
+{
+  unsigned int rn = bits (insn, 16, 19);
+  unsigned int rm = bits (insn, 0, 3);
+  unsigned int rd = bits (insn, 12, 15);
+  unsigned int op = bits (insn, 21, 24);
+  int is_mov = (op == 0xd);
+  ULONGEST rd_val, rn_val, rm_val;
+  CORE_ADDR from = dsc->insn_addr;
+
+  if (!insn_references_pc (insn, 0x000ff00ful))
+    return copy_unmodified (insn, "ALU reg", dsc);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying reg %s insn %.8lx\n",
+			is_mov ? "move" : "ALU", insn);
+
+  /* Instruction is of form:
+
+     <op><cond> rd, [rn,] rm [, <shift>]
+
+     Rewrite as:
+
+     Preparation: tmp1, tmp2, tmp3 <- r0, r1, r2;
+		  r0, r1, r2 <- rd, rn, rm
+     Insn: <op><cond> r0, r1, r2 [, <shift>]
+     Cleanup: rd <- r0; r0, r1, r2 <- tmp1, tmp2, tmp3
+  */
+
+  dsc->tmp[0] = displaced_read_reg (regs, from, 0);
+  dsc->tmp[1] = displaced_read_reg (regs, from, 1);
+  dsc->tmp[2] = displaced_read_reg (regs, from, 2);
+  rd_val = displaced_read_reg (regs, from, rd);
+  rn_val = displaced_read_reg (regs, from, rn);
+  rm_val = displaced_read_reg (regs, from, rm);
+  displaced_write_reg (regs, dsc, 0, rd_val, CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 1, rn_val, CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 2, rm_val, CANNOT_WRITE_PC);
+  dsc->rd = rd;
+
+  if (is_mov)
+    dsc->modinsn[0] = (insn & 0xfff00ff0) | 0x2;
+  else
+    dsc->modinsn[0] = (insn & 0xfff00ff0) | 0x10002;
+
+  dsc->cleanup = &cleanup_alu_reg;
+
+  return 0;
+}
+
+/* Cleanup/copy arithmetic/logic insns with shifted register RHS.  */
+
+static void
+cleanup_alu_shifted_reg (struct regcache *regs,
+			struct displaced_step_closure *dsc)
+{
+  ULONGEST rd_val = displaced_read_reg (regs, dsc->insn_addr, 0);
+  int i;
+
+  for (i = 0; i < 4; i++)
+    displaced_write_reg (regs, dsc, i, dsc->tmp[i], CANNOT_WRITE_PC);
+
+  displaced_write_reg (regs, dsc, dsc->rd, rd_val, ALU_WRITE_PC);
+}
+
+static int
+copy_alu_shifted_reg (unsigned long insn, struct regcache *regs,
+		     struct displaced_step_closure *dsc)
+{
+  unsigned int rn = bits (insn, 16, 19);
+  unsigned int rm = bits (insn, 0, 3);
+  unsigned int rd = bits (insn, 12, 15);
+  unsigned int rs = bits (insn, 8, 11);
+  unsigned int op = bits (insn, 21, 24);
+  int is_mov = (op == 0xd), i;
+  ULONGEST rd_val, rn_val, rm_val, rs_val;
+  CORE_ADDR from = dsc->insn_addr;
+
+  if (!insn_references_pc (insn, 0x000fff0ful))
+    return copy_unmodified (insn, "ALU shifted reg", dsc);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying shifted reg %s insn "
+			"%.8lx\n", is_mov ? "move" : "ALU", insn);
+
+  /* Instruction is of form:
+
+     <op><cond> rd, [rn,] rm, <shift> rs
+
+     Rewrite as:
+
+     Preparation: tmp1, tmp2, tmp3, tmp4 <- r0, r1, r2, r3
+		  r0, r1, r2, r3 <- rd, rn, rm, rs
+     Insn: <op><cond> r0, r1, r2, <shift> r3
+     Cleanup: tmp5 <- r0
+	      r0, r1, r2, r3 <- tmp1, tmp2, tmp3, tmp4
+	      rd <- tmp5
+  */
+
+  for (i = 0; i < 4; i++)
+    dsc->tmp[i] = displaced_read_reg (regs, from, i);
+
+  rd_val = displaced_read_reg (regs, from, rd);
+  rn_val = displaced_read_reg (regs, from, rn);
+  rm_val = displaced_read_reg (regs, from, rm);
+  rs_val = displaced_read_reg (regs, from, rs);
+  displaced_write_reg (regs, dsc, 0, rd_val, CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 1, rn_val, CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 2, rm_val, CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 3, rs_val, CANNOT_WRITE_PC);
+  dsc->rd = rd;
+
+  if (is_mov)
+    dsc->modinsn[0] = (insn & 0xfff000f0) | 0x302;
+  else
+    dsc->modinsn[0] = (insn & 0xfff000f0) | 0x10302;
+
+  dsc->cleanup = &cleanup_alu_shifted_reg;
+
+  return 0;
+}
+
+/* Clean up load instructions.  */
+
+static void
+cleanup_load (struct regcache *regs, struct displaced_step_closure *dsc)
+{
+  ULONGEST rt_val, rt_val2 = 0, rn_val;
+  CORE_ADDR from = dsc->insn_addr;
+
+  rt_val = displaced_read_reg (regs, from, 0);
+  if (dsc->u.ldst.xfersize == 8)
+    rt_val2 = displaced_read_reg (regs, from, 1);
+  rn_val = displaced_read_reg (regs, from, 2);
+
+  displaced_write_reg (regs, dsc, 0, dsc->tmp[0], CANNOT_WRITE_PC);
+  if (dsc->u.ldst.xfersize > 4)
+    displaced_write_reg (regs, dsc, 1, dsc->tmp[1], CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 2, dsc->tmp[2], CANNOT_WRITE_PC);
+  if (!dsc->u.ldst.immed)
+    displaced_write_reg (regs, dsc, 3, dsc->tmp[3], CANNOT_WRITE_PC);
+
+  /* Handle register writeback.  */
+  if (dsc->u.ldst.writeback)
+    displaced_write_reg (regs, dsc, dsc->u.ldst.rn, rn_val, CANNOT_WRITE_PC);
+  /* Put result in right place.  */
+  displaced_write_reg (regs, dsc, dsc->rd, rt_val, LOAD_WRITE_PC);
+  if (dsc->u.ldst.xfersize == 8)
+    displaced_write_reg (regs, dsc, dsc->rd + 1, rt_val2, LOAD_WRITE_PC);
+}
+
+/* Clean up store instructions.  */
+
+static void
+cleanup_store (struct regcache *regs, struct displaced_step_closure *dsc)
+{
+  CORE_ADDR from = dsc->insn_addr;
+  ULONGEST rn_val = displaced_read_reg (regs, from, 2);
+
+  displaced_write_reg (regs, dsc, 0, dsc->tmp[0], CANNOT_WRITE_PC);
+  if (dsc->u.ldst.xfersize > 4)
+    displaced_write_reg (regs, dsc, 1, dsc->tmp[1], CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 2, dsc->tmp[2], CANNOT_WRITE_PC);
+  if (!dsc->u.ldst.immed)
+    displaced_write_reg (regs, dsc, 3, dsc->tmp[3], CANNOT_WRITE_PC);
+  if (dsc->u.ldst.restore_r4)
+    displaced_write_reg (regs, dsc, 4, dsc->tmp[4], CANNOT_WRITE_PC);
+
+  /* Writeback.  */
+  if (dsc->u.ldst.writeback)
+    displaced_write_reg (regs, dsc, dsc->u.ldst.rn, rn_val, CANNOT_WRITE_PC);
+}
+
+/* Copy "extra" load/store instructions.  These are halfword/doubleword
+   transfers, which have a different encoding to byte/word transfers.  */
+
+static int
+copy_extra_ld_st (unsigned long insn, int unprivileged, struct regcache *regs,
+		  struct displaced_step_closure *dsc)
+{
+  unsigned int op1 = bits (insn, 20, 24);
+  unsigned int op2 = bits (insn, 5, 6);
+  unsigned int rt = bits (insn, 12, 15);
+  unsigned int rn = bits (insn, 16, 19);
+  unsigned int rm = bits (insn, 0, 3);
+  char load[12]     = {0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1};
+  char bytesize[12] = {2, 2, 2, 2, 8, 1, 8, 1, 8, 2, 8, 2};
+  int immed = (op1 & 0x4) != 0;
+  int opcode;
+  ULONGEST rt_val, rt_val2 = 0, rn_val, rm_val = 0;
+  CORE_ADDR from = dsc->insn_addr;
+
+  if (!insn_references_pc (insn, 0x000ff00ful))
+    return copy_unmodified (insn, "extra load/store", dsc);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying %sextra load/store "
+			"insn %.8lx\n", unprivileged ? "unprivileged " : "",
+			insn);
+
+  opcode = ((op2 << 2) | (op1 & 0x1) | ((op1 & 0x4) >> 1)) - 4;
+
+  if (opcode < 0)
+    internal_error (__FILE__, __LINE__,
+		    _("copy_extra_ld_st: instruction decode error"));
+
+  dsc->tmp[0] = displaced_read_reg (regs, from, 0);
+  dsc->tmp[1] = displaced_read_reg (regs, from, 1);
+  dsc->tmp[2] = displaced_read_reg (regs, from, 2);
+  if (!immed)
+    dsc->tmp[3] = displaced_read_reg (regs, from, 3);
+
+  rt_val = displaced_read_reg (regs, from, rt);
+  if (bytesize[opcode] == 8)
+    rt_val2 = displaced_read_reg (regs, from, rt + 1);
+  rn_val = displaced_read_reg (regs, from, rn);
+  if (!immed)
+    rm_val = displaced_read_reg (regs, from, rm);
+
+  displaced_write_reg (regs, dsc, 0, rt_val, CANNOT_WRITE_PC);
+  if (bytesize[opcode] == 8)
+    displaced_write_reg (regs, dsc, 1, rt_val2, CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 2, rn_val, CANNOT_WRITE_PC);
+  if (!immed)
+    displaced_write_reg (regs, dsc, 3, rm_val, CANNOT_WRITE_PC);
+
+  dsc->rd = rt;
+  dsc->u.ldst.xfersize = bytesize[opcode];
+  dsc->u.ldst.rn = rn;
+  dsc->u.ldst.immed = immed;
+  dsc->u.ldst.writeback = bit (insn, 24) == 0 || bit (insn, 21) != 0;
+  dsc->u.ldst.restore_r4 = 0;
+
+  if (immed)
+    /* {ldr,str}<width><cond> rt, [rt2,] [rn, #imm]
+       ->
+       {ldr,str}<width><cond> r0, [r1,] [r2, #imm].  */
+    dsc->modinsn[0] = (insn & 0xfff00fff) | 0x20000;
+  else
+    /* {ldr,str}<width><cond> rt, [rt2,] [rn, +/-rm]
+       ->
+       {ldr,str}<width><cond> r0, [r1,] [r2, +/-r3].  */
+    dsc->modinsn[0] = (insn & 0xfff00ff0) | 0x20003;
+
+  dsc->cleanup = load[opcode] ? &cleanup_load : &cleanup_store;
+
+  return 0;
+}
+
+/* Copy byte/word loads and stores.  */
+
+static int
+copy_ldr_str_ldrb_strb (unsigned long insn, struct regcache *regs,
+			struct displaced_step_closure *dsc, int load, int byte,
+			int usermode)
+{
+  int immed = !bit (insn, 25);
+  unsigned int rt = bits (insn, 12, 15);
+  unsigned int rn = bits (insn, 16, 19);
+  unsigned int rm = bits (insn, 0, 3);  /* Only valid if !immed.  */
+  ULONGEST rt_val, rn_val, rm_val = 0;
+  CORE_ADDR from = dsc->insn_addr;
+
+  if (!insn_references_pc (insn, 0x000ff00ful))
+    return copy_unmodified (insn, "load/store", dsc);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying %s%s insn %.8lx\n",
+			load ? (byte ? "ldrb" : "ldr")
+			     : (byte ? "strb" : "str"), usermode ? "t" : "",
+			insn);
+
+  dsc->tmp[0] = displaced_read_reg (regs, from, 0);
+  dsc->tmp[2] = displaced_read_reg (regs, from, 2);
+  if (!immed)
+    dsc->tmp[3] = displaced_read_reg (regs, from, 3);
+  if (!load)
+    dsc->tmp[4] = displaced_read_reg (regs, from, 4);
+
+  rt_val = displaced_read_reg (regs, from, rt);
+  rn_val = displaced_read_reg (regs, from, rn);
+  if (!immed)
+    rm_val = displaced_read_reg (regs, from, rm);
+
+  displaced_write_reg (regs, dsc, 0, rt_val, CANNOT_WRITE_PC);
+  displaced_write_reg (regs, dsc, 2, rn_val, CANNOT_WRITE_PC);
+  if (!immed)
+    displaced_write_reg (regs, dsc, 3, rm_val, CANNOT_WRITE_PC);
+
+  dsc->rd = rt;
+  dsc->u.ldst.xfersize = byte ? 1 : 4;
+  dsc->u.ldst.rn = rn;
+  dsc->u.ldst.immed = immed;
+  dsc->u.ldst.writeback = bit (insn, 24) == 0 || bit (insn, 21) != 0;
+
+  /* To write PC we can do:
+
+     scratch+0:  str pc, temp  (*temp = scratch + 8 + offset)
+     scratch+4:  ldr r4, temp
+     scratch+8:  sub r4, r4, pc  (r4 = scratch + 8 + offset - scratch - 8 - 8)
+     scratch+12: add r4, r4, #8  (r4 = offset)
+     scratch+16: add r0, r0, r4
+     scratch+20: str r0, [r2, #imm] (or str r0, [r2, r3])
+     scratch+24: <bkpt>;  scratch+28: <temp>
+     
+     Otherwise we don't know what value to write for PC, since the offset is
+     architecture-dependent (sometimes PC+8, sometimes PC+12).  */
+
+  if (load || rt != 15)
+    {
+      dsc->u.ldst.restore_r4 = 0;
+
+      if (immed)
+	/* {ldr,str}[b]<cond> rt, [rn, #imm], etc.
+	   ->
+	   {ldr,str}[b]<cond> r0, [r2, #imm].  */
+	dsc->modinsn[0] = (insn & 0xfff00fff) | 0x20000;
+      else
+	/* {ldr,str}[b]<cond> rt, [rn, rm], etc.
+	   ->
+	   {ldr,str}[b]<cond> r0, [r2, r3].  */
+	dsc->modinsn[0] = (insn & 0xfff00ff0) | 0x20003;
+    }
+  else
+    {
+      /* We need to use r4 as scratch.  Make sure it's restored afterwards.  */
+      dsc->u.ldst.restore_r4 = 1;
+
+      dsc->modinsn[0] = 0xe58ff014;  /* str pc, [pc, #20].  */
+      dsc->modinsn[1] = 0xe59f4010;  /* ldr r4, [pc, #16].  */
+      dsc->modinsn[2] = 0xe044400f;  /* sub r4, r4, pc.  */
+      dsc->modinsn[3] = 0xe2844008;  /* add r4, r4, #8.  */
+      dsc->modinsn[4] = 0xe0800004;  /* add r0, r0, r4.  */
+      
+      /* As above.  */
+      if (immed)
+	dsc->modinsn[5] = (insn & 0xfff00fff) | 0x20000;
+      else
+	dsc->modinsn[5] = (insn & 0xfff00ff0) | 0x20003;
+
+      dsc->modinsn[6] = 0x0;  /* breakpoint location.  */
+      dsc->modinsn[7] = 0x0;  /* scratch space.  */
+
+      dsc->numinsns = 6;
+    }
+
+  dsc->cleanup = load ? &cleanup_load : &cleanup_store;
+
+  return 0;
+}
+
+/* Cleanup LDM instructions with fully-populated register list.  This is an
+   unfortunate corner case: it's impossible to implement correctly by modifying
+   the instruction.  The issue is as follows: we have an instruction,
+   
+   ldm rN, {r0-r15}
+   
+   which we must rewrite to avoid loading PC.  A possible solution would be to
+   do the load in two halves, something like (with suitable cleanup
+   afterwards):
+   
+   mov r8, rN
+   ldm[id][ab] r8!, {r0-r7}
+   str r7, <temp>
+   ldm[id][ab] r8, {r7-r14}
+   <bkpt>
+   
+   but at present there's no suitable place for <temp>, since the scratch space
+   is overwritten before the cleanup routine is called.  For now, we simply
+   emulate the instruction.  */
+
+static void
+cleanup_block_load_all (struct regcache *regs,
+			struct displaced_step_closure *dsc)
+{
+  ULONGEST from = dsc->insn_addr;
+  int inc = dsc->u.block.increment;
+  int bump_before = dsc->u.block.before ? (inc ? 4 : -4) : 0;
+  int bump_after = dsc->u.block.before ? 0 : (inc ? 4 : -4);
+  unsigned long regmask = dsc->u.block.regmask;
+  int regno = inc ? 0 : 15;
+  CORE_ADDR xfer_addr = dsc->u.block.xfer_addr;
+  int exception_return = dsc->u.block.load && dsc->u.block.user
+			 && (regmask & 0x8000) != 0;
+  unsigned long status = displaced_read_reg (regs, from, ARM_PS_REGNUM);
+  int do_transfer = condition_true (dsc->u.block.cond, status);
+
+  if (!do_transfer)
+    return;
+
+  /* If the instruction is ldm rN, {...pc}^, I don't think there's anything
+     sensible we can do here.  Complain loudly.  */
+  if (exception_return)
+    error (_("Cannot single-step exception return"));
+
+  /* We don't handle any stores here for now.  */
+  gdb_assert (dsc->u.block.load != 0);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: emulating block transfer: "
+			"%s %s %s\n", dsc->u.block.load ? "ldm" : "stm",
+			dsc->u.block.increment ? "inc" : "dec",
+			dsc->u.block.before ? "before" : "after");
+
+  while (regmask)
+    {
+      unsigned long memword;
+
+      if (inc)
+	while (regno <= 15 && (regmask & (1 << regno)) == 0)
+	  regno++;
+      else
+        while (regno >= 0 && (regmask & (1 << regno)) == 0)
+	  regno--;
+
+      xfer_addr += bump_before;
+
+      memword = read_memory_unsigned_integer (xfer_addr, 4);
+      displaced_write_reg (regs, dsc, regno, memword, LOAD_WRITE_PC);
+
+      xfer_addr += bump_after;
+
+      regmask &= ~(1 << regno);
+    }
+
+  if (dsc->u.block.writeback)
+    displaced_write_reg (regs, dsc, dsc->u.block.rn, xfer_addr,
+			 CANNOT_WRITE_PC);
+}
+
+/* Clean up an STM which included the PC in the register list.  */
+
+static void
+cleanup_block_store_pc (struct regcache *regs,
+			struct displaced_step_closure *dsc)
+{
+  ULONGEST from = dsc->insn_addr;
+  unsigned long status = displaced_read_reg (regs, from, ARM_PS_REGNUM);
+  int store_executed = condition_true (dsc->u.block.cond, status);
+  CORE_ADDR pc_stored_at, transferred_regs = bitcount (dsc->u.block.regmask);
+  CORE_ADDR stm_insn_addr;
+  unsigned long pc_val;
+  long offset;
+
+  /* If condition code fails, there's nothing else to do.  */
+  if (!store_executed)
+    return;
+
+  if (dsc->u.block.increment)
+    {
+      pc_stored_at = dsc->u.block.xfer_addr + 4 * (transferred_regs - 1);
+
+      if (dsc->u.block.before)
+        pc_stored_at += 4;
+    }
+  else
+    {
+      pc_stored_at = dsc->u.block.xfer_addr;
+      
+      if (dsc->u.block.before)
+        pc_stored_at -= 4;
+    }
+  
+  pc_val = read_memory_unsigned_integer (pc_stored_at, 4);
+  stm_insn_addr = dsc->scratch_base;
+  offset = pc_val - stm_insn_addr;
+  
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: detected PC offset %.8lx for "
+			"STM instruction\n", offset);
+
+  /* Rewrite the stored PC to the proper value for the non-displaced original
+     instruction.  */
+  write_memory_unsigned_integer (pc_stored_at, 4, dsc->insn_addr + offset);
+}
+
+/* Clean up an LDM which includes the PC in the register list.  We clumped all
+   the registers in the transferred list into a contiguous range r0...rX (to
+   avoid loading PC directly and losing control of the debugged program), so we
+   must undo that here.  */
+
+static void
+cleanup_block_load_pc (struct regcache *regs,
+		       struct displaced_step_closure *dsc)
+{
+  ULONGEST from = dsc->insn_addr;
+  unsigned long status = displaced_read_reg (regs, from, ARM_PS_REGNUM);
+  int load_executed = condition_true (dsc->u.block.cond, status), i;
+  unsigned int mask = dsc->u.block.regmask, write_reg = 15;
+  unsigned int regs_loaded = bitcount (mask);
+  unsigned int num_to_shuffle = regs_loaded, clobbered;
+  
+  /* The method employed here will fail if the register list is fully populated
+     (we need to avoid loading PC directly).  */
+  gdb_assert (num_to_shuffle < 16);
+  
+  if (!load_executed)
+    return;
+  
+  clobbered = (1 << num_to_shuffle) - 1;
+  
+  while (num_to_shuffle > 0)
+    {
+      if ((mask & (1 << write_reg)) != 0)
+        {
+	  unsigned int read_reg = num_to_shuffle - 1;
+	  
+	  if (read_reg != write_reg)
+	    {
+	      ULONGEST rval = displaced_read_reg (regs, from, read_reg);
+	      displaced_write_reg (regs, dsc, write_reg, rval, LOAD_WRITE_PC);
+	      if (debug_displaced)
+	        fprintf_unfiltered (gdb_stdlog, _("displaced: LDM: move "
+				    "loaded register r%d to r%d\n"), read_reg,
+				    write_reg);
+	    }
+	  else if (debug_displaced)
+	    fprintf_unfiltered (gdb_stdlog, _("displaced: LDM: register "
+				"r%d already in the right place\n"),
+				write_reg);
+
+	  clobbered &= ~(1 << write_reg);
+	  
+	  num_to_shuffle--;
+	}
+
+      write_reg--;
+    }
+  
+  /* Restore any registers we scribbled over.  */
+  for (write_reg = 0; clobbered != 0; write_reg++)
+    {
+      if ((clobbered & (1 << write_reg)) != 0)
+        {
+	  displaced_write_reg (regs, dsc, write_reg, dsc->tmp[write_reg],
+			       CANNOT_WRITE_PC);
+	  if (debug_displaced)
+	    fprintf_unfiltered (gdb_stdlog, _("displaced: LDM: restored "
+				"clobbered register r%d\n"), write_reg);
+	  clobbered &= ~(1 << write_reg);
+	}
+    }
+  
+  /* Perform register writeback manually.  */
+  if (dsc->u.block.writeback)
+    {
+      ULONGEST new_rn_val = dsc->u.block.xfer_addr;
+      
+      if (dsc->u.block.increment)
+        new_rn_val += regs_loaded * 4;
+      else
+	new_rn_val -= regs_loaded * 4;
+      
+      displaced_write_reg (regs, dsc, dsc->u.block.rn, new_rn_val,
+			   CANNOT_WRITE_PC);
+    }
+}
+
+/* Handle ldm/stm, apart from some tricky cases which are unlikely to occur
+   in user-level code (in particular exception return, ldm rn, {...pc}^).  */
+
+static int
+copy_block_xfer (unsigned long insn, struct regcache *regs,
+		 struct displaced_step_closure *dsc)
+{
+  int load = bit (insn, 20);
+  int user = bit (insn, 22);
+  int increment = bit (insn, 23);
+  int before = bit (insn, 24);
+  int writeback = bit (insn, 21);
+  int rn = bits (insn, 16, 19);
+  CORE_ADDR from = dsc->insn_addr;
+
+  /* Block transfers which don't mention PC can be run directly out-of-line.  */
+  if (rn != 15 && (insn & 0x8000) == 0)
+    return copy_unmodified (insn, "ldm/stm", dsc);
+
+  if (rn == 15)
+    {
+      warning (_("displaced: Unpredictable LDM or STM with base register r15"));
+      return copy_unmodified (insn, "unpredictable ldm/stm", dsc);
+    }
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying block transfer insn "
+			"%.8lx\n", insn);
+
+  dsc->u.block.xfer_addr = displaced_read_reg (regs, from, rn);
+  dsc->u.block.rn = rn;
+
+  dsc->u.block.load = load;
+  dsc->u.block.user = user;
+  dsc->u.block.increment = increment;
+  dsc->u.block.before = before;
+  dsc->u.block.writeback = writeback;
+  dsc->u.block.cond = bits (insn, 28, 31);
+
+  dsc->u.block.regmask = insn & 0xffff;
+
+  if (load)
+    {
+      if ((insn & 0xffff) == 0xffff)
+	{
+	  /* LDM with a fully-populated register list.  This case is
+             particularly tricky.  Implement for now by fully emulating the
+	     instruction (which might not behave perfectly in all cases, but
+	     these instructions should be rare enough for that not to matter
+	     too much).  */
+	  dsc->modinsn[0] = ARM_NOP;
+
+	  dsc->cleanup = &cleanup_block_load_all;
+	}
+      else
+	{
+	  /* LDM of a list of registers which includes PC.  Implement by
+             rewriting the list of registers to be transferred into a
+	     contiguous chunk r0...rX before doing the transfer, then shuffling
+	     registers into the correct places in the cleanup routine.  */
+	  unsigned int regmask = insn & 0xffff;
+	  unsigned int num_in_list = bitcount (regmask), new_regmask;
+	  unsigned int i;
+
+	  for (i = 0; i < num_in_list; i++)
+	    dsc->tmp[i] = displaced_read_reg (regs, from, i);
+
+	  /* Writeback makes things complicated.  We need to avoid clobbering
+	     the base register with one of the registers in our modified
+	     register list, but just using a different register can't work in
+	     all cases, e.g.:
+
+	       ldm r14!, {r0-r13,pc}
+
+	     which would need to be rewritten as:
+
+	       ldm rN!, {r0-r14}
+
+	     but that can't work, because there's no free register for N.
+
+	     Solve this by turning off the writeback bit, and emulating
+	     writeback manually in the cleanup routine.  */
+	      
+	  if (writeback)
+	    insn &= ~(1 << 21);
+
+	  new_regmask = (1 << num_in_list) - 1;
+
+	  if (debug_displaced)
+	    fprintf_unfiltered (gdb_stdlog, _("displaced: LDM r%d%s, "
+				"{..., pc}: original reg list %.4x, modified "
+				"list %.4x\n"), rn, writeback ? "!" : "",
+				(int) insn & 0xffff, new_regmask);
+
+	  dsc->modinsn[0] = (insn & ~0xffff) | (new_regmask & 0xffff);
+
+	  dsc->cleanup = &cleanup_block_load_pc;
+	}
+    }
+  else
+    {
+      /* STM of a list of registers which includes PC.  Run the instruction
+	 as-is, but out of line: this will store the wrong value for the PC,
+	 so we must manually fix up the memory in the cleanup routine. 
+	 Doing things this way has the advantage that we can auto-detect
+	 the offset of the PC write (which is architecture-dependent) in
+	 the cleanup routine.  */
+      dsc->modinsn[0] = insn;
+
+      dsc->cleanup = &cleanup_block_store_pc;
+    }
+
+  return 0;
+}
+
+/* Cleanup/copy SVC (SWI) instructions.  */
+
+static void
+cleanup_svc (struct regcache *regs, struct displaced_step_closure *dsc)
+{
+  CORE_ADDR from = dsc->insn_addr;
+  CORE_ADDR to = dsc->tmp[0];
+  ULONGEST pc;
+
+  /* Note: we want the real PC, so don't use displaced_read_reg here.  */
+  regcache_cooked_read_unsigned (regs, ARM_PC_REGNUM, &pc);
+
+  displaced_write_reg (regs, dsc, ARM_PC_REGNUM, from + 4, BRANCH_WRITE_PC);
+}
+
+static int
+copy_svc (unsigned long insn, CORE_ADDR to, struct regcache *regs,
+	  struct displaced_step_closure *dsc)
+{
+  CORE_ADDR from = dsc->insn_addr;
+  unsigned int svc_number;
+
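+  /* The syscall number is passed in r7 (ARM Linux EABI); read it
+     unconditionally, since the switch below depends on it.  */
+  svc_number = displaced_read_reg (regs, from, 7);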
+  
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying svc insn %.8lx "
+			"(r7 = %d)\n", insn, svc_number);
+
+  switch (svc_number)
+    {
+    case 119:
+    case 173:
+      warning (_("displaced: Apparently single-stepping sigreturn SVC call. "
+		 "This might not work properly!"));
+    }
+
+  /* Preparation: tmp[0] <- to.
+     Insn: unmodified svc.
+     Cleanup: pc <- insn_addr + 4.  */
+
+  dsc->tmp[0] = to;
+  dsc->modinsn[0] = insn;
+
+  dsc->cleanup = &cleanup_svc;
+  /* Pretend we wrote to the PC, so cleanup doesn't set PC to the next
+     instruction.  */
+  dsc->wrote_to_pc = 1;
+
+  return 0;
+}
+
+/* Copy undefined instructions.  */
+
+static int
+copy_undef (unsigned long insn, struct displaced_step_closure *dsc)
+{
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying undefined insn %.8lx\n",
+			insn);
+
+  dsc->modinsn[0] = insn;
+
+  return 0;
+}
+
+/* Copy unpredictable instructions.  */
+
+static int
+copy_unpred (unsigned long insn, struct displaced_step_closure *dsc)
+{
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copying unpredictable insn "
+			"%.8lx\n", insn);
+
+  dsc->modinsn[0] = insn;
+
+  return 0;
+}
+
+/* The decode_* functions are instruction decoding helpers.  They mostly follow
+   the presentation in the ARM ARM.  */
+
+static int
+decode_misc_memhint_neon (unsigned long insn, struct regcache *regs,
+			  struct displaced_step_closure *dsc)
+{
+  unsigned int op1 = bits (insn, 20, 26), op2 = bits (insn, 4, 7);
+  unsigned int rn = bits (insn, 16, 19);
+
+  if (op1 == 0x10 && (op2 & 0x2) == 0x0 && (rn & 0x1) == 0x0)
+    return copy_unmodified (insn, "cps", dsc);
+  else if (op1 == 0x10 && op2 == 0x0 && (rn & 0x1) == 0x1)
+    return copy_unmodified (insn, "setend", dsc);
+  else if ((op1 & 0x60) == 0x20)
+    return copy_unmodified (insn, "neon dataproc", dsc);
+  else if ((op1 & 0x71) == 0x40)
+    return copy_unmodified (insn, "neon elt/struct load/store", dsc);
+  else if ((op1 & 0x77) == 0x41)
+    return copy_unmodified (insn, "unallocated mem hint", dsc);
+  else if ((op1 & 0x77) == 0x45)
+    return copy_preload (insn, regs, dsc);  /* pli.  */
+  else if ((op1 & 0x77) == 0x51)
+    {
+      if (rn != 0xf)
+        return copy_preload (insn, regs, dsc);  /* pld/pldw.  */
+      else
+        return copy_unpred (insn, dsc);
+    }
+  else if ((op1 & 0x77) == 0x55)
+    return copy_preload (insn, regs, dsc);  /* pld/pldw.  */
+  else if (op1 == 0x57)
+    switch (op2)
+      {
+      case 0x1: return copy_unmodified (insn, "clrex", dsc);
+      case 0x4: return copy_unmodified (insn, "dsb", dsc);
+      case 0x5: return copy_unmodified (insn, "dmb", dsc);
+      case 0x6: return copy_unmodified (insn, "isb", dsc);
+      default: return copy_unpred (insn, dsc);
+      }
+  else if ((op1 & 0x63) == 0x43)
+    return copy_unpred (insn, dsc);
+  else if ((op2 & 0x1) == 0x0)
+    switch (op1 & ~0x80)
+      {
+      case 0x61:
+	return copy_unmodified (insn, "unallocated mem hint", dsc);
+      case 0x65:
+	return copy_preload_reg (insn, regs, dsc);  /* pli reg.  */
+      case 0x71: case 0x75:
+	return copy_preload_reg (insn, regs, dsc);  /* pld/pldw reg.  */
+      case 0x63: case 0x67: case 0x73: case 0x77:
+	return copy_unpred (insn, dsc);
+      default:
+	return copy_undef (insn, dsc);
+      }
+  else
+    return copy_undef (insn, dsc);  /* Probably unreachable.  */
+}
+
+static int
+decode_unconditional (unsigned long insn, struct regcache *regs,
+		      struct displaced_step_closure *dsc)
+{
+  if (bit (insn, 27) == 0)
+    return decode_misc_memhint_neon (insn, regs, dsc);
+  /* Switch on bits: 0bxxxxx321xxx0xxxxxxxxxxxxxxxxxxxx.  */
+  else switch (((insn & 0x7000000) >> 23) | ((insn & 0x100000) >> 20))
+    {
+    case 0x0: case 0x2:
+      return copy_unmodified (insn, "srs", dsc);
+
+    case 0x1: case 0x3:
+      return copy_unmodified (insn, "rfe", dsc);
+
+    case 0x4: case 0x5: case 0x6: case 0x7:
+      return copy_b_bl_blx (insn, regs, dsc);
+
+    case 0x8:
+      switch ((insn & 0xe00000) >> 21)
+	{
+	case 0x1: case 0x3: case 0x4: case 0x5: case 0x6: case 0x7:
+	  return copy_copro_load_store (insn, regs, dsc); /* stc/stc2.  */
+
+	case 0x2:
+	  return copy_unmodified (insn, "mcrr/mcrr2", dsc);
+
+	default:
+	  return copy_undef (insn, dsc);
+	}
+
+    case 0x9:
+      {
+        int rn_f = (bits (insn, 16, 19) == 0xf);
+	switch ((insn & 0xe00000) >> 21)
+	  {
+	  case 0x1: case 0x3:
+	    /* ldc/ldc2 imm (undefined for rn == pc).  */
+	    return rn_f ? copy_undef (insn, dsc)
+			: copy_copro_load_store (insn, regs, dsc);
+
+	  case 0x2:
+	    return copy_unmodified (insn, "mrrc/mrrc2", dsc);
+
+	  case 0x4: case 0x5: case 0x6: case 0x7:
+	    /* ldc/ldc2 lit (undefined for rn != pc).  */
+	    return rn_f ? copy_copro_load_store (insn, regs, dsc)
+			: copy_undef (insn, dsc);
+
+	  default:
+	    return copy_undef (insn, dsc);
+	  }
+      }
+
+    case 0xa:
+      return copy_unmodified (insn, "stc/stc2", dsc);
+
+    case 0xb:
+      if (bits (insn, 16, 19) == 0xf)
+        return copy_copro_load_store (insn, regs, dsc);  /* ldc/ldc2 lit.  */
+      else
+        return copy_undef (insn, dsc);
+
+    case 0xc:
+      if (bit (insn, 4))
+	return copy_unmodified (insn, "mcr/mcr2", dsc);
+      else
+	return copy_unmodified (insn, "cdp/cdp2", dsc);
+
+    case 0xd:
+      if (bit (insn, 4))
+        return copy_unmodified (insn, "mrc/mrc2", dsc);
+      else
+	return copy_unmodified (insn, "cdp/cdp2", dsc);
+
+    default:
+      return copy_undef (insn, dsc);
+    }
+}
+
+/* Decode miscellaneous instructions in dp/misc encoding space.  */
+
+static int
+decode_miscellaneous (unsigned long insn, struct regcache *regs,
+		      struct displaced_step_closure *dsc)
+{
+  unsigned int op2 = bits (insn, 4, 6);
+  unsigned int op = bits (insn, 21, 22);
+  unsigned int op1 = bits (insn, 16, 19);
+
+  switch (op2)
+    {
+    case 0x0:
+      return copy_unmodified (insn, "mrs/msr", dsc);
+
+    case 0x1:
+      if (op == 0x1)  /* bx.  */
+        return copy_bx_blx_reg (insn, regs, dsc);
+      else if (op == 0x3)
+        return copy_unmodified (insn, "clz", dsc);
+      else
+        return copy_undef (insn, dsc);
+
+    case 0x2:
+      if (op == 0x1)
+        return copy_unmodified (insn, "bxj", dsc);  /* Not really supported.  */
+      else
+        return copy_undef (insn, dsc);
+
+    case 0x3:
+      if (op == 0x1)
+        return copy_bx_blx_reg (insn, regs, dsc);  /* blx register.  */
+      else
+        return copy_undef (insn, dsc);
+
+    case 0x5:
+      return copy_unmodified (insn, "saturating add/sub", dsc);
+
+    case 0x7:
+      if (op == 0x1)
+        return copy_unmodified (insn, "bkpt", dsc);
+      else if (op == 0x3)
+        return copy_unmodified (insn, "smc", dsc);  /* Not really supported.  */
+      /* Otherwise undefined: fall through.  */
+    default:
+      return copy_undef (insn, dsc);
+    }
+}
+
+static int
+decode_dp_misc (unsigned long insn, struct regcache *regs,
+		struct displaced_step_closure *dsc)
+{
+  if (bit (insn, 25))
+    switch (bits (insn, 20, 24))
+      {
+      case 0x10:
+        return copy_unmodified (insn, "movw", dsc);
+
+      case 0x14:
+        return copy_unmodified (insn, "movt", dsc);
+
+      case 0x12: case 0x16:
+        return copy_unmodified (insn, "msr imm", dsc);
+
+      default:
+        return copy_alu_imm (insn, regs, dsc);
+      }
+  else
+    {
+      unsigned long op1 = bits (insn, 20, 24), op2 = bits (insn, 4, 7);
+
+      if ((op1 & 0x19) != 0x10 && (op2 & 0x1) == 0x0)
+        return copy_alu_reg (insn, regs, dsc);
+      else if ((op1 & 0x19) != 0x10 && (op2 & 0x9) == 0x1)
+        return copy_alu_shifted_reg (insn, regs, dsc);
+      else if ((op1 & 0x19) == 0x10 && (op2 & 0x8) == 0x0)
+        return decode_miscellaneous (insn, regs, dsc);
+      else if ((op1 & 0x19) == 0x10 && (op2 & 0x9) == 0x8)
+        return copy_unmodified (insn, "halfword mul/mla", dsc);
+      else if ((op1 & 0x10) == 0x00 && op2 == 0x9)
+        return copy_unmodified (insn, "mul/mla", dsc);
+      else if ((op1 & 0x10) == 0x10 && op2 == 0x9)
+        return copy_unmodified (insn, "synch", dsc);
+      else if (op2 == 0xb || (op2 & 0xd) == 0xd)
+        /* 2nd arg means "unprivileged".  */
+        return copy_extra_ld_st (insn, (op1 & 0x12) == 0x02, regs, dsc);
+    }
+
+  /* Should be unreachable.  */
+  return 1;
+}
+
+static int
+decode_ld_st_word_ubyte (unsigned long insn, struct regcache *regs,
+			 struct displaced_step_closure *dsc)
+{
+  int a = bit (insn, 25), b = bit (insn, 4);
+  unsigned long op1 = bits (insn, 20, 24);
+  int rn_f = bits (insn, 16, 19) == 0xf;
+
+  if ((!a && (op1 & 0x05) == 0x00 && (op1 & 0x17) != 0x02)
+      || (a && (op1 & 0x05) == 0x00 && (op1 & 0x17) != 0x02 && !b))
+    return copy_ldr_str_ldrb_strb (insn, regs, dsc, 0, 0, 0);
+  else if ((!a && (op1 & 0x17) == 0x02)
+           || (a && (op1 & 0x17) == 0x02 && !b))
+    return copy_ldr_str_ldrb_strb (insn, regs, dsc, 0, 0, 1);
+  else if ((!a && (op1 & 0x05) == 0x01 && (op1 & 0x17) != 0x03)
+           || (a && (op1 & 0x05) == 0x01 && (op1 & 0x17) != 0x03 && !b))
+    return copy_ldr_str_ldrb_strb (insn, regs, dsc, 1, 0, 0);
+  else if ((!a && (op1 & 0x17) == 0x03)
+	   || (a && (op1 & 0x17) == 0x03 && !b))
+    return copy_ldr_str_ldrb_strb (insn, regs, dsc, 1, 0, 1);
+  else if ((!a && (op1 & 0x05) == 0x04 && (op1 & 0x17) != 0x06)
+           || (a && (op1 & 0x05) == 0x04 && (op1 & 0x17) != 0x06 && !b))
+    return copy_ldr_str_ldrb_strb (insn, regs, dsc, 0, 1, 0);
+  else if ((!a && (op1 & 0x17) == 0x06)
+	   || (a && (op1 & 0x17) == 0x06 && !b))
+    return copy_ldr_str_ldrb_strb (insn, regs, dsc, 0, 1, 1);
+  else if ((!a && (op1 & 0x05) == 0x05 && (op1 & 0x17) != 0x07)
+	   || (a && (op1 & 0x05) == 0x05 && (op1 & 0x17) != 0x07 && !b))
+    return copy_ldr_str_ldrb_strb (insn, regs, dsc, 1, 1, 0);
+  else if ((!a && (op1 & 0x17) == 0x07)
+	   || (a && (op1 & 0x17) == 0x07 && !b))
+    return copy_ldr_str_ldrb_strb (insn, regs, dsc, 1, 1, 1);
+
+  /* Should be unreachable.  */
+  return 1;
+}
+
+static int
+decode_media (unsigned long insn, struct displaced_step_closure *dsc)
+{
+  switch (bits (insn, 20, 24))
+    {
+    case 0x00: case 0x01: case 0x02: case 0x03:
+      return copy_unmodified (insn, "parallel add/sub signed", dsc);
+
+    case 0x04: case 0x05: case 0x06: case 0x07:
+      return copy_unmodified (insn, "parallel add/sub unsigned", dsc);
+
+    case 0x08: case 0x09: case 0x0a: case 0x0b:
+    case 0x0c: case 0x0d: case 0x0e: case 0x0f:
+      return copy_unmodified (insn, "decode/pack/unpack/saturate/reverse", dsc);
+
+    case 0x18:
+      if (bits (insn, 5, 7) == 0)  /* op2.  */
+        {
+	  if (bits (insn, 12, 15) == 0xf)
+	    return copy_unmodified (insn, "usad8", dsc);
+	  else
+	    return copy_unmodified (insn, "usada8", dsc);
+	}
+      else
+        return copy_undef (insn, dsc);
+
+    case 0x1a: case 0x1b:
+      if (bits (insn, 5, 6) == 0x2)  /* op2[1:0].  */
+	return copy_unmodified (insn, "sbfx", dsc);
+      else
+        return copy_undef (insn, dsc);
+
+    case 0x1c: case 0x1d:
+      if (bits (insn, 5, 6) == 0x0)  /* op2[1:0].  */
+        {
+	  if (bits (insn, 0, 3) == 0xf)
+	    return copy_unmodified (insn, "bfc", dsc);
+	  else
+	    return copy_unmodified (insn, "bfi", dsc);
+	}
+      else
+        return copy_undef (insn, dsc);
+
+    case 0x1e: case 0x1f:
+      if (bits (insn, 5, 6) == 0x2)  /* op2[1:0].  */
+        return copy_unmodified (insn, "ubfx", dsc);
+      else
+        return copy_undef (insn, dsc);
+    }
+
+  /* Should be unreachable.  */
+  return 1;
+}
+
+static int
+decode_b_bl_ldmstm (unsigned long insn, struct regcache *regs,
+		    struct displaced_step_closure *dsc)
+{
+  if (bit (insn, 25))
+    return copy_b_bl_blx (insn, regs, dsc);
+  else
+    return copy_block_xfer (insn, regs, dsc);
+}
+
+static int
+decode_ext_reg_ld_st (unsigned long insn, struct regcache *regs,
+		      struct displaced_step_closure *dsc)
+{
+  unsigned int opcode = bits (insn, 20, 24);
+
+  switch (opcode)
+    {
+    case 0x04: case 0x05:  /* VFP/Neon mrrc/mcrr.  */
+      return copy_unmodified (insn, "vfp/neon mrrc/mcrr", dsc);
+
+    case 0x08: case 0x0a: case 0x0c: case 0x0e:
+    case 0x12: case 0x16:
+      return copy_unmodified (insn, "vfp/neon vstm/vpush", dsc);
+
+    case 0x09: case 0x0b: case 0x0d: case 0x0f:
+    case 0x13: case 0x17:
+      return copy_unmodified (insn, "vfp/neon vldm/vpop", dsc);
+
+    case 0x10: case 0x14: case 0x18: case 0x1c:  /* vstr.  */
+    case 0x11: case 0x15: case 0x19: case 0x1d:  /* vldr.  */
+      /* Note: no writeback for these instructions.  Bit 25 will always be
+	 zero though (via caller), so the following works OK.  */
+      return copy_copro_load_store (insn, regs, dsc);
+    }
+
+  /* Should be unreachable.  */
+  return 1;
+}
+
+static int
+decode_svc_copro (unsigned long insn, CORE_ADDR to, struct regcache *regs,
+		  struct displaced_step_closure *dsc)
+{
+  unsigned int op1 = bits (insn, 20, 25);
+  int op = bit (insn, 4);
+  unsigned int coproc = bits (insn, 8, 11);
+  unsigned int rn = bits (insn, 16, 19);
+
+  if ((op1 & 0x20) == 0x00 && (op1 & 0x3a) != 0x00 && (coproc & 0xe) == 0xa)
+    return decode_ext_reg_ld_st (insn, regs, dsc);
+  else if ((op1 & 0x21) == 0x00 && (op1 & 0x3a) != 0x00
+	   && (coproc & 0xe) != 0xa)
+    return copy_copro_load_store (insn, regs, dsc);  /* stc/stc2.  */
+  else if ((op1 & 0x21) == 0x01 && (op1 & 0x3a) != 0x00
+	   && (coproc & 0xe) != 0xa)
+    return copy_copro_load_store (insn, regs, dsc);  /* ldc/ldc2 imm/lit.  */
+  else if ((op1 & 0x3e) == 0x00)
+    return copy_undef (insn, dsc);
+  else if ((op1 & 0x3e) == 0x04 && (coproc & 0xe) == 0xa)
+    return copy_unmodified (insn, "neon 64bit xfer", dsc);
+  else if (op1 == 0x04 && (coproc & 0xe) != 0xa)
+    return copy_unmodified (insn, "mcrr/mcrr2", dsc);
+  else if (op1 == 0x05 && (coproc & 0xe) != 0xa)
+    return copy_unmodified (insn, "mrrc/mrrc2", dsc);
+  else if ((op1 & 0x30) == 0x20 && !op)
+    {
+      if ((coproc & 0xe) == 0xa)
+	return copy_unmodified (insn, "vfp dataproc", dsc);
+      else
+        return copy_unmodified (insn, "cdp/cdp2", dsc);
+    }
+  else if ((op1 & 0x30) == 0x20 && op && (coproc & 0xe) == 0xa)
+    return copy_unmodified (insn, "neon 8/16/32 bit xfer", dsc);
+  else if ((op1 & 0x31) == 0x20 && op && (coproc & 0xe) != 0xa)
+    return copy_unmodified (insn, "mcr/mcr2", dsc);
+  else if ((op1 & 0x31) == 0x21 && op && (coproc & 0xe) != 0xa)
+    return copy_unmodified (insn, "mrc/mrc2", dsc);
+  else if ((op1 & 0x30) == 0x30)
+    return copy_svc (insn, to, regs, dsc);
+  else
+    return copy_undef (insn, dsc);  /* Possibly unreachable.  */
+}
+
+static struct displaced_step_closure *
+arm_process_displaced_insn (unsigned long insn, CORE_ADDR from, CORE_ADDR to,
+			    struct regcache *regs)
+{
+  struct displaced_step_closure *dsc
+    = xmalloc (sizeof (struct displaced_step_closure));
+  int err = 0;
+
+  /* Most displaced instructions use a 1-instruction scratch space, so set this
+     here and override below if/when necessary.  */
+  dsc->numinsns = 1;
+  dsc->insn_addr = from;
+  dsc->scratch_base = to;
+  dsc->cleanup = NULL;
+  dsc->wrote_to_pc = 0;
+
+  if ((insn & 0xf0000000) == 0xf0000000)
+    err = decode_unconditional (insn, regs, dsc);
+  else switch (((insn & 0x10) >> 4) | ((insn & 0xe000000) >> 24))
+    {
+    case 0x0: case 0x1: case 0x2: case 0x3:
+      err = decode_dp_misc (insn, regs, dsc);
+      break;
+
+    case 0x4: case 0x5: case 0x6:
+      err = decode_ld_st_word_ubyte (insn, regs, dsc);
+      break;
+
+    case 0x7:
+      err = decode_media (insn, dsc);
+      break;
+
+    case 0x8: case 0x9: case 0xa: case 0xb:
+      err = decode_b_bl_ldmstm (insn, regs, dsc);
+      break;
+
+    case 0xc: case 0xd: case 0xe: case 0xf:
+      err = decode_svc_copro (insn, to, regs, dsc);
+      break;
+    }
+
+  if (err)
+    internal_error (__FILE__, __LINE__,
+		    _("arm_process_displaced_insn: Instruction decode error"));
+
+  return dsc;
+}
+
+/* Actually set up the scratch space for a displaced instruction.  */
+
+struct displaced_step_closure *
+arm_displaced_init_closure (struct gdbarch *gdbarch, CORE_ADDR from,
+			    CORE_ADDR to, struct displaced_step_closure *dsc)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+  unsigned int i;
+
+  /* Poke modified instruction(s).  */
+  for (i = 0; i < dsc->numinsns; i++)
+    {
+      if (debug_displaced)
+        fprintf_unfiltered (gdb_stdlog, "displaced: writing insn %.8lx at "
+			    "%.8lx\n", (unsigned long) dsc->modinsn[i],
+			    (unsigned long) to + i * 4);
+      write_memory_unsigned_integer (to + i * 4, 4, dsc->modinsn[i]);
+    }
+
+  /* Put breakpoint afterwards.  */
+  write_memory (to + dsc->numinsns * 4, tdep->arm_breakpoint,
+		tdep->arm_breakpoint_size);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: copy 0x%s->0x%s: ",
+			paddr_nz (from), paddr_nz (to));
+
+  return dsc;
+}
+
+/* Entry point for copying an instruction into scratch space for displaced
+   stepping.  */
+
+struct displaced_step_closure *
+arm_displaced_step_copy_insn (struct gdbarch *gdbarch,
+			      CORE_ADDR from, CORE_ADDR to,
+			      struct regcache *regs)
+{
+  const size_t len = 4;
+  struct displaced_step_closure *dsc;
+  unsigned long insn;
+
+  if (!displaced_in_arm_mode (regs))
+    error (_("Displaced stepping is only supported in ARM mode"));
+
+  insn = read_memory_unsigned_integer (from, len);
+
+  if (debug_displaced)
+    fprintf_unfiltered (gdb_stdlog, "displaced: stepping insn %.8lx "
+			"at %.8lx\n", insn, (unsigned long) from);
+
+  dsc = arm_process_displaced_insn (insn, from, to, regs);
+
+  return arm_displaced_init_closure (gdbarch, from, to, dsc);
+}
+
+/* Entry point for cleaning things up after a displaced instruction has been
+   single-stepped.  */
+
+void
+arm_displaced_step_fixup (struct gdbarch *gdbarch,
+			  struct displaced_step_closure *dsc,
+			  CORE_ADDR from, CORE_ADDR to,
+			  struct regcache *regs)
+{
+  if (dsc->cleanup)
+    dsc->cleanup (regs, dsc);
+
+  if (!dsc->wrote_to_pc)
+    regcache_cooked_write_unsigned (regs, ARM_PC_REGNUM, dsc->insn_addr + 4);
+}
+
+
 #include "bfd-in2.h"
 #include "libcoff.h"
 
@@ -3252,6 +5079,11 @@ arm_gdbarch_init (struct gdbarch_info in
   /* On ARM targets char defaults to unsigned.  */
   set_gdbarch_char_signed (gdbarch, 0);
 
+  /* Note: for displaced stepping, this includes the breakpoint, and one word
+     of additional scratch space.  This setting isn't used for anything besides
+     displaced stepping at present.  */
+  set_gdbarch_max_insn_length (gdbarch, 4 * DISPLACED_MODIFIED_INSNS);
+
   /* This should be low enough for everything.  */
   tdep->lowest_pc = 0x20;
   tdep->jb_pc = -1;	/* Longjump support not enabled by default.  */
--- .pc/displaced-stepping/gdb/arm-tdep.h	2009-05-15 16:05:07.000000000 -0700
+++ gdb/arm-tdep.h	2009-05-16 10:16:52.000000000 -0700
@@ -172,11 +172,96 @@ struct gdbarch_tdep
   struct regset *gregset, *fpregset;
 };
 
+/* Structures used for displaced stepping.  */
+
+/* The maximum number of temporaries available for displaced instructions.  */
+#define DISPLACED_TEMPS			16
+/* The maximum number of modified instructions generated for one single-stepped
+   instruction, including the breakpoint (usually at the end of the instruction
+   sequence) and any scratch words, etc.  */
+#define DISPLACED_MODIFIED_INSNS	8
+
+struct displaced_step_closure
+{
+  ULONGEST tmp[DISPLACED_TEMPS];
+  int rd;
+  int wrote_to_pc;
+  union
+  {
+    struct
+    {
+      int xfersize;
+      int rn;			   /* Writeback register.  */
+      unsigned int immed : 1;      /* Offset is immediate.  */
+      unsigned int writeback : 1;  /* Perform base-register writeback.  */
+      unsigned int restore_r4 : 1; /* Used r4 as scratch.  */
+    } ldst;
+
+    struct
+    {
+      unsigned long dest;
+      unsigned int link : 1;
+      unsigned int exchange : 1;
+      unsigned int cond : 4;
+    } branch;
+
+    struct
+    {
+      unsigned int regmask;
+      int rn;
+      CORE_ADDR xfer_addr;
+      unsigned int load : 1;
+      unsigned int user : 1;
+      unsigned int increment : 1;
+      unsigned int before : 1;
+      unsigned int writeback : 1;
+      unsigned int cond : 4;
+    } block;
+
+    struct
+    {
+      unsigned int immed : 1;
+    } preload;
+  } u;
+  unsigned long modinsn[DISPLACED_MODIFIED_INSNS];
+  int numinsns;
+  CORE_ADDR insn_addr;
+  CORE_ADDR scratch_base;
+  void (*cleanup) (struct regcache *, struct displaced_step_closure *);
+};
+
+/* Values for the WRITE_PC argument to displaced_write_reg.  If the register
+   write may write to the PC, specifies the way the CPSR T bit, etc. is
+   modified by the instruction.  */
+
+enum pc_write_style
+{
+  BRANCH_WRITE_PC,
+  BX_WRITE_PC,
+  LOAD_WRITE_PC,
+  ALU_WRITE_PC,
+  CANNOT_WRITE_PC
+};
+
+struct displaced_step_closure *
+  arm_displaced_init_closure (struct gdbarch *gdbarch, CORE_ADDR from,
+			      CORE_ADDR to, struct displaced_step_closure *dsc);
+ULONGEST displaced_read_reg (struct regcache *regs, CORE_ADDR from, int regno);
+void displaced_write_reg (struct regcache *regs,
+			  struct displaced_step_closure *dsc, int regno,
+			  ULONGEST val, enum pc_write_style write_pc);
 
 CORE_ADDR arm_skip_stub (struct frame_info *, CORE_ADDR);
 CORE_ADDR arm_get_next_pc (struct frame_info *, CORE_ADDR);
 int arm_software_single_step (struct frame_info *);
 
+extern struct displaced_step_closure *
+  arm_displaced_step_copy_insn (struct gdbarch *, CORE_ADDR, CORE_ADDR,
+				struct regcache *);
+extern void arm_displaced_step_fixup (struct gdbarch *,
+				      struct displaced_step_closure *,
+				      CORE_ADDR, CORE_ADDR, struct regcache *);
+
 /* Functions exported from armbsd-tdep.h.  */
 
 /* Return the appropriate register set for the core section identified

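For reference (not part of the patch): the top-level switch in
arm_process_displaced_insn packs bits 25-27 of the instruction together
with bit 4 into a 4-bit index, after first peeling off the unconditional
(cond == 0xf) space. A minimal standalone sketch of just that
computation follows. The helper names displaced_dispatch_index and
displaced_is_unconditional are hypothetical and do not appear in
arm-tdep.c.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch: compute the 4-bit index
   that arm_process_displaced_insn switches on.  Bits 25-27 of INSN end
   up in bits 1-3 of the result, and bit 4 of INSN ends up in bit 0.  */
static unsigned int
displaced_dispatch_index (uint32_t insn)
{
  return ((insn & 0x10) >> 4) | ((insn & 0xe000000) >> 24);
}

/* Hypothetical helper, not part of the patch: nonzero if INSN has a
   condition field of 0xf, which the patch passes to
   decode_unconditional before looking at the dispatch index.  */
static int
displaced_is_unconditional (uint32_t insn)
{
  return (insn & 0xf0000000) == 0xf0000000;
}
```

With these, a "bl" encoding such as 0xeb000000 gives index 0xa, which
falls in the 0x8-0xb range handled by decode_b_bl_ldmstm, and an "svc"
encoding such as 0xef000000 gives 0xe, which goes to decode_svc_copro.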