This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Re: [gold][aarch64] Patch for erratum-843419 (1 of 2 - report erratum occurrences)


Hi Cary, thanks a lot for the review. Some comments below.
I re-ran all the tests and they pass.

For ease of review, I attached two diffs: one against trunk, the other
against my previous patch.

Thanks,
Han

On Wed, Apr 15, 2015 at 3:41 PM, Cary Coutant <ccoutant@gmail.com> wrote:
> +    // "<" comparator used in ordered_map container.
> +    bool
> +    operator < (const Mapping_symbol_position& p) const
>
> For consistency, please write this as "operator<(", without spaces
> around the '<'.

Done.

>
> +  // Only erratum-fixing work needs mapping symbols, so skip this
> time consuming
> +  // processing if not fixing erratum.
> +  if (!parameters->options().fix_cortex_a53())
> +    return ;
>
> Extra space.

Done.

>
> +  // Read the symbol table section header.
> +  const unsigned int symtab_shndx = this->symtab_shndx();
> +  elfcpp::Shdr<size, big_endian>
> +      symtabshdr(this, this->elf_file()->section_header(symtab_shndx));
> +  gold_assert(symtabshdr.get_sh_type() == elfcpp::SHT_SYMTAB);
>
> It seems wasteful to set up these views of the symbol table, when
> do_count_local_symbols in the parent class has already done all that.
> Both arm and mips targets also do something like this, so a refactor
> of the code could benefit all three targets. I won't ask you to do this,
> but I'll make a note to look at this myself later.

Thanks!

>
> +  // Read the local symbols.
> +  const int sym_size =elfcpp::Elf_sizes<size>::sym_size;
>
> Space after '='.
>
> +  while (p != this->mapping_symbol_info_.end() &&
> +        p->first.shndx_ == shndx)
> +    {
> +      typename Mapping_symbol_info::const_iterator next =
> +       this->mapping_symbol_info_.upper_bound(p->first);
>
> Isn't this equivalent to "next = p; ++next"? With map, you'll only
> have a single entry per position, and you know that p->first is
> already in the map. It seems to me that you're doing an unnecessary
> binary search for every element of the map.

Yes, exactly, this is an unnecessary binary search.

>
> I'd suggest doing something like:
>
>       typename Mapping_symbol_info::const_iterator prev = p;
>       ++p;
>
> At the top of the loop, remove "p = next;" at the bottom, then replace
> "p" with "prev" and "next" with "p" in the body of the loop.

Done, yes, that's much clearer.
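
For readers following the archive, here is a minimal standalone sketch of
the resulting iteration pattern; the simplified key type and names are mine,
not the ones in gold/aarch64.cc. Consecutive map entries for the same section
are visited by just advancing the iterator, so no per-element upper_bound()
(binary search) is needed:

#include <cstdio>
#include <map>
#include <utility>

int main()
{
  // Key: (section index, offset); value: mapping symbol kind ('x' or 'd').
  typedef std::map<std::pair<unsigned int, unsigned long>, char> Info_map;
  Info_map info;
  info[std::make_pair(1u, 0x0ul)] = 'x';
  info[std::make_pair(1u, 0xff8ul)] = 'd';
  info[std::make_pair(2u, 0x10ul)] = 'x';

  const unsigned int shndx = 1;
  // lower_bound finds the first record of section SHNDX; after that the
  // entries are already ordered, so ++ is all that is required.
  Info_map::const_iterator p = info.lower_bound(std::make_pair(shndx, 0ul));
  while (p != info.end() && p->first.first == shndx)
    {
      Info_map::const_iterator prev = p;
      ++p;
      // [prev, p) delimits one span; it ends either at the next mapping
      // symbol or at the end of the section.
      bool last = (p == info.end() || p->first.first != shndx);
      std::printf("span start 0x%lx kind '%c'%s\n", prev->first.second,
                  prev->second, last ? " (last in section)" : "");
    }
  return 0;
}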

>
> +      // The 3rd insn is a load or store instruction from the "Load/store
> +      // register (unsigned immediate)" encoding class, using Rn as the
> +      // base address register.
> +      if (Insn_utilities::aarch64_ldst_uimm(insn3) &&
> +         Insn_utilities::aarch64_rn(insn3)
> +         == Insn_utilities::aarch64_rd(insn1))
>
> Add parentheses around the '==' operation, and indent accordingly.

Done.
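
As extra context for anyone reading the archive, the sketch below shows what
this three-instruction shape boils down to, using the same bit tests as the
patch's AArch64_insn_utilities helpers. The example encodings are
hand-assembled by me and purely illustrative (adrp x0; str w1, [x2];
ldr w3, [x0, #8]); the insn2 classification via aarch64_mem_op_p is not
reproduced here:

#include <cstdint>
#include <cstdio>

typedef std::uint32_t Insn;

static bool is_adrp(Insn insn)
{ return (insn & 0x9F000000) == 0x90000000; }

// "Load/store register (unsigned immediate)" encoding class.
static bool ldst_uimm(Insn insn)
{ return (insn & 0x3b000000) == 0x39000000; }

static unsigned int rn(Insn insn) { return (insn >> 5) & 0x1f; }
static unsigned int rd(Insn insn) { return insn & 0x1f; }

int main()
{
  Insn insn1 = 0x90000000;  // adrp x0, <page>   (Rd = x0)
  Insn insn2 = 0xB9000041;  // str  w1, [x2]     (single-register store)
  Insn insn3 = 0xB9400803;  // ldr  w3, [x0, #8] (unsigned-imm load, Rn = x0)

  // The third insn must be an unsigned-immediate load/store whose base
  // register Rn equals the destination Rd of the leading ADRP; insn2 is
  // classified separately (aarch64_mem_op_p) and not re-checked here.
  bool hit = is_adrp(insn1)
             && ldst_uimm(insn3)
             && (rn(insn3) == rd(insn1));
  std::printf("insn2 = 0x%08x; sequence %s the erratum-843419 shape\n",
              (unsigned int) insn2, hit ? "matches" : "does not match");
  return 0;
}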

>
> +  // Erratum sequence is at least 3 insns.
> +  if (span_end - span_start < 3 * Insn_utilities::BYTES_PER_INSN)
> +    return ;
>
> Extra space. This test is redundant -- you'll drop out of the for loop
> below immediately if the span isn't long enough. It's also wrong -- it
> will fail if there are exactly 3 instructions before the end.

Removed this part.

>
> +  Insntype* ip = reinterpret_cast<Insntype*>(input_view + span_start);
> +  unsigned int insn_index = 0;
> +  for (section_size_type i = span_start; i < span_end;
> +       i += Insn_utilities::BYTES_PER_INSN, ++insn_index)
> +    {
> +      if (span_end - i < 3 * Insn_utilities::BYTES_PER_INSN)
> +       break;
>
> Should be "<=".

Done.

>
> Given the constraints on the address, it's inefficient to loop over
> every address. I'd suggest instead something like this:
>
>   if (output_address & 0x03 != 0)

'output_address' is the output section's address, so I use
'output_address + span_start' here.

>     return;
>   section_size_type offset = 0;
>   section_size_type span_length = span_end - span_start;
>   // The first instruction must be at page offset 0xFF8 or 0xFFC.
>   unsigned int page_offset = output_address & 0xfff;

Same as above: 'output_address' is the output section's address, so
I use 'output_address + span_start' here.
Is there a consensus/guideline on whether to use '0xfff' or '0xFFF'?

>   if (page_offset < 0xff8)
>     offset = 0xff8 - page_offset;
>   while (offset + 3 * Insn_utilities::BYTES_PER_INSN <= span_length)
>     {
>       Insntype* ip = reinterpret_cast<Insntype*>(input_view + offset);
>       Insntype insn1 = ip[0];
>       if (!Insn_utilities::is_adrp(insn1))
>         continue;

'offset' needs to be updated before the 'continue', so I reversed the
condition and wrapped everything below inside the 'if' (see the small
simulation after the quoted suggestion below).

>       Insntype insn2 = ip[1];
>       Insntype insn3 = ip[2];
>
>       // ...
>
>       // Advance to next candidate instruction.
>       // We only consider instruction sequences starting at
>       // a page offset of 0xff8 or 0xffc.
>       page_offset = (output_address + offset) & 0xfff;

I use 'output_address + span_start + offset' here.

>       if (page_offset == 0xff8)
>         offset += 4;
>       else  // (page_offset == 0xffc)
>         offset += 0xffc;
>     }
>
> (I used '4' instead of Insn_utilities::BYTES_PER_INSN here, because we
> explicitly want to advance from 0xff8 to 0xffc.)
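
To double-check my understanding of the stepping, here is a tiny standalone
simulation, with made-up address and span values of my own, of how 'offset'
walks the 0xff8/0xffc candidate slots; it also shows why the offset update
has to run on every iteration instead of being skipped by an early
'continue':

#include <cstdio>

int main()
{
  const unsigned long bytes_per_insn = 4;
  unsigned long output_address = 0x400000;  // hypothetical section address
  unsigned long span_start = 0x10;          // hypothetical span start
  unsigned long span_length = 0x2100;       // hypothetical span length

  unsigned long page_offset = (output_address + span_start) & 0xFFF;
  unsigned long offset = (page_offset < 0xFF8) ? 0xFF8 - page_offset : 0;
  while (offset + 3 * bytes_per_insn <= span_length)
    {
      // A real scan would decode the instructions at span_start + offset
      // here; this sketch only prints the candidate positions.
      std::printf("candidate at page offset 0x%lx\n",
                  (output_address + span_start + offset) & 0xFFF);
      // This update must run on every iteration -- an early "continue"
      // above it would leave 'offset' unchanged and loop forever.
      page_offset = (output_address + span_start + offset) & 0xFFF;
      offset += (page_offset == 0xFF8) ? 4 : 0xFFC;
    }
  return 0;
}
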
>
> +      if (is_erratum_843419_sequence(insn1, insn2, insn3))
> +       do_report = true;
> +      else if (span_end - i > 16)
>
> Should be '>='. Do you want "4 * Insn_utilities::BYTES_PER_INSN"
> instead of 16? (I'm not sold on the usefulness of the symbolic constant;
> it seems we're pretty hard-wired to a value of 4.)
>
> +  return ;
>
> Extra space.

Done.

>
> +  DEFINE_bool(fix_cortex_a53, options::TWO_DASHES, '\0', false,
> +      N_("(AArch64 only) Scan and fix binaries for Cortex-A53 erratums."),
> +      N_("(AArch64 only) Do not scan for Cortex-A53 erratums."));
>
> The plural of "erratum" is "errata".

Done.

>
> -cary

-- 
Han Shen

diff --git a/gold/aarch64.cc b/gold/aarch64.cc
index dea64c0..f0b7ca8 100644
--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -21,10 +21,11 @@
 // MA 02110-1301, USA.
 
 #include "gold.h"
 
 #include <cstring>
+#include <map>
 
 #include "elfcpp.h"
 #include "dwarf.h"
 #include "parameters.h"
 #include "reloc.h"
@@ -63,10 +64,277 @@ template<int size, bool big_endian>
 class Target_aarch64;
 
 template<int size, bool big_endian>
 class AArch64_relocate_functions;
 
+// Utility class dealing with insns. This is ported from macros in
+// bfd/elfnn-aarch64.cc, but wrapped inside a class as static members. This
+// class is used in erratum sequence scanning.
+
+template<bool big_endian>
+class AArch64_insn_utilities
+{
+public:
+  typedef typename elfcpp::Swap<32, big_endian>::Valtype Insntype;
+
+  static const int BYTES_PER_INSN = 4;
+
+  static unsigned int
+  aarch64_bit(Insntype insn, int pos)
+  { return ((1 << pos)  & insn) >> pos; }
+
+  static unsigned int
+  aarch64_bits(Insntype insn, int pos, int l)
+  { return (insn >> pos) & ((1 << l) - 1); }
+
+  static bool
+  is_adrp(const Insntype insn)
+  { return (insn & 0x9F000000) == 0x90000000; }
+
+  static unsigned int
+  aarch64_rm(const Insntype insn)
+  { return aarch64_bits(insn, 16, 5); }
+
+  static unsigned int
+  aarch64_rn(const Insntype insn)
+  { return aarch64_bits(insn, 5, 5); }
+
+  static unsigned int
+  aarch64_rd(const Insntype insn)
+  { return aarch64_bits(insn, 0, 5); }
+
+  static unsigned int
+  aarch64_rt(const Insntype insn)
+  { return aarch64_bits(insn, 0, 5); }
+
+  static unsigned int
+  aarch64_rt2(const Insntype insn)
+  { return aarch64_bits(insn, 10, 5); }
+
+  static bool
+  aarch64_b(const Insntype insn)
+  { return (insn & 0xFC000000) == 0x14000000; }
+
+  static bool
+  aarch64_bl(const Insntype insn)
+  { return (insn & 0xFC000000) == 0x94000000; }
+
+  static bool
+  aarch64_blr(const Insntype insn)
+  { return (insn & 0xFFFFFC1F) == 0xD63F0000; }
+
+  static bool
+  aarch64_br(const Insntype insn)
+  { return (insn & 0xFFFFFC1F) == 0xD61F0000; }
+
+  // All ld/st ops.  See C4-182 of the ARM ARM.  The encoding space for
+  // LD_PCREL, LDST_RO, LDST_UI and LDST_UIMM cover prefetch ops.
+  static bool
+  aarch64_ld(Insntype insn) { return aarch64_bit(insn, 22) == 1; }
+
+  static bool
+  aarch64_ldst(Insntype insn)
+  { return (insn & 0x0a000000) == 0x08000000; }
+
+  static bool
+  aarch64_ldst_ex(Insntype insn)
+  { return (insn & 0x3f000000) == 0x08000000; }
+
+  static bool
+  aarch64_ldst_pcrel(Insntype insn)
+  { return (insn & 0x3b000000) == 0x18000000; }
+
+  static bool
+  aarch64_ldst_nap(Insntype insn)
+  { return (insn & 0x3b800000) == 0x28000000; }
+
+  static bool
+  aarch64_ldstp_pi(Insntype insn)
+  { return (insn & 0x3b800000) == 0x28800000; }
+
+  static bool
+  aarch64_ldstp_o(Insntype insn)
+  { return (insn & 0x3b800000) == 0x29000000; }
+
+  static bool
+  aarch64_ldstp_pre(Insntype insn)
+  { return (insn & 0x3b800000) == 0x29800000; }
+
+  static bool
+  aarch64_ldst_ui(Insntype insn)
+  { return (insn & 0x3b200c00) == 0x38000000; }
+
+  static bool
+  aarch64_ldst_piimm(Insntype insn)
+  { return (insn & 0x3b200c00) == 0x38000400; }
+
+  static bool
+  aarch64_ldst_u(Insntype insn)
+  { return (insn & 0x3b200c00) == 0x38000800; }
+
+  static bool
+  aarch64_ldst_preimm(Insntype insn)
+  { return (insn & 0x3b200c00) == 0x38000c00; }
+
+  static bool
+  aarch64_ldst_ro(Insntype insn)
+  { return (insn & 0x3b200c00) == 0x38200800; }
+
+  static bool
+  aarch64_ldst_uimm(Insntype insn)
+  { return (insn & 0x3b000000) == 0x39000000; }
+
+  static bool
+  aarch64_ldst_simd_m(Insntype insn)
+  { return (insn & 0xbfbf0000) == 0x0c000000; }
+
+  static bool
+  aarch64_ldst_simd_m_pi(Insntype insn)
+  { return (insn & 0xbfa00000) == 0x0c800000; }
+
+  static bool
+  aarch64_ldst_simd_s(Insntype insn)
+  { return (insn & 0xbf9f0000) == 0x0d000000; }
+
+  static bool
+  aarch64_ldst_simd_s_pi(Insntype insn)
+  { return (insn & 0xbf800000) == 0x0d800000; }
+
+  // Classify an INSN if it is indeed a load/store. Return true if INSN is a
+  // LD/ST instruction otherwise return false. For scalar LD/ST instructions
+  // PAIR is FALSE, RT is returned and RT2 is set equal to RT. For LD/ST pair
+  // instructions PAIR is TRUE, RT and RT2 are returned.
+  static bool
+  aarch64_mem_op_p(Insntype insn, unsigned int *rt, unsigned int *rt2,
+		   bool *pair, bool *load)
+  {
+    uint32_t opcode;
+    unsigned int r;
+    uint32_t opc = 0;
+    uint32_t v = 0;
+    uint32_t opc_v = 0;
+
+    /* Bail out quickly if INSN doesn't fall into the load-store
+       encoding space.  */
+    if (!aarch64_ldst (insn))
+      return false;
+
+    *pair = false;
+    *load = false;
+    if (aarch64_ldst_ex (insn))
+      {
+	*rt = aarch64_rt (insn);
+	*rt2 = *rt;
+	if (aarch64_bit (insn, 21) == 1)
+	  {
+	    *pair = true;
+	    *rt2 = aarch64_rt2 (insn);
+	  }
+	*load = aarch64_ld (insn);
+	return true;
+      }
+    else if (aarch64_ldst_nap (insn)
+	     || aarch64_ldstp_pi (insn)
+	     || aarch64_ldstp_o (insn)
+	     || aarch64_ldstp_pre (insn))
+      {
+	*pair = true;
+	*rt = aarch64_rt (insn);
+	*rt2 = aarch64_rt2 (insn);
+	*load = aarch64_ld (insn);
+	return true;
+      }
+    else if (aarch64_ldst_pcrel (insn)
+	     || aarch64_ldst_ui (insn)
+	     || aarch64_ldst_piimm (insn)
+	     || aarch64_ldst_u (insn)
+	     || aarch64_ldst_preimm (insn)
+	     || aarch64_ldst_ro (insn)
+	     || aarch64_ldst_uimm (insn))
+      {
+	*rt = aarch64_rt (insn);
+	*rt2 = *rt;
+	if (aarch64_ldst_pcrel (insn))
+	  *load = true;
+	opc = aarch64_bits (insn, 22, 2);
+	v = aarch64_bit (insn, 26);
+	opc_v = opc | (v << 2);
+	*load =  (opc_v == 1 || opc_v == 2 || opc_v == 3
+		  || opc_v == 5 || opc_v == 7);
+	return true;
+      }
+    else if (aarch64_ldst_simd_m (insn)
+	     || aarch64_ldst_simd_m_pi (insn))
+      {
+	*rt = aarch64_rt (insn);
+	*load = aarch64_bit (insn, 22);
+	opcode = (insn >> 12) & 0xf;
+	switch (opcode)
+	  {
+	  case 0:
+	  case 2:
+	    *rt2 = *rt + 3;
+	    break;
+
+	  case 4:
+	  case 6:
+	    *rt2 = *rt + 2;
+	    break;
+
+	  case 7:
+	    *rt2 = *rt;
+	    break;
+
+	  case 8:
+	  case 10:
+	    *rt2 = *rt + 1;
+	    break;
+
+	  default:
+	    return false;
+	  }
+	return true;
+      }
+    else if (aarch64_ldst_simd_s (insn)
+	     || aarch64_ldst_simd_s_pi (insn))
+      {
+	*rt = aarch64_rt (insn);
+	r = (insn >> 21) & 1;
+	*load = aarch64_bit (insn, 22);
+	opcode = (insn >> 13) & 0x7;
+	switch (opcode)
+	  {
+	  case 0:
+	  case 2:
+	  case 4:
+	    *rt2 = *rt + r;
+	    break;
+
+	  case 1:
+	  case 3:
+	  case 5:
+	    *rt2 = *rt + (r == 0 ? 2 : 3);
+	    break;
+
+	  case 6:
+	    *rt2 = *rt + r;
+	    break;
+
+	  case 7:
+	    *rt2 = *rt + (r == 0 ? 2 : 3);
+	    break;
+
+	  default:
+	    return false;
+	  }
+	return true;
+      }
+    return false;
+  }
+};
+
+
 // Output_data_got_aarch64 class.
 
 template<int size, bool big_endian>
 class Output_data_got_aarch64 : public Output_data_got<size, big_endian>
 {
@@ -914,11 +1182,18 @@ class AArch64_relobj : public Sized_relobj_file<size, big_endian>
   {
     gold_assert(shndx < this->stub_tables_.size());
     this->stub_tables_[shndx] = stub_table;
   }
 
- // Scan all relocation sections for stub generation.
+  // Entrance to erratum_843419 scanning.
+  void
+  scan_erratum_843419(unsigned int shndx,
+		      const elfcpp::Shdr<size, big_endian>&,
+		      Output_section*, const Symbol_table*,
+		      The_target_aarch64*);
+
+  // Scan all relocation sections for stub generation.
   void
   scan_sections_for_stubs(The_target_aarch64*, const Symbol_table*,
 			  const Layout*);
 
   // Whether a section is a scannable text section.
@@ -933,10 +1208,34 @@ class AArch64_relobj : public Sized_relobj_file<size, big_endian>
     // The stubs have relocations and we need to process them after writing
     // out the stubs.  So relocation now must follow section write.
     this->set_relocs_must_follow_section_writes();
   }
 
+  // Structure for mapping symbol position.
+  struct Mapping_symbol_position
+  {
+    Mapping_symbol_position(unsigned int shndx, AArch64_address offset):
+      shndx_(shndx), offset_(offset)
+    {}
+
+    // "<" comparator used in ordered_map container.
+    bool
+    operator<(const Mapping_symbol_position& p) const
+    {
+      return (this->shndx_ < p.shndx_
+	      || (this->shndx_ == p.shndx_ && this->offset_ < p.offset_));
+    }
+
+    // Section index.
+    unsigned int shndx_;
+
+    // Section offset.
+    AArch64_address offset_;
+  };
+
+  typedef std::map<Mapping_symbol_position, char> Mapping_symbol_info;
+
  protected:
   // Post constructor setup.
   void
   do_setup()
   {
@@ -951,22 +1250,111 @@ class AArch64_relobj : public Sized_relobj_file<size, big_endian>
   do_relocate_sections(
       const Symbol_table* symtab, const Layout* layout,
       const unsigned char* pshdrs, Output_file* of,
       typename Sized_relobj_file<size, big_endian>::Views* pviews);
 
+  // Count local symbols and (optionally) record mapping info.
+  virtual void
+  do_count_local_symbols(Stringpool_template<char>*,
+			 Stringpool_template<char>*);
+
  private:
   // Whether a section needs to be scanned for relocation stubs.
   bool
   section_needs_reloc_stub_scanning(const elfcpp::Shdr<size, big_endian>&,
 				    const Relobj::Output_sections&,
 				    const Symbol_table*, const unsigned char*);
 
   // List of stub tables.
   Stub_table_list stub_tables_;
+
+  // Mapping symbol information sorted by (section index, section_offset).
+  Mapping_symbol_info mapping_symbol_info_;
 };  // End of AArch64_relobj
 
 
+// Override to record mapping symbol information.
+template<int size, bool big_endian>
+void
+AArch64_relobj<size, big_endian>::do_count_local_symbols(
+    Stringpool_template<char>* pool, Stringpool_template<char>* dynpool)
+{
+  Sized_relobj_file<size, big_endian>::do_count_local_symbols(pool, dynpool);
+
+  // Only erratum-fixing work needs mapping symbols, so skip this time consuming
+  // processing if not fixing erratum.
+  if (!parameters->options().fix_cortex_a53())
+    return;
+
+  const unsigned int loccount = this->local_symbol_count();
+  if (loccount == 0)
+    return;
+
+  // Read the symbol table section header.
+  const unsigned int symtab_shndx = this->symtab_shndx();
+  elfcpp::Shdr<size, big_endian>
+      symtabshdr(this, this->elf_file()->section_header(symtab_shndx));
+  gold_assert(symtabshdr.get_sh_type() == elfcpp::SHT_SYMTAB);
+
+  // Read the local symbols.
+  const int sym_size = elfcpp::Elf_sizes<size>::sym_size;
+  gold_assert(loccount == symtabshdr.get_sh_info());
+  off_t locsize = loccount * sym_size;
+  const unsigned char* psyms = this->get_view(symtabshdr.get_sh_offset(),
+					      locsize, true, true);
+
+  // For mapping symbol processing, we need to read the symbol names.
+  unsigned int strtab_shndx = this->adjust_shndx(symtabshdr.get_sh_link());
+  if (strtab_shndx >= this->shnum())
+    {
+      this->error(_("invalid symbol table name index: %u"), strtab_shndx);
+      return;
+    }
+
+  elfcpp::Shdr<size, big_endian>
+    strtabshdr(this, this->elf_file()->section_header(strtab_shndx));
+  if (strtabshdr.get_sh_type() != elfcpp::SHT_STRTAB)
+    {
+      this->error(_("symbol table name section has wrong type: %u"),
+		  static_cast<unsigned int>(strtabshdr.get_sh_type()));
+      return;
+    }
+
+  const char* pnames =
+    reinterpret_cast<const char*>(this->get_view(strtabshdr.get_sh_offset(),
+						 strtabshdr.get_sh_size(),
+						 false, false));
+
+  // Skip the first dummy symbol.
+  psyms += sym_size;
+  typename Sized_relobj_file<size, big_endian>::Local_values*
+    plocal_values = this->local_values();
+  for (unsigned int i = 1; i < loccount; ++i, psyms += sym_size)
+    {
+      elfcpp::Sym<size, big_endian> sym(psyms);
+      Symbol_value<size>& lv((*plocal_values)[i]);
+      AArch64_address input_value = lv.input_value();
+
+      // Check to see if this is a mapping symbol.
+      const char* sym_name = pnames + sym.get_st_name();
+      if (sym_name[0] == '$' && (sym_name[1] == 'x' || sym_name[1] == 'd')
+	  && sym_name[2] == '\0')
+	{
+	  bool is_ordinary;
+	  unsigned int input_shndx =
+	    this->adjust_sym_shndx(i, sym.get_st_shndx(), &is_ordinary);
+	  gold_assert(is_ordinary);
+
+	  Mapping_symbol_position msp(input_shndx, input_value);
+	  // Insert mapping_symbol_info into map whose ordering is defined by
+	  // (shndx, offset_within_section).
+	  this->mapping_symbol_info_[msp] = sym_name[1];
+	}
+   }
+}
+
+
 // Relocate sections.
 
 template<int size, bool big_endian>
 void
 AArch64_relobj<size, big_endian>::do_relocate_sections(
@@ -1103,10 +1491,74 @@ AArch64_relobj<size, big_endian>::section_needs_reloc_stub_scanning(
   return this->text_section_is_scannable(text_shdr, text_shndx,
 					 out_sections[text_shndx], symtab);
 }
 
 
+// Scan section SHNDX for erratum 843419.
+
+template<int size, bool big_endian>
+void
+AArch64_relobj<size, big_endian>::scan_erratum_843419(
+    unsigned int shndx, const elfcpp::Shdr<size, big_endian>& shdr,
+    Output_section* os, const Symbol_table* symtab,
+    The_target_aarch64* target)
+{
+  if (shdr.get_sh_size() == 0
+      || (shdr.get_sh_flags() &
+	  (elfcpp::SHF_ALLOC | elfcpp::SHF_EXECINSTR)) == 0
+      || shdr.get_sh_type() != elfcpp::SHT_PROGBITS)
+    return;
+
+  if (!os || symtab->is_section_folded(this, shndx)) return;
+
+  AArch64_address output_offset = this->get_output_section_offset(shndx);
+  AArch64_address output_address;
+  if (output_offset != invalid_address)
+    output_address = os->address() + output_offset;
+  else
+    {
+      const Output_relaxed_input_section* poris =
+	os->find_relaxed_input_section(this, shndx);
+      if (!poris) return;
+      output_address = poris->address();
+    }
+
+  section_size_type input_view_size = 0;
+  const unsigned char* input_view =
+    this->section_contents(shndx, &input_view_size, false);
+
+  Mapping_symbol_position section_start(shndx, 0);
+  // Find the first mapping symbol record within section shndx.
+  typename Mapping_symbol_info::const_iterator p =
+    this->mapping_symbol_info_.lower_bound(section_start);
+  if (p == this->mapping_symbol_info_.end() || p->first.shndx_ != shndx)
+    gold_warning(_("cannot scan executable section %u of %s for Cortex-A53 "
+		   "erratum because it has no mapping symbols."),
+		 shndx, this->name().c_str());
+  while (p != this->mapping_symbol_info_.end() &&
+	 p->first.shndx_ == shndx)
+    {
+      typename Mapping_symbol_info::const_iterator prev = p;
+      ++p;
+      if (prev->second == 'x')
+	{
+	  section_size_type span_start =
+	    convert_to_section_size_type(prev->first.offset_);
+	  section_size_type span_end;
+	  if (p != this->mapping_symbol_info_.end()
+	      && p->first.shndx_ == shndx)
+	    span_end = convert_to_section_size_type(p->first.offset_);
+	  else
+	    span_end = convert_to_section_size_type(shdr.get_sh_size());
+	  target->scan_erratum_843419_span(
+	      this, shndx, span_start, span_end,
+	      const_cast<unsigned char*>(input_view), output_address);
+	}
+    }
+}
+
+
 // Scan relocations for stub generation.
 
 template<int size, bool big_endian>
 void
 AArch64_relobj<size, big_endian>::scan_sections_for_stubs(
@@ -1136,10 +1588,12 @@ AArch64_relobj<size, big_endian>::scan_sections_for_stubs(
   // Do relocation stubs scanning.
   const unsigned char* p = pshdrs + shdr_size;
   for (unsigned int i = 1; i < shnum; ++i, p += shdr_size)
     {
       const elfcpp::Shdr<size, big_endian> shdr(p);
+      if (parameters->options().fix_cortex_a53())
+	scan_erratum_843419(i, shdr, out_sections[i], symtab, target);
       if (this->section_needs_reloc_stub_scanning(shdr, out_sections, symtab,
 						  pshdrs))
 	{
 	  unsigned int index = this->adjust_shndx(shdr.get_sh_info());
 	  AArch64_address output_offset =
@@ -1642,10 +2096,11 @@ class Target_aarch64 : public Sized_target<size, big_endian>
   typedef AArch64_input_section<size, big_endian> The_aarch64_input_section;
   typedef AArch64_output_section<size, big_endian> The_aarch64_output_section;
   typedef Unordered_map<Section_id,
 			AArch64_input_section<size, big_endian>*,
 			Section_id_hash> AArch64_input_section_map;
+  typedef AArch64_insn_utilities<big_endian> Insn_utilities;
   const static int TCB_SIZE = size / 8 * 2;
 
   Target_aarch64(const Target::Target_info* info = &aarch64_info)
     : Sized_target<size, big_endian>(info),
       got_(NULL), plt_(NULL), got_plt_(NULL), got_irelative_(NULL),
@@ -1833,10 +2288,21 @@ class Target_aarch64 : public Sized_target<size, big_endian>
 		&& parameters->target().get_size() == size
 		&& parameters->target().is_big_endian() == big_endian);
     return static_cast<This*>(parameters->sized_target<size, big_endian>());
   }
 
+
+  // Scan erratum for a part of a section.
+  void
+  scan_erratum_843419_span(
+    AArch64_relobj<size, big_endian>*,
+    unsigned int shndx,
+    const section_size_type,
+    const section_size_type,
+    unsigned char*,
+    Address);
+
  protected:
   void
   do_select_as_default_target()
   {
     gold_assert(aarch64_reloc_property_table == NULL);
@@ -2126,10 +2592,16 @@ class Target_aarch64 : public Sized_target<size, big_endian>
   {
     gold_assert(this->plt_ != NULL);
     return this->plt_;
   }
 
+  // Return whether this is a 3-insn erratum sequence.
+  bool is_erratum_843419_sequence(
+      typename elfcpp::Swap<32,big_endian>::Valtype insn1,
+      typename elfcpp::Swap<32,big_endian>::Valtype insn2,
+      typename elfcpp::Swap<32,big_endian>::Valtype insn3);
+
   // Get the dynamic reloc section, creating it if necessary.
   Reloc_section*
   rela_dyn_section(Layout*);
 
   // Get the section to use for TLSDESC relocations.
@@ -2557,11 +3029,11 @@ Target_aarch64<size, big_endian>::scan_reloc_for_stub(
     }
 
   typename The_reloc_stub::Stub_type stub_type = The_reloc_stub::
       stub_type_for_reloc(r_type, address, destination);
   if (stub_type == The_reloc_stub::ST_NONE)
-    return ;
+    return;
 
   The_stub_table* stub_table = aarch64_relobj->stub_table(relinfo->data_shndx);
   gold_assert(stub_table != NULL);
 
   The_reloc_stub_key key(stub_type, gsym, aarch64_relobj, r_sym, addend);
@@ -6717,10 +7189,116 @@ Target_aarch64<size, big_endian>::relocate_relocs(
     reloc_view,
     reloc_view_size);
 }
 
 
+// Return whether this is a 3-insn erratum sequence.
+
+template<int size, bool big_endian>
+bool
+Target_aarch64<size, big_endian>::is_erratum_843419_sequence(
+    typename elfcpp::Swap<32,big_endian>::Valtype insn1,
+    typename elfcpp::Swap<32,big_endian>::Valtype insn2,
+    typename elfcpp::Swap<32,big_endian>::Valtype insn3)
+{
+  unsigned rt1, rt2;
+  bool load, pair;
+
+  // The 2nd insn is a single register load or store; or register pair
+  // store.
+  if (Insn_utilities::aarch64_mem_op_p(insn2, &rt1, &rt2, &pair, &load)
+      && (!pair || (pair && !load)))
+    {
+      // The 3rd insn is a load or store instruction from the "Load/store
+      // register (unsigned immediate)" encoding class, using Rn as the
+      // base address register.
+      if (Insn_utilities::aarch64_ldst_uimm(insn3)
+	  && (Insn_utilities::aarch64_rn(insn3)
+	      == Insn_utilities::aarch64_rd(insn1)))
+	return true;
+    }
+  return false;
+}
+
+
+// Scan erratum for section SHNDX range
+// [output_address + span_start, output_address + span_end).
+
+template<int size, bool big_endian>
+void
+Target_aarch64<size, big_endian>::scan_erratum_843419_span(
+    AArch64_relobj<size, big_endian>*  relobj,
+    unsigned int shndx,
+    const section_size_type span_start,
+    const section_size_type span_end,
+    unsigned char* input_view,
+    Address output_address)
+{
+  typedef typename Insn_utilities::Insntype Insntype;
+
+  if (((output_address + span_start) & 0x03) != 0)
+    return;
+
+  section_size_type offset = 0;
+  section_size_type span_length = span_end - span_start;
+  // The first instruction must be at page offset 0xFF8 or 0xFFC.
+  unsigned int page_offset = (output_address + span_start) & 0xFFF;
+  // Make sure starting position, that is "output_address+span_start+offset",
+  // starts at page position 0xFF8 or 0xFFC.
+  if (page_offset < 0xFF8)
+    offset = 0xFF8 - page_offset;
+  while (offset + 3 * Insn_utilities::BYTES_PER_INSN <= span_length)
+    {
+      Insntype* ip = reinterpret_cast<Insntype*>(input_view +
+						 span_start + offset);
+      Insntype insn1 = ip[0];
+      if (Insn_utilities::is_adrp(insn1))
+	{
+	  Insntype insn2 = ip[1];
+	  Insntype insn3 = ip[2];
+	  bool do_report = false;
+	  if (is_erratum_843419_sequence(insn1, insn2, insn3))
+	    do_report = true;
+	  else if (offset + 4 * Insn_utilities::BYTES_PER_INSN <= span_length)
+	    {
+	      // Optionally we can have an insn between ins2 and ins3
+	      Insntype insn_opt = ip[2];
+	      // And insn_opt must not be a branch.
+	      if (!Insn_utilities::aarch64_b(insn_opt)
+		  && !Insn_utilities::aarch64_bl(insn_opt)
+		  && !Insn_utilities::aarch64_blr(insn_opt)
+		  && !Insn_utilities::aarch64_br(insn_opt))
+		{
+		  // And insn_opt must not write to dest reg in insn1. However
+		  // we do a conservative scan, which means we may fix/report
+		  // more than necessary, but it doesn't hurt.
+
+		  Insntype insn4 = ip[3];
+		  if (is_erratum_843419_sequence(insn1, insn2, insn4))
+		    do_report = true;
+		}
+	    }
+	  if (do_report)
+	    {
+	      gold_error(_("Erratum 843419 found at \"%s\", section %d, "
+			   "offset 0x%08x."),
+			 relobj->name().c_str(), shndx,
+			 (unsigned int)(span_start + offset));
+	    }
+	}
+
+      // Advance to next candidate instruction. We only consider instruction
+      // sequences starting at a page offset of 0xff8 or 0xffc.
+      page_offset = (output_address + span_start + offset) & 0xFFF;
+      if (page_offset == 0xFF8)
+	offset += 4;
+      else  // (page_offset == 0xffc), we move to next page's 0xff8.
+	offset += 0xFFC;
+    }
+}
+
+
 // The selector for aarch64 object files.
 
 template<int size, bool big_endian>
 class Target_selector_aarch64 : public Target_selector
 {
diff --git a/gold/options.h b/gold/options.h
index 8c9c934..c1a5743 100644
--- a/gold/options.h
+++ b/gold/options.h
@@ -800,10 +800,14 @@ class General_options
 
   DEFINE_bool(fix_cortex_a8, options::TWO_DASHES, '\0', false,
 	      N_("(ARM only) Fix binaries for Cortex-A8 erratum."),
 	      N_("(ARM only) Do not fix binaries for Cortex-A8 erratum."));
 
+  DEFINE_bool(fix_cortex_a53, options::TWO_DASHES, '\0', false,
+	      N_("(AArch64 only) Scan and fix binaries for Cortex-A53 errata."),
+	      N_("(AArch64 only) Do not scan for Cortex-A53 errata."));
+
   DEFINE_bool(fix_arm1176, options::TWO_DASHES, '\0', true,
 	      N_("(ARM only) Fix binaries for ARM1176 erratum."),
 	      N_("(ARM only) Do not fix binaries for ARM1176 erratum."));
 
   DEFINE_bool(merge_exidx_entries, options::TWO_DASHES, '\0', true,
diff --git a/gold/aarch64.cc b/gold/aarch64.cc
index 2313496..f0b7ca8 100644
--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -1217,11 +1217,11 @@ class AArch64_relobj : public Sized_relobj_file<size, big_endian>
       shndx_(shndx), offset_(offset)
     {}
 
     // "<" comparator used in ordered_map container.
     bool
-    operator < (const Mapping_symbol_position& p) const
+    operator<(const Mapping_symbol_position& p) const
     {
       return (this->shndx_ < p.shndx_
 	      || (this->shndx_ == p.shndx_ && this->offset_ < p.offset_));
     }
 
@@ -1281,11 +1281,11 @@ AArch64_relobj<size, big_endian>::do_count_local_symbols(
   Sized_relobj_file<size, big_endian>::do_count_local_symbols(pool, dynpool);
 
   // Only erratum-fixing work needs mapping symbols, so skip this time consuming
   // processing if not fixing erratum.
   if (!parameters->options().fix_cortex_a53())
-    return ;
+    return;
 
   const unsigned int loccount = this->local_symbol_count();
   if (loccount == 0)
     return;
 
@@ -1535,27 +1535,26 @@ AArch64_relobj<size, big_endian>::scan_erratum_843419(
 		   "erratum because it has no mapping symbols."),
 		 shndx, this->name().c_str());
   while (p != this->mapping_symbol_info_.end() &&
 	 p->first.shndx_ == shndx)
     {
-      typename Mapping_symbol_info::const_iterator next =
-	this->mapping_symbol_info_.upper_bound(p->first);
-      if (p->second == 'x')
+      typename Mapping_symbol_info::const_iterator prev = p;
+      ++p;
+      if (prev->second == 'x')
 	{
 	  section_size_type span_start =
-	    convert_to_section_size_type(p->first.offset_);
+	    convert_to_section_size_type(prev->first.offset_);
 	  section_size_type span_end;
-	  if (next != this->mapping_symbol_info_.end()
-	      && next->first.shndx_ == shndx)
-	    span_end = convert_to_section_size_type(next->first.offset_);
+	  if (p != this->mapping_symbol_info_.end()
+	      && p->first.shndx_ == shndx)
+	    span_end = convert_to_section_size_type(p->first.offset_);
 	  else
 	    span_end = convert_to_section_size_type(shdr.get_sh_size());
 	  target->scan_erratum_843419_span(
 	      this, shndx, span_start, span_end,
 	      const_cast<unsigned char*>(input_view), output_address);
 	}
-      p = next;
     }
 }
 
 
 // Scan relocations for stub generation.
@@ -3030,11 +3029,11 @@ Target_aarch64<size, big_endian>::scan_reloc_for_stub(
     }
 
   typename The_reloc_stub::Stub_type stub_type = The_reloc_stub::
       stub_type_for_reloc(r_type, address, destination);
   if (stub_type == The_reloc_stub::ST_NONE)
-    return ;
+    return;
 
   The_stub_table* stub_table = aarch64_relobj->stub_table(relinfo->data_shndx);
   gold_assert(stub_table != NULL);
 
   The_reloc_stub_key key(stub_type, gsym, aarch64_relobj, r_sym, addend);
@@ -7210,13 +7209,13 @@ Target_aarch64<size, big_endian>::is_erratum_843419_sequence(
       && (!pair || (pair && !load)))
     {
       // The 3rd insn is a load or store instruction from the "Load/store
       // register (unsigned immediate)" encoding class, using Rn as the
       // base address register.
-      if (Insn_utilities::aarch64_ldst_uimm(insn3) &&
-	  Insn_utilities::aarch64_rn(insn3)
-	  == Insn_utilities::aarch64_rd(insn1))
+      if (Insn_utilities::aarch64_ldst_uimm(insn3)
+	  && (Insn_utilities::aarch64_rn(insn3)
+	      == Insn_utilities::aarch64_rd(insn1)))
 	return true;
     }
   return false;
 }
 
@@ -7234,63 +7233,69 @@ Target_aarch64<size, big_endian>::scan_erratum_843419_span(
     unsigned char* input_view,
     Address output_address)
 {
   typedef typename Insn_utilities::Insntype Insntype;
 
-  // Erratum sequence is at least 3 insns.
-  if (span_end - span_start < 3 * Insn_utilities::BYTES_PER_INSN)
-    return ;
+  if (((output_address + span_start) & 0x03) != 0)
+    return;
 
-  Insntype* ip = reinterpret_cast<Insntype*>(input_view + span_start);
-  unsigned int insn_index = 0;
-  for (section_size_type i = span_start; i < span_end;
-       i += Insn_utilities::BYTES_PER_INSN, ++insn_index)
+  section_size_type offset = 0;
+  section_size_type span_length = span_end - span_start;
+  // The first instruction must be at page offset 0xFF8 or 0xFFC.
+  unsigned int page_offset = (output_address + span_start) & 0xFFF;
+  // Make sure starting position, that is "output_address+span_start+offset",
+  // starts at page position 0xFF8 or 0xFFC.
+  if (page_offset < 0xFF8)
+    offset = 0xFF8 - page_offset;
+  while (offset + 3 * Insn_utilities::BYTES_PER_INSN <= span_length)
     {
-      if (span_end - i < 3 * Insn_utilities::BYTES_PER_INSN)
-	break;
-      Address address = output_address + i;
-      // The first instruction must be ending at 0xFF8 or 0xFFC and it must
-      // be an ADRP.
-      Address a3 = (address & 0xFFF);
-      if (a3 != 0xFF8 && a3 != 0xFFC)
-	continue;
-      Insntype insn1 = ip[insn_index];
-      if (!Insn_utilities::is_adrp(insn1))
-	continue;
-
-      Insntype insn2 = ip[insn_index + 1];
-      Insntype insn3 = ip[insn_index + 2];
-      bool do_report = false;
-      if (is_erratum_843419_sequence(insn1, insn2, insn3))
-	do_report = true;
-      else if (span_end - i > 16)
+      Insntype* ip = reinterpret_cast<Insntype*>(input_view +
+						 span_start + offset);
+      Insntype insn1 = ip[0];
+      if (Insn_utilities::is_adrp(insn1))
 	{
-	  // Optionally we can have an insn between ins2 and ins3
-	  Insntype insn_opt = ip[insn_index + 2];
-	  // And insn_opt must not be a branch.
-	  if (Insn_utilities::aarch64_b(insn_opt)
-	      || Insn_utilities::aarch64_bl(insn_opt)
-	      || Insn_utilities::aarch64_blr(insn_opt)
-	      || Insn_utilities::aarch64_br(insn_opt))
-	    continue;
-
-	  // And insn_opt must not write to dest reg in insn1. However we do a
-	  // conservative scan, which means we may fix/report more than
-	  // necessary, but it doesn't hurt.
-
-	  Insntype insn4 = ip[insn_index + 3];
-	  if (is_erratum_843419_sequence(insn1, insn2, insn4))
+	  Insntype insn2 = ip[1];
+	  Insntype insn3 = ip[2];
+	  bool do_report = false;
+	  if (is_erratum_843419_sequence(insn1, insn2, insn3))
 	    do_report = true;
+	  else if (offset + 4 * Insn_utilities::BYTES_PER_INSN <= span_length)
+	    {
+	      // Optionally we can have an insn between ins2 and ins3
+	      Insntype insn_opt = ip[2];
+	      // And insn_opt must not be a branch.
+	      if (!Insn_utilities::aarch64_b(insn_opt)
+		  && !Insn_utilities::aarch64_bl(insn_opt)
+		  && !Insn_utilities::aarch64_blr(insn_opt)
+		  && !Insn_utilities::aarch64_br(insn_opt))
+		{
+		  // And insn_opt must not write to dest reg in insn1. However
+		  // we do a conservative scan, which means we may fix/report
+		  // more than necessary, but it doesn't hurt.
+
+		  Insntype insn4 = ip[3];
+		  if (is_erratum_843419_sequence(insn1, insn2, insn4))
+		    do_report = true;
+		}
+	    }
+	  if (do_report)
+	    {
+	      gold_error(_("Erratum 843419 found at \"%s\", section %d, "
+			   "offset 0x%08x."),
+			 relobj->name().c_str(), shndx,
+			 (unsigned int)(span_start + offset));
+	    }
 	}
-      if (do_report)
-	{
-	  gold_error(_("Erratum 943419 found at \"%s\", section %d, "
-		       "offset 0x%08x."),
-		     relobj->name().c_str(), shndx, (unsigned int)i);
-	}
+
+      // Advance to next candidate instruction. We only consider instruction
+      // sequences starting at a page offset of 0xff8 or 0xffc.
+      page_offset = (output_address + span_start + offset) & 0xFFF;
+      if (page_offset == 0xFF8)
+	offset += 4;
+      else  // (page_offset == 0xffc), we move to next page's 0xff8.
+	offset += 0xFFC;
     }
-  return ;
 }
 
 
 // The selector for aarch64 object files.
 
diff --git a/gold/options.h b/gold/options.h
index 6bf989c..c1a5743 100644
--- a/gold/options.h
+++ b/gold/options.h
@@ -801,12 +801,12 @@ class General_options
   DEFINE_bool(fix_cortex_a8, options::TWO_DASHES, '\0', false,
 	      N_("(ARM only) Fix binaries for Cortex-A8 erratum."),
 	      N_("(ARM only) Do not fix binaries for Cortex-A8 erratum."));
 
   DEFINE_bool(fix_cortex_a53, options::TWO_DASHES, '\0', false,
-	      N_("(AArch64 only) Scan and fix binaries for Cortex-A53 erratums."),
-	      N_("(AArch64 only) Do not scan for Cortex-A53 erratums."));
+	      N_("(AArch64 only) Scan and fix binaries for Cortex-A53 errata."),
+	      N_("(AArch64 only) Do not scan for Cortex-A53 errata."));
 
   DEFINE_bool(fix_arm1176, options::TWO_DASHES, '\0', true,
 	      N_("(ARM only) Fix binaries for ARM1176 erratum."),
 	      N_("(ARM only) Do not fix binaries for ARM1176 erratum."));
 
