This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: [gold][aarch64]Patch for Relaxation


Hi Cary, thanks. Attached is the updated patch.
The only diff from patch2 is the "hash_value" function, which I
directly pasted below for ease of review.

--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -496,26 +496,20 @@ class Reloc_stub

     // Return a hash value.
     size_t
     hash_value() const
     {
-      // This will occupy the lowest 20 bits in the final hash value.
-      size_t name_hash_value = 0XFFFFFF & gold::string_hash<char>(
+      size_t name_hash_value = gold::string_hash<char>(
          (this->r_sym_ != Reloc_stub::invalid_index)
          ? this->u_.relobj->name().c_str()
          : this->u_.symbol->name());
-      // We only have 4 stub types, so we allocate 2 bits for it in the final
-      // hash value, and they will occupy bit 21 - 22.
+      // We only have 4 stub types.
       size_t stub_type_hash_value = 0x03 & this->stub_type_;
-      // 5 bits for r_sym_, even if it's a invalid_index.
-      size_t r_sym_hash_value = this->r_sym_ & 0x1F;
-      // 5 bits for addend_.
-      size_t addend_hash_value = this->addend_ & 0x1F;
-      return name_hash_value
-       | (stub_type_hash_value << 20)
-       | (r_sym_hash_value << 22)
-       | (addend_hash_value << 27);
+      return (name_hash_value
+             ^ stub_type_hash_value
+             ^ ((this->r_sym_ & 0x3fff) << 2)
+             ^ ((this->addend_ & 0xffff) << 16));
     }

     // Functors for STL associative containers.
     struct hash
     {

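As a standalone illustration, the combination in the hunk above boils down
to the sketch below (std::hash stands in for gold::string_hash, and the
field types are simplified):

    #include <cstddef>
    #include <cstdint>
    #include <functional>
    #include <string>

    // XOR the fields into mostly non-overlapping bit ranges of the
    // name hash so that they cannot trivially cancel one another.
    std::size_t
    stub_key_hash(const std::string& name, unsigned int stub_type,
                  unsigned int r_sym, int64_t addend)
    {
      std::size_t name_hash_value = std::hash<std::string>()(name);
      std::size_t stub_type_hash_value = 0x03 & stub_type; // 4 stub types
      return (name_hash_value
              ^ stub_type_hash_value
              ^ ((r_sym & 0x3fff) << 2)      // 14 bits of r_sym
              ^ ((addend & 0xffff) << 16));  // 16 bits of addend
    }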

On Wed, Oct 15, 2014 at 9:59 AM, Cary Coutant <ccoutant@google.com> wrote:
> +      return name_hash_value
> +       | (stub_type_hash_value << 20)
> +       | (r_sym_hash_value << 22)
> +       | (addend_hash_value << 27);
>
> Please add parentheses around the whole expression, then line up the
> "|" operators just inside the open paren (standard Gnu style for
> continued expressions).
>
> My concern with the previous hash function wasn't about XOR'ing the
> additional bits into the string hash, but about the fact that, e.g.,
> r_sym, stub_type, and addend were all effectively XOR'ed together
> without any multiplicative factor to keep them from canceling one
> another out. Your new function avoids that, but drops all but the
> lower 5 bits of addend and r_sym. I'm not sure what distribution
> you're expecting, so it's not immediately clear to me how good this
> hash is. If you go back to XOR, there's no need to mask or shift away
> any of the bits of the string hash -- just do something to keep the
> three other fields from canceling. One way is to mask and shift them
> into non-overlapping positions, as you've done here, but you can keep
> more of the bits; something like this:
>
>     size_t name_hash_value = gold::string_hash<char>(...);
>     return (name_hash_value
>             ^ stub_type_hash_value
>             ^ ((r_sym & 0x3fff) << 2)
>             ^ ((addend & 0xffff) << 16));

I chose to use this straightforward and faster implementation.
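The cancellation Cary describes is easy to reproduce with made-up field
values (a minimal sketch; the numbers below are hypothetical, not from any
real object):

    #include <cassert>
    #include <cstddef>

    int main()
    {
      // XOR with no shifts: two keys that differ in every field
      // collapse to the same hash, because XOR is symmetric.
      std::size_t a = 1u ^ 2u ^ 3u;  // stub_type=1, r_sym=2, addend=3
      std::size_t b = 3u ^ 2u ^ 1u;  // stub_type=3, r_sym=2, addend=1
      assert(a == b);                // collision: both are 0

      // Shifting the fields into separate bit ranges breaks the symmetry.
      std::size_t a2 = 1u ^ (2u << 2) ^ (3u << 16);
      std::size_t b2 = 3u ^ (2u << 2) ^ (1u << 16);
      assert(a2 != b2);
      return 0;
    }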
>
> A more robust approach would be to use a hash combine function.
> Unfortunately, I don't think STL provides one, but Boost has
> hash_combine(), and libiberty has an equivalent, iterative_hash()
> (#include "hashtab.h"). You could use iterative_hash something like
> this:
>
>     hashval_t h = static_cast<hashval_t>(gold::string_hash<char>(...));
>     h = iterative_hash_object(this->stub_type, h);
>     h = iterative_hash_object(r_sym, h);
>     h = iterative_hash_object(addend, h);
>     return static_cast<size_t>(h);
>
> It's only a 32-bit hash function, but that's probably sufficient. If
> you don't want to throw away the full string_hash value, you could do
> this instead:
>
>     hashval_t h = 0;
>     h = iterative_hash_object(this->stub_type, h);
>     h = iterative_hash_object(r_sym, h);
>     h = iterative_hash_object(addend, h);
>     return (gold::string_hash<char>(...)
>              ^ static_cast<size_t>(h));
>
> Using iterative_hash might be noticeably slower than the simpler XOR
> strategy, so I'm OK with either.
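Spelled out with stand-in parameters, the first variant above is roughly as
follows (the elided string_hash argument is the same relobj-or-symbol-name
conditional as in the patch; hashtab.h is libiberty's):

    #include <cstddef>
    #include "hashtab.h"  // libiberty: hashval_t, iterative_hash_object()

    std::size_t
    key_hash(std::size_t string_hash_value, int stub_type,
             unsigned int r_sym, long addend)
    {
      // Fold each field into the running hash; iterative_hash mixes
      // the bytes, so the fields cannot cancel one another.
      hashval_t h = static_cast<hashval_t>(string_hash_value);
      h = iterative_hash_object(stub_type, h);
      h = iterative_hash_object(r_sym, h);
      h = iterative_hash_object(addend, h);
      return static_cast<std::size_t>(h);
    }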
>
> -cary
>
>
> On Mon, Oct 13, 2014 at 1:31 PM, Han Shen <shenhan@google.com> wrote:
>> Hi Cary, thanks. All done; see below.
>>
>> Attached modified patch - relaxation.patch2. Also attached the diff
>> from patch2 to patch1 for ease of review.
>>
>> Thanks,
>> Han
>>
>> On Fri, Oct 10, 2014 at 4:28 PM, Cary Coutant <ccoutant@google.com> wrote:
>>>
>>> > Here is the patch for the gold aarch64 backend to support relaxation.
>>> >
>>> > In short, relaxation is the linker's generation of stubs that fix
>>> > out-of-range jumps/branches in the original object file.
>>> >
>>> > With this implementation, we are able to link a 456MB aarch64 application
>>> > (the correctness of the resulting file, though, hasn't been verified).
>>>
>>> Thanks. Here are my comments...
>>>
>>> -cary
>>>
>>>
>>> +      struct
>>> +      {
>>> +        // For a global symbol, the symbol itself.
>>> +        Symbol* symbol;
>>> +      } global;
>>> +      struct
>>> +      {
>>> +        // For a local symbol, the object defining object.
>>>
>>> I know this is actually unchanged code from before, but this
>>> comment looks wrong. Should probably be "the defining object"
>>> or "the object defining the symbol".
>>
>>
>> Done
>>>
>>>
>>> +  // Do not change the value of the enums, they are used to index into
>>> +  // stub_insns array.
>>>
>>> In that case, you should probably assign values explicitly to each
>>> enum constant. Enum constants like this should generally be all
>>> upper case (like macros and static constants).
>>
>>
>> Done - Changed to all upper case and explicitly assigned values to them.
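For illustration, the convention looks like this (the names and values
below are hypothetical, not the ones in the patch):

    // Explicit values make the array-index contract visible at the
    // declaration; upper case marks the constants as constants.
    enum Stub_type
    {
      ST_NONE = 0,
      ST_ADRP_BRANCH = 1,
      ST_LONG_BRANCH_ABS = 2,
      ST_LONG_BRANCH_PCREL = 3,
      ST_NUMBER = 4  // count of stub types; sizes the insns table
    };

    // Hypothetical table indexed by the enum values above.
    static const int stub_insn_count[ST_NUMBER] = { 0, 2, 4, 4 };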
>>>
>>>
>>> +  // Determine the stub type for a certain relocation or st_none, if no stub is
>>> +  // needed.
>>> +  static Stub_type
>>> +  stub_type_for_reloc(unsigned int r_type, AArch64_address address,
>>> +      AArch64_address target);
>>> +
>>> + public:
>>>
>>> Unnecessary: everything above this was also public.
>>
>>
>> Done
>>>
>>>
>>> +  // The key class used to index the stub instance in the stub
>>> table's stub map.
>>> +  class Key
>>> +  {
>>> +  public:
>>>
>>> "public" should be indented one space.
>>
>>
>> Done
>>>
>>>
>>> +    // Return a hash value.
>>> +    size_t
>>> +    hash_value() const
>>> +    {
>>> +      return (this->stub_type_
>>> +              ^ this->r_sym_
>>> +              ^ gold::string_hash<char>(
>>> +                    (this->r_sym_ != Reloc_stub::invalid_index)
>>> +                    ? this->u_.relobj->name().c_str()
>>> +                    : this->u_.symbol->name())
>>> +              ^ this->addend_);
>>>
>>> I know this is just copied from arm.cc, but it looks to me like
>>> this hash function might be a bit weak. In particular,
>>> stub_type_, r_sym_, and addend_ might easily cancel one another
>>> out, causing some preventable collisions. r_sym_ is also
>>> unnecessary (but harmless, I guess), when it's equal to
>>> invalid_index. I'd suggest at least shifting stub_type_, r_sym_,
>>> and addend_ by different amounts, or applying a different
>>> multiplier to each to distribute the bits around more, and
>>> applying r_sym_ only when it's a local symbol index.
>>
>>
>> Done by placing the name hash at bits 0-19, stub_type at bits 20-21, r_sym
>> at bits 22-26 (even if it's an invalid_index), and addend at bits 27-31.
>>>
>>>
>>> +    struct equal_to
>>> +    {
>>> +      bool
>>> +      operator()(const Key& k1, const Key& k2) const
>>> +      { return k1.eq(k2); }
>>> +    };
>>> +
>>> +  private:
>>>
>>> "private" should be indented one space.
>>
>>
>> Done.
>>>
>>>
>>> +    // Initialize look-up tables.
>>> +    Stub_table_list empty_stub_table_list(this->shnum(), NULL);
>>> +    this->stub_tables_.swap(empty_stub_table_list);
>>>
>>> It's simpler to just use resize(this->shnum()). The "swap" idiom
>>> is useful when you need to release memory, but not necessary for
>>> initialization.
>>
>> Done
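The difference, sketched with a plain std::vector (the patch's
Stub_table_list is assumed to be a vector-like container):

    #include <cstddef>
    #include <vector>

    void example(std::vector<void*>& stub_tables, std::size_t shnum)
    {
      // Initialization: resize() is enough; the new elements are
      // value-initialized, i.e. NULL for pointers.
      stub_tables.resize(shnum);

      // Releasing memory later is where the swap idiom earns its keep:
      // clear() may keep the capacity, swapping with a temporary frees it.
      std::vector<void*>().swap(stub_tables);
    }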
>>>
>>>
>>> +  // Call parent to relocate sections.
>>> +  Sized_relobj_file<size, big_endian>::do_relocate_sections(symtab, layout,
>>> +   pshdrs, of, pviews);
>>>
>>> Indentation is off by one.
>>
>>
>> Done.
>>>
>>>
>>> +  unsigned int reloc_size;
>>> +  if (sh_type == elfcpp::SHT_RELA)
>>> +    reloc_size = elfcpp::Elf_sizes<size>::rela_size;
>>>
>>> For REL sections, reloc_size will be left uninitialized here.
>>> If you don't expect REL sections, you should assert that.
>>
>>
>> Done.
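A self-contained analogue of the fix (the constants are the standard ELF
values; 24 is sizeof(Elf64_Rela)):

    #include <cassert>

    enum Sh_type { SHT_RELA = 4, SHT_REL = 9 };

    // Derive the record size only for the relocation section type the
    // backend supports, instead of leaving it uninitialized for REL.
    unsigned int
    reloc_record_size(Sh_type sh_type)
    {
      assert(sh_type == SHT_RELA);  // REL input is not expected here
      return 24;
    }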
>>>
>>>
>>> +  const unsigned int shdr_size = elfcpp::Elf_sizes<size>::shdr_size;
>>> +  const elfcpp::Shdr<size, big_endian> text_shdr(pshdrs + index * shdr_size);
>>> +  return this->section_is_scannable(text_shdr, index,
>>> +    out_sections[index], symtab);
>>>
>>> The naming doesn't really make it clear that
>>> section_needs_reloc_stub_scanning is looking at the REL/RELA
>>> section, while section_is_scannable is looking at the PROGBITS
>>> section that is being relocated. You've used text_shdr for the
>>> section header, so I'd suggest using text_shndx instead of index,
>>> and text_section_is_scannable for the function name.
>>
>>
>> Done.
>>>
>>>
>>> +      // Currently this only happens for a relaxed section.
>>> +      const Output_relaxed_input_section* poris =
>>> +      out_sections[index]->find_relaxed_input_section(this, index);
>>>
>>> This continuation line needs indentation (4 spaces).
>>
>>
>> Done.
>>>
>>>
>>> +  // Get the section contents. This does work for the case in which
>>> +  // we modify the contents of an input section.  We need to pass the
>>> +  // output view under such circumstances.
>>>
>>> Should the comment read "This does *not* work ..."?
>>>
>>> I don't see you handling the case where you modify the contents,
>>> so maybe that last sentence is not necessary?
>>
>>
>> Done by removing the confusing last two sentences.
>>>
>>>
>>> +// Create a stub group for input sections from BEGIN to END.  OWNER
>>> +// points to the input section to be the owner a new stub table.
>>>
>>> "OWNER points to the input section that will be the owner of the
>>> stub table."
>>
>>
>> Done.
>>>
>>>
>>> +  // Currently we convert ordinary input sections into relaxed sections only
>>> +  // at this point but we may want to support creating relaxed input section
>>> +  // very early.  So we check here to see if owner is already a relaxed
>>> +  // section.
>>> +  gold_assert(!owner->is_relaxed_input_section());
>>> +
>>> +  The_aarch64_input_section* input_section;
>>> +  if (owner->is_relaxed_input_section())
>>> +    {
>>> +      input_section = static_cast<The_aarch64_input_section*>(
>>> +          owner->relaxed_input_section());
>>> +    }
>>>
>>> Given the assert above, this code can't be reached. If you want to
>>> leave it in as a stub for the future, I'd suggest removing the above
>>> assert, and replacing the then clause with gold_unreachable().
>>
>>
>> Done by replacing gold_assert with gold_unreachable inside the if clause;
>> also updated the comments.
>>>
>>>
>>> +  do
>>> +    {
>>> +      if (p->is_input_section() || p->is_relaxed_input_section())
>>> +        {
>>> +          // The stub table information for input sections live
>>> +          // in their objects.
>>> +          The_aarch64_relobj* aarch64_relobj =
>>> +              static_cast<The_aarch64_relobj*>(p->relobj());
>>> +          aarch64_relobj->set_stub_table(p->shndx(), stub_table);
>>> +        }
>>> +      prev_p = p++;
>>> +    }
>>> +  while (prev_p != end);
>>>
>>> Suggest renaming begin/end to maybe first/last? It took me a minute
>>> to understand that "end" doesn't refer to the same "end" that an
>>> iterator would test against, and that you intended this to be an
>>> inclusive range.
>>
>>
>> Done renaming.
>>>
>>>
>>> Why not just "do { ... } while (p++ != last);"?
>>
>> Done.
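The simplified inclusive-range walk, sketched over a plain array (the
iterator type and the per-element work are stand-ins):

    #include <cstdio>

    // Visit every element from first to last inclusive. last points at
    // the final element, not one past it, so the increment is folded
    // into the loop test.
    void visit_inclusive(int* first, int* last)
    {
      int* p = first;
      do
        std::printf("%d\n", *p);
      while (p++ != last);
    }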
>>>
>>>
>>> +// Group input sections for stub generation.  GROUP_SIZE is roughly the limit
>>> +// of stub groups.  We grow a stub group by adding input section until the
>>> +// size is just below GROUP_SIZE.  The last input section will be converted
>>> +// into a stub table.  If STUB_ALWAYS_AFTER_BRANCH is false, we also add
>>> +// input section after the stub table, effectively double the group size.
>>>
>>> Do you mean "the last input section will be converted into a stub
>>> table *owner*"?
>>>
>>> And "we also add input section*s* after the stub table, effectively
>>> *doubling* the group size."
>>
>>
>> Done.
>>>
>>>
>>> + case HAS_STUB_SECTION:
>>> +  // Adding this section makes the post stub-section group larger
>>> +  // than GROUP_SIZE.
>>> +  gold_assert(false);
>>>
>>> You can just use gold_unreachable() here.
>>>
>>> +  // ET_EXEC files are valid input for --just-symbols/-R,
>>> +  // and we treat them as relocatable objects.
>>>
>>> For --just-symbols, shouldn't you use the base implementation
>>> of do_make_elf_object?
>>
>>
>> Done by invoking the parent implementation.
>>>
>>>
>>> +// This function scans a relocation sections for stub generation.
>>>
>>> s/sections/section/
>>
>>
>> Done.
>>>
>>>
>>> + // An error should never aroused in the above step. If so, please
>>> + // An error should never arouse, it is an "_NC" relocation.
>>>
>>> "arise".
>>>
>>> +    gold_error(_("Stub is too far away, try use a smaller value "
>>> + "for '--stub-group-size'. For example, 0x2000000."));
>>>
>>> Either "try using" or "try to use" or just "try a smaller value".
>>
>>
>> Done.
>>
>>
>>
>> --
>> Han Shen |  Software Engineer |  shenhan@google.com |  +1-650-440-3330



-- 
Han Shen |  Software Engineer |  shenhan@google.com |  +1-650-440-3330

Attachment: relaxation.patch3
Description: Binary data

