This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.
Hi Cary, thanks. Attached is the updated patch. The only diff from patch2 is the "hash_value" function, which I directly pasted below for ease of review.

--- a/gold/aarch64.cc
+++ b/gold/aarch64.cc
@@ -496,26 +496,20 @@ class Reloc_stub
   // Return a hash value.
   size_t
   hash_value() const
   {
-    // This will occupy the lowest 20 bits in the final hash value.
-    size_t name_hash_value = 0XFFFFFF & gold::string_hash<char>(
+    size_t name_hash_value = gold::string_hash<char>(
       (this->r_sym_ != Reloc_stub::invalid_index)
       ? this->u_.relobj->name().c_str()
       : this->u_.symbol->name());
-    // We only have 4 stub types, so we allocate 2 bits for it in the final
-    // hash value, and they will occupy bit 21 - 22.
+    // We only have 4 stub types.
     size_t stub_type_hash_value = 0x03 & this->stub_type_;
-    // 5 bits for r_sym_, even if it's a invalid_index.
-    size_t r_sym_hash_value = this->r_sym_ & 0x1F;
-    // 5 bits for addend_.
-    size_t addend_hash_value = this->addend_ & 0x1F;
-    return name_hash_value
-      | (stub_type_hash_value << 20)
-      | (r_sym_hash_value << 22)
-      | (addend_hash_value << 27);
+    return (name_hash_value
+            ^ stub_type_hash_value
+            ^ ((this->r_sym_ & 0x3fff) << 2)
+            ^ ((this->addend_ & 0xffff) << 16));
   }

   // Functors for STL associative containers.
   struct hash
   {

On Wed, Oct 15, 2014 at 9:59 AM, Cary Coutant <ccoutant@google.com> wrote:
> +    return name_hash_value
> +      | (stub_type_hash_value << 20)
> +      | (r_sym_hash_value << 22)
> +      | (addend_hash_value << 27);
>
> Please add parentheses around the whole expression, then line up the
> "|" operators just inside the open paren (standard Gnu style for
> continued expressions).
>
> My concern with the previous hash function wasn't about XOR'ing the
> additional bits into the string hash, but with the fact that, e.g.,
> r_sym, stub_type, and addend were all effectively XOR'ed together
> without any multiplicative factor to keep them from canceling one
> another out.
> Your new function avoids that, but drops all but the
> lower 5 bits of addend and r_sym. I'm not sure what distribution
> you're expecting, so it's not immediately clear to me how good this
> hash is. If you go back to XOR, there's no need to mask or shift away
> any of the bits of the string hash -- just do something to keep the
> three other fields from canceling. One way is to mask and shift them
> into non-overlapping positions, as you've done here, but you can keep
> more of the bits; something like this:
>
>   size_t name_hash_value = gold::string_hash<char>(...);
>   return (name_hash_value
>           ^ stub_type_hash_value
>           ^ ((r_sym & 0x3fff) << 2)
>           ^ ((addend & 0xffff) << 16));

I chose to use this straightforward and faster implementation.

> A more robust approach would be to use a hash combine function.
> Unfortunately, I don't think STL provides one, but Boost has
> hash_combine(), and libiberty has an equivalent, iterative_hash()
> (#include "hashtab.h"). You could use iterative_hash something like
> this:
>
>   hashval_t h = static_cast<hashval_t>(gold::string_hash<char>(...));
>   h = iterative_hash_object(this->stub_type, h);
>   h = iterative_hash_object(r_sym, h);
>   h = iterative_hash_object(addend, h);
>   return static_cast<size_t>(h);
>
> It's only a 32-bit hash function, but that's probably sufficient. If
> you don't want to throw away the full string_hash value, you could do
> this instead:
>
>   hashval_t h = 0;
>   h = iterative_hash_object(this->stub_type, h);
>   h = iterative_hash_object(r_sym, h);
>   h = iterative_hash_object(addend, h);
>   return (gold::string_hash<char>(...)
>           ^ static_cast<size_t>(h));
>
> Using iterative_hash might be noticeably slower than the simpler XOR
> strategy, so I'm OK with either.
>
> -cary
>
> On Mon, Oct 13, 2014 at 1:31 PM, Hán Shēn (Han Shen) <shenhan@google.com> wrote:
>> Hi Cary, thanks. All done in below.
>>
>> Attached modified patch - relaxation.patch2. Also attached the diff
>> from patch2 to patch1 for ease of review.
>>
>> Thanks,
>> Han
>>
>> On Fri, Oct 10, 2014 at 4:28 PM, Cary Coutant <ccoutant@google.com> wrote:
>>>
>>> > Here we have the patch for gold aarch64 backend to support relaxation.
>>> >
>>> > In short relaxation is the linker's generation of stubs that fixes the
>>> > out-of-range jumps/branches in the original object file.
>>> >
>>> > With this implementation, we are able to link a 456MB aarch64 application
>>> > (correctness of the result file, though, hasn't been verified.)
>>>
>>> Thanks. Here are my comments...
>>>
>>> -cary
>>>
>>> +  struct
>>> +  {
>>> +    // For a global symbol, the symbol itself.
>>> +    Symbol* symbol;
>>> +  } global;
>>> +  struct
>>> +  {
>>> +    // For a local symbol, the object defining object.
>>>
>>> I know this is actually unchanged code from before, but this
>>> comment looks wrong. Should probably be "the defining object"
>>> or "the object defining the symbol".
>>
>> Done
>>>
>>> +  // Do not change the value of the enums, they are used to index into
>>> +  // stub_insns array.
>>>
>>> In that case, you should probably assign values explicitly to each
>>> enum constant. Enum constants like this should generally be all
>>> upper case (like macros and static constants).
>>
>> Done - Changed to all upper case and explicitly assigned values to them.
>>>
>>> +  // Determine the stub type for a certain relocation or st_none, if no stub is
>>> +  // needed.
>>> +  static Stub_type
>>> +  stub_type_for_reloc(unsigned int r_type, AArch64_address address,
>>> +                      AArch64_address target);
>>> +
>>> + public:
>>>
>>> Unnecessary: everything above this was also public.
>>
>> Done
>>>
>>> +  // The key class used to index the stub instance in the stub
>>> +  // table's stub map.
>>> +  class Key
>>> +  {
>>> +  public:
>>>
>>> "public" should be indented one space.
>>
>> Done
>>>
>>> +  // Return a hash value.
>>> +  size_t
>>> +  hash_value() const
>>> +  {
>>> +    return (this->stub_type_
>>> +            ^ this->r_sym_
>>> +            ^ gold::string_hash<char>(
>>> +                (this->r_sym_ != Reloc_stub::invalid_index)
>>> +                ? this->u_.relobj->name().c_str()
>>> +                : this->u_.symbol->name())
>>> +            ^ this->addend_);
>>>
>>> I know this is just copied from arm.cc, but it looks to me like
>>> this hash function might be a bit weak. In particular,
>>> stub_type_, r_sym_, and addend_ might easily cancel one another
>>> out, causing some preventable collisions. r_sym_ is also
>>> unnecessary (but harmless, I guess), when it's equal to
>>> invalid_index. I'd suggest at least shifting stub_type_, r_sym_,
>>> and addend_ by different amounts, or applying a different
>>> multiplier to each to distribute the bits around more, and
>>> applying r_sym_ only when it's a local symbol index.
>>
>> Done by placing the name hash at bits 0-19, stub_type at 20-21, r_sym at
>> 22-26 (even if it's an invalid_index) and addend at 27-31.
>>>
>>> +  struct equal_to
>>> +  {
>>> +    bool
>>> +    operator()(const Key& k1, const Key& k2) const
>>> +    { return k1.eq(k2); }
>>> +  };
>>> +
>>> + private:
>>>
>>> "private" should be indented one space.
>>
>> Done.
>>>
>>> +  // Initialize look-up tables.
>>> +  Stub_table_list empty_stub_table_list(this->shnum(), NULL);
>>> +  this->stub_tables_.swap(empty_stub_table_list);
>>>
>>> It's simpler to just use resize(this->shnum()). The "swap" idiom
>>> is useful when you need to release memory, but not necessary for
>>> initialization.
>>
>> Done
>>>
>>> +  // Call parent to relocate sections.
>>> +  Sized_relobj_file<size, big_endian>::do_relocate_sections(symtab, layout,
>>> +                                                            pshdrs, of, pviews);
>>>
>>> Indentation is off by one.
>>
>> Done.
>>>
>>> +  unsigned int reloc_size;
>>> +  if (sh_type == elfcpp::SHT_RELA)
>>> +    reloc_size = elfcpp::Elf_sizes<size>::rela_size;
>>>
>>> For REL sections, reloc_size will be left uninitialized here.
>>> If you don't expect REL sections, you should assert that.
>>
>> Done.
>>>
>>> +  const unsigned int shdr_size = elfcpp::Elf_sizes<size>::shdr_size;
>>> +  const elfcpp::Shdr<size, big_endian> text_shdr(pshdrs + index * shdr_size);
>>> +  return this->section_is_scannable(text_shdr, index,
>>> +                                    out_sections[index], symtab);
>>>
>>> The naming doesn't really make it clear that
>>> section_needs_reloc_stub_scanning is looking at the REL/RELA
>>> section, while section_is_scannable is looking at the PROGBITS
>>> section that is being relocated. You've used text_shdr for the
>>> section header, so I'd suggest using text_shndx instead of index,
>>> and text_section_is_scannable for the function name.
>>
>> Done.
>>>
>>> +  // Currently this only happens for a relaxed section.
>>> +  const Output_relaxed_input_section* poris =
>>> +      out_sections[index]->find_relaxed_input_section(this, index);
>>>
>>> This continuation line needs indentation (4 spaces).
>>
>> Done.
>>>
>>> +  // Get the section contents. This does work for the case in which
>>> +  // we modify the contents of an input section. We need to pass the
>>> +  // output view under such circumstances.
>>>
>>> Should the comment read "This does *not* work ..."?
>>>
>>> I don't see you handling the case where you modify the contents,
>>> so maybe that last sentence is not necessary?
>>
>> Done by removing the confusing last 2 sentences.
>>>
>>> +// Create a stub group for input sections from BEGIN to END. OWNER
>>> +// points to the input section to be the owner a new stub table.
>>>
>>> "OWNER points to the input section that will be the owner of the
>>> stub table."
>>
>> Done.
>>>
>>> +  // Currently we convert ordinary input sections into relaxed sections only
>>> +  // at this point but we may want to support creating relaxed input section
>>> +  // very early. So we check here to see if owner is already a relaxed
>>> +  // section.
>>> +  gold_assert(!owner->is_relaxed_input_section());
>>> +
>>> +  The_aarch64_input_section* input_section;
>>> +  if (owner->is_relaxed_input_section())
>>> +    {
>>> +      input_section = static_cast<The_aarch64_input_section*>(
>>> +          owner->relaxed_input_section());
>>> +    }
>>>
>>> Given the assert above, this code can't be reached. If you want to
>>> leave it in as a stub for the future, I'd suggest removing the above
>>> assert, and replacing the then clause with gold_unreachable().
>>
>> Done by replacing gold_assert with gold_unreachable inside the if clause;
>> also updated the comments.
>>>
>>> +  do
>>> +    {
>>> +      if (p->is_input_section() || p->is_relaxed_input_section())
>>> +        {
>>> +          // The stub table information for input sections live
>>> +          // in their objects.
>>> +          The_aarch64_relobj* aarch64_relobj =
>>> +              static_cast<The_aarch64_relobj*>(p->relobj());
>>> +          aarch64_relobj->set_stub_table(p->shndx(), stub_table);
>>> +        }
>>> +      prev_p = p++;
>>> +    }
>>> +  while (prev_p != end);
>>>
>>> Suggest renaming begin/end to maybe first/last? It took me a minute
>>> to understand that "end" doesn't refer to the same "end" that an
>>> iterator would test against, and that you intended this to be an
>>> inclusive range.
>>
>> Done renaming.
>>>
>>> Why not just "do { ... } while (p++ != last);"?
>>
>> Done.
>>>
>>> +// Group input sections for stub generation. GROUP_SIZE is roughly the limit
>>> +// of stub groups. We grow a stub group by adding input section until the
>>> +// size is just below GROUP_SIZE. The last input section will be converted
>>> +// into a stub table. If STUB_ALWAYS_AFTER_BRANCH is false, we also add
>>> +// input section after the stub table, effectively double the group size.
>>>
>>> Do you mean "the last input section will be converted into a stub
>>> table *owner*"?
>>>
>>> And "we also add input section*s* after the stub table, effectively
>>> *doubling* the group size."
>>
>> Done.
>>>
>>> +    case HAS_STUB_SECTION:
>>> +      // Adding this section makes the post stub-section group larger
>>> +      // than GROUP_SIZE.
>>> +      gold_assert(false);
>>>
>>> You can just use gold_unreachable() here.
>>>
>>> +  // ET_EXEC files are valid input for --just-symbols/-R,
>>> +  // and we treat them as relocatable objects.
>>>
>>> For --just-symbols, shouldn't you use the base implementation
>>> of do_make_elf_object?
>>
>> Done by invoking parent implementation.
>>>
>>> +// This function scans a relocation sections for stub generation.
>>>
>>> s/sections/section/
>>
>> Done.
>>>
>>> +  // An error should never aroused in the above step. If so, please
>>> +  // An error should never arouse, it is an "_NC" relocation.
>>>
>>> "arise".
>>>
>>> +  gold_error(_("Stub is too far away, try use a smaller value "
>>> +               "for '--stub-group-size'. For example, 0x2000000."));
>>>
>>> Either "try using" or "try to use" or just "try a smaller value".
>>
>> Done.
>>
>> --
>> Han Shen | Software Engineer | shenhan@google.com | +1-650-440-3330

-- 
Han Shen | Software Engineer | shenhan@google.com | +1-650-440-3330
Attachment:
relaxation.patch3
Description: Binary data