This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][x86_64] Convert indirect call via GOT to direct when possible


On Fri, Jun 10, 2016 at 1:06 PM, Cary Coutant <ccoutant@gmail.com> wrote:
>>>> * x86_64.cc (can_convert_callq_to_direct): New function.
>>>> Target_x86_64<size>::Scan::global: Check if an indirect call via
>>>> GOT can be converted to direct.
>>>> Target_x86_64<size>::Relocate::relocate: Change any indirect call
>>>> via GOT that can be converted.
>>>> * testsuite/Makefile.am (x86_64_indirect_call_to_direct.sh): New test.
>>>> * testsuite/Makefile.in: Regenerate.
>>>> * testsuite/x86_64_indirect_call_to_direct1.s: New file.
>>>> * testsuite/x86_64_indirect_jump_to_direct1.s: New file.
>>>>
>>>
>>> Do you need to check R_X86_64_REX_GOTPCRELX for branch?
>>
>> Ok, patch changed to not check for this and refactored a bit.
>
> Thanks for the refactoring -- I was about to suggest that. There's
> still too much logic outside the can_convert_* functions that is
> duplicated in Scan::global() and in Relocate::relocate(), all because
> we don't want to fetch the section contents without doing a few
> preliminary checks. I'd like to move most of that logic into the
> can_convert_* functions, so let's start with a Lazy_view class, which
> will fetch the section contents only when needed:
>
> template<int size>
> class Lazy_view
> {
>  public:
>   Lazy_view(Sized_relobj_file<size, false>* object, unsigned int data_shndx)
>     : object_(object), data_shndx_(data_shndx), view_(NULL), view_size_(0)
>   { }
>
>   inline unsigned char
>   operator[](size_t offset)
>   {
>     if (this->view_ == NULL)
>       this->view_ = this->object_->section_contents(this->data_shndx_,
>                                                     &this->view_size_,
>                                                     true);
>     if (offset >= this->view_size_)
>       return 0;
>     return this->view_[offset];
>   }
>
>  private:
>   Sized_relobj_file<size, false>* object_;
>   unsigned int data_shndx_;
>   const unsigned char* view_;
>   section_size_type view_size_;
> };
>
> Now we can move (almost) all of that external logic into the
> can_convert_* routines, leaving us with this in Scan::global():
>
>         Lazy_view<size> view(object, data_shndx);
>         size_t r_offset = reloc.get_r_offset();
>         if (r_offset >= 2
>             && Target_x86_64<size>::can_convert_mov_to_lea(gsym, r_type,
>                                                            r_offset, &view))
>           break;
>         if (r_offset >= 2
>             && Target_x86_64<size>::can_convert_callq_to_direct(gsym, r_type,
>
> r_offset, &view))
>           break;
>
> ... and this in Relocate::relocate():
>
>         if ((gsym == NULL
>              && rela.get_r_offset() >= 2
>              && view[-2] == 0x8b
>              && !psymval->is_ifunc_symbol())
>             || (gsym != NULL
>                 && rela.get_r_offset() >= 2
>                 && Target_x86_64<size>::can_convert_mov_to_lea(gsym, r_type,
>                                                                0, &view)))
>           {
>             ...
>           }
>         else if (gsym != NULL
>                  && rela.get_r_offset() >= 2
>                  && Target_x86_64<size>::can_convert_callq_to_direct(gsym,
>                                                                      r_type,
>                                                                      0, &view))
>           {
>             if (view[-1] == 0x15)
>               {
>                 ...
>               }
>             else
>               {
>                 ...
>               }
>           }
>
> Folding all the external logic into the can_convert_* functions, and
> refactoring a bit gives this:
>
>   template<class View_type>
>   static inline bool
>   can_convert_mov_to_lea(const Symbol* gsym, unsigned int r_type,
>                          size_t r_offset, View_type* view)
>   {
>     gold_assert(gsym != NULL);
>     // We cannot do the conversion unless it's one of these relocations.
>     if (r_type != elfcpp::R_X86_64_GOTPCREL
>         && r_type != elfcpp::R_X86_64_GOTPCRELX
>         && r_type != elfcpp::R_X86_64_REX_GOTPCRELX)
>       return false;
>     // We cannot convert references to IFUNC symbols, or to symbols that
>     // are not local to the current module.
>     if (gsym->type() == elfcpp::STT_GNU_IFUNC
>         || gsym->is_undefined ()
>         || gsym->is_from_dynobj()
>         || gsym->is_preemptible())
>       return false;
>     if (parameters->options().shared()
>         && (gsym->visibility() == elfcpp::STV_DEFAULT
>             || gsym->visibility() == elfcpp::STV_PROTECTED)
>         && !parameters->options().Bsymbolic())
>       return false;
>     // We cannot convert references to the _DYNAMIC symbol.
>     if (strcmp(gsym->name(), "_DYNAMIC") == 0)
>       return false;
>     // Check for a MOV opcode.
>     return (*view)[r_offset - 2] == 0x8b;
>   }
>
>   template<class View_type>
>   static inline bool
>   can_convert_callq_to_direct(const Symbol* gsym, unsigned int r_type,
>                               size_t r_offset, View_type* view)
>   {
>     gold_assert(gsym != NULL);
>     // We cannot do the conversion unless it's a GOTPCRELX relocation.
>     if (r_type != elfcpp::R_X86_64_GOTPCRELX)
>       return false;
>     // We cannot convert references to IFUNC symbols, or to symbols that
>     // are not local to the current module.
>     if (gsym->type() == elfcpp::STT_GNU_IFUNC
>         || gsym->is_undefined ()
>         || gsym->is_from_dynobj()
>         || gsym->is_preemptible())
>       return false;
>     // Check for a CALLQ or JMPQ opcode.
>     return ((*view)[r_offset - 2] == 0xff
>             && ((*view)[r_offset - 1] == 0x15
>                 || (*view)[r_offset - 1] == 0x25));
>   }
>
> A couple of observations, though:
>
> 1. Sri, in your patch, you just test for sym type == STT_FUNC. Isn't
> it sufficient to test for sym type != STT_GNU_IFUNC (as in the
> convert-to-lea case)? I don't think it really matters -- if we see a
> jump to an STT_OBJECT or STT_NOTYPE symbol, why isn't the
> transformation just as valid?
>
> 2. HJ, given an R_X86_64_GOTPCRELX relocation, is it still necessary
> to check the opcode during Scan::global()? Doesn't the relocation
> guarantee that it's an appropriate instruction for the transformation?
> I think in both cases, we could skip fetching the section contents if
> we have this relocation.
>
> 3. This piece of can_convert_mov_to_lea has me a bit puzzled:
>
>     if (parameters->options().shared()
>         && (gsym->visibility() == elfcpp::STV_DEFAULT
>             || gsym->visibility() == elfcpp::STV_PROTECTED)
>         && !parameters->options().Bsymbolic())
>       return false;

This code seems to cover the one extra case about a protected symbol
in a shared object.  A protected symbol defined in a shared object is
not pre-emptible and is_preemptible() will return false.  However,
this code suggests the transformation is invalid in that case and  I
dont understand why.  If this code is being overly conservative, then
this check can be removed and the test cases deleted.  Nevertheless,
this code can be simplified to:

    if (parameters->options().shared()
        && gsym->visibility() == elfcpp::STV_PROTECTED)
      return false;

>
> We've already tested for is_preemptible, which should take care of all
> of these cases. If I remove this code, however, I get a couple of
> failures that I need to investigate. At the very least, I suspect this
> part of the logic can be simplified. I want to investigate this
> further before committing the whole thing.
>
> -cary


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]