This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][x86_64] Convert indirect call via GOT to direct when possible


>>> * x86_64.cc (can_convert_callq_to_direct): New function.
>>> Target_x86_64<size>::Scan::global: Check if an indirect call via
>>> GOT can be converted to direct.
>>> Target_x86_64<size>::Relocate::relocate: Change any indirect call
>>> via GOT that can be converted.
>>> * testsuite/Makefile.am (x86_64_indirect_call_to_direct.sh): New test.
>>> * testsuite/Makefile.in: Regenerate.
>>> * testsuite/x86_64_indirect_call_to_direct1.s: New file.
>>> * testsuite/x86_64_indirect_jump_to_direct1.s: New file.
>>>
>>
>> Do you need to check R_X86_64_REX_GOTPCRELX for branch?
>
> Ok, patch changed to not check for this and refactored a bit.

Thanks for the refactoring -- I was about to suggest that. There's
still too much logic outside the can_convert_* functions that is
duplicated in Scan::global() and in Relocate::relocate(), all because
we don't want to fetch the section contents without doing a few
preliminary checks. I'd like to move most of that logic into the
can_convert_* functions, so let's start with a Lazy_view class, which
will fetch the section contents only when needed:

template<int size>
class Lazy_view
{
 public:
  Lazy_view(Sized_relobj_file<size, false>* object, unsigned int data_shndx)
    : object_(object), data_shndx_(data_shndx), view_(NULL), view_size_(0)
  { }

  inline unsigned char
  operator[](size_t offset)
  {
    if (this->view_ == NULL)
      this->view_ = this->object_->section_contents(this->data_shndx_,
                                                    &this->view_size_,
                                                    true);
    if (offset >= this->view_size_)
      return 0;
    return this->view_[offset];
  }

 private:
  Sized_relobj_file<size, false>* object_;
  unsigned int data_shndx_;
  const unsigned char* view_;
  section_size_type view_size_;
};

Now we can move (almost) all of that external logic into the
can_convert_* routines, leaving us with this in Scan::global():

        Lazy_view<size> view(object, data_shndx);
        size_t r_offset = reloc.get_r_offset();
        if (r_offset >= 2
            && Target_x86_64<size>::can_convert_mov_to_lea(gsym, r_type,
                                                           r_offset, &view))
          break;
        if (r_offset >= 2
            && Target_x86_64<size>::can_convert_callq_to_direct(gsym, r_type,

r_offset, &view))
          break;

... and this in Relocate::relocate():

        if ((gsym == NULL
             && rela.get_r_offset() >= 2
             && view[-2] == 0x8b
             && !psymval->is_ifunc_symbol())
            || (gsym != NULL
                && rela.get_r_offset() >= 2
                && Target_x86_64<size>::can_convert_mov_to_lea(gsym, r_type,
                                                               0, &view)))
          {
            ...
          }
        else if (gsym != NULL
                 && rela.get_r_offset() >= 2
                 && Target_x86_64<size>::can_convert_callq_to_direct(gsym,
                                                                     r_type,
                                                                     0, &view))
          {
            if (view[-1] == 0x15)
              {
                ...
              }
            else
              {
                ...
              }
          }

Folding all the external logic into the can_convert_* functions, and
refactoring a bit gives this:

  template<class View_type>
  static inline bool
  can_convert_mov_to_lea(const Symbol* gsym, unsigned int r_type,
                         size_t r_offset, View_type* view)
  {
    gold_assert(gsym != NULL);
    // We cannot do the conversion unless it's one of these relocations.
    if (r_type != elfcpp::R_X86_64_GOTPCREL
        && r_type != elfcpp::R_X86_64_GOTPCRELX
        && r_type != elfcpp::R_X86_64_REX_GOTPCRELX)
      return false;
    // We cannot convert references to IFUNC symbols, or to symbols that
    // are not local to the current module.
    if (gsym->type() == elfcpp::STT_GNU_IFUNC
        || gsym->is_undefined ()
        || gsym->is_from_dynobj()
        || gsym->is_preemptible())
      return false;
    if (parameters->options().shared()
        && (gsym->visibility() == elfcpp::STV_DEFAULT
            || gsym->visibility() == elfcpp::STV_PROTECTED)
        && !parameters->options().Bsymbolic())
      return false;
    // We cannot convert references to the _DYNAMIC symbol.
    if (strcmp(gsym->name(), "_DYNAMIC") == 0)
      return false;
    // Check for a MOV opcode.
    return (*view)[r_offset - 2] == 0x8b;
  }

  template<class View_type>
  static inline bool
  can_convert_callq_to_direct(const Symbol* gsym, unsigned int r_type,
                              size_t r_offset, View_type* view)
  {
    gold_assert(gsym != NULL);
    // We cannot do the conversion unless it's a GOTPCRELX relocation.
    if (r_type != elfcpp::R_X86_64_GOTPCRELX)
      return false;
    // We cannot convert references to IFUNC symbols, or to symbols that
    // are not local to the current module.
    if (gsym->type() == elfcpp::STT_GNU_IFUNC
        || gsym->is_undefined ()
        || gsym->is_from_dynobj()
        || gsym->is_preemptible())
      return false;
    // Check for a CALLQ or JMPQ opcode.
    return ((*view)[r_offset - 2] == 0xff
            && ((*view)[r_offset - 1] == 0x15
                || (*view)[r_offset - 1] == 0x25));
  }

A couple of observations, though:

1. Sri, in your patch, you just test for sym type == STT_FUNC. Isn't
it sufficient to test for sym type != STT_GNU_IFUNC (as in the
convert-to-lea case)? I don't think it really matters -- if we see a
jump to an STT_OBJECT or STT_NOTYPE symbol, why isn't the
transformation just as valid?

2. HJ, given an R_X86_64_GOTPCRELX relocation, is it still necessary
to check the opcode during Scan::global()? Doesn't the relocation
guarantee that it's an appropriate instruction for the transformation?
I think in both cases, we could skip fetching the section contents if
we have this relocation.

3. This piece of can_convert_mov_to_lea has me a bit puzzled:

    if (parameters->options().shared()
        && (gsym->visibility() == elfcpp::STV_DEFAULT
            || gsym->visibility() == elfcpp::STV_PROTECTED)
        && !parameters->options().Bsymbolic())
      return false;

We've already tested for is_preemptible, which should take care of all
of these cases. If I remove this code, however, I get a couple of
failures that I need to investigate. At the very least, I suspect this
part of the logic can be simplified. I want to investigate this
further before committing the whole thing.

-cary


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]