This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: [PATCH][x86_64] Convert indirect call via GOT to direct when possible
- From: Sriraman Tallam <tmsriram at google dot com>
- To: Cary Coutant <ccoutant at gmail dot com>
- Cc: "H.J. Lu" <hjl dot tools at gmail dot com>, binutils <binutils at sourceware dot org>, David Li <davidxl at google dot com>
- Date: Mon, 13 Jun 2016 16:04:10 -0700
- Subject: Re: [PATCH][x86_64] Convert indirect call via GOT to direct when possible
- Authentication-results: sourceware.org; auth=none
- References: <CAAs8HmxxdBpS7w8udZgK0QFi5TnenU3wGhpPfhWeKE8Tr=thvA at mail dot gmail dot com> <CAMe9rOpk3aOK5mMkKvYQyzeQxJ-h8o+3KjLRikKSkLmMfqoUtg at mail dot gmail dot com> <CAAs8Hmw2KQ2neDNP5cnQPBVBZMJthvQGTARPiwa-NfAx5R6ugw at mail dot gmail dot com> <CAMe9rOrOyYv0+svcObyaBcoYbAWZTadEPm-mAGQUFFyNjPgctg at mail dot gmail dot com> <CAAs8HmzoxWe2YpvjviV-bs2BRotGa_WFWbCyyLh-_L=s00yxjQ at mail dot gmail dot com> <CAJimCsF4h+e_Ey-fGxPXM0h8ZJ-SmSzyP9XoYaNuEOByJSm=1A at mail dot gmail dot com>
On Fri, Jun 10, 2016 at 1:06 PM, Cary Coutant <ccoutant@gmail.com> wrote:
>>>> * x86_64.cc (can_convert_callq_to_direct): New function.
>>>> Target_x86_64<size>::Scan::global: Check if an indirect call via
>>>> GOT can be converted to direct.
>>>> Target_x86_64<size>::Relocate::relocate: Change any indirect call
>>>> via GOT that can be converted.
>>>> * testsuite/Makefile.am (x86_64_indirect_call_to_direct.sh): New test.
>>>> * testsuite/Makefile.in: Regenerate.
>>>> * testsuite/x86_64_indirect_call_to_direct1.s: New file.
>>>> * testsuite/x86_64_indirect_jump_to_direct1.s: New file.
>>>>
>>>
>>> Do you need to check R_X86_64_REX_GOTPCRELX for branch?
>>
>> Ok, patch changed to not check for this and refactored a bit.
>
> Thanks for the refactoring -- I was about to suggest that. There's
> still too much logic outside the can_convert_* functions that is
> duplicated in Scan::global() and in Relocate::relocate(), all because
> we don't want to fetch the section contents without doing a few
> preliminary checks. I'd like to move most of that logic into the
> can_convert_* functions, so let's start with a Lazy_view class, which
> will fetch the section contents only when needed:
>
> template<int size>
> class Lazy_view
> {
> public:
> Lazy_view(Sized_relobj_file<size, false>* object, unsigned int data_shndx)
> : object_(object), data_shndx_(data_shndx), view_(NULL), view_size_(0)
> { }
>
> inline unsigned char
> operator[](size_t offset)
> {
> if (this->view_ == NULL)
> this->view_ = this->object_->section_contents(this->data_shndx_,
> &this->view_size_,
> true);
> if (offset >= this->view_size_)
> return 0;
> return this->view_[offset];
> }
>
> private:
> Sized_relobj_file<size, false>* object_;
> unsigned int data_shndx_;
> const unsigned char* view_;
> section_size_type view_size_;
> };
>
> Now we can move (almost) all of that external logic into the
> can_convert_* routines, leaving us with this in Scan::global():
>
> Lazy_view<size> view(object, data_shndx);
> size_t r_offset = reloc.get_r_offset();
> if (r_offset >= 2
> && Target_x86_64<size>::can_convert_mov_to_lea(gsym, r_type,
> r_offset, &view))
> break;
> if (r_offset >= 2
> && Target_x86_64<size>::can_convert_callq_to_direct(gsym, r_type,
>
> r_offset, &view))
> break;
>
> ... and this in Relocate::relocate():
>
> if ((gsym == NULL
> && rela.get_r_offset() >= 2
> && view[-2] == 0x8b
> && !psymval->is_ifunc_symbol())
> || (gsym != NULL
> && rela.get_r_offset() >= 2
> && Target_x86_64<size>::can_convert_mov_to_lea(gsym, r_type,
> 0, &view)))
> {
> ...
> }
> else if (gsym != NULL
> && rela.get_r_offset() >= 2
> && Target_x86_64<size>::can_convert_callq_to_direct(gsym,
> r_type,
> 0, &view))
> {
> if (view[-1] == 0x15)
> {
> ...
> }
> else
> {
> ...
> }
> }
>
> Folding all the external logic into the can_convert_* functions, and
> refactoring a bit gives this:
>
> template<class View_type>
> static inline bool
> can_convert_mov_to_lea(const Symbol* gsym, unsigned int r_type,
> size_t r_offset, View_type* view)
> {
> gold_assert(gsym != NULL);
> // We cannot do the conversion unless it's one of these relocations.
> if (r_type != elfcpp::R_X86_64_GOTPCREL
> && r_type != elfcpp::R_X86_64_GOTPCRELX
> && r_type != elfcpp::R_X86_64_REX_GOTPCRELX)
> return false;
> // We cannot convert references to IFUNC symbols, or to symbols that
> // are not local to the current module.
> if (gsym->type() == elfcpp::STT_GNU_IFUNC
> || gsym->is_undefined ()
> || gsym->is_from_dynobj()
> || gsym->is_preemptible())
> return false;
> if (parameters->options().shared()
> && (gsym->visibility() == elfcpp::STV_DEFAULT
> || gsym->visibility() == elfcpp::STV_PROTECTED)
> && !parameters->options().Bsymbolic())
> return false;
> // We cannot convert references to the _DYNAMIC symbol.
> if (strcmp(gsym->name(), "_DYNAMIC") == 0)
> return false;
> // Check for a MOV opcode.
> return (*view)[r_offset - 2] == 0x8b;
> }
>
> template<class View_type>
> static inline bool
> can_convert_callq_to_direct(const Symbol* gsym, unsigned int r_type,
> size_t r_offset, View_type* view)
> {
> gold_assert(gsym != NULL);
> // We cannot do the conversion unless it's a GOTPCRELX relocation.
> if (r_type != elfcpp::R_X86_64_GOTPCRELX)
> return false;
> // We cannot convert references to IFUNC symbols, or to symbols that
> // are not local to the current module.
> if (gsym->type() == elfcpp::STT_GNU_IFUNC
> || gsym->is_undefined ()
> || gsym->is_from_dynobj()
> || gsym->is_preemptible())
> return false;
> // Check for a CALLQ or JMPQ opcode.
> return ((*view)[r_offset - 2] == 0xff
> && ((*view)[r_offset - 1] == 0x15
> || (*view)[r_offset - 1] == 0x25));
> }
>
> A couple of observations, though:
>
> 1. Sri, in your patch, you just test for sym type == STT_FUNC. Isn't
> it sufficient to test for sym type != STT_GNU_IFUNC (as in the
> convert-to-lea case)? I don't think it really matters -- if we see a
> jump to an STT_OBJECT or STT_NOTYPE symbol, why isn't the
> transformation just as valid?
>
> 2. HJ, given an R_X86_64_GOTPCRELX relocation, is it still necessary
> to check the opcode during Scan::global()? Doesn't the relocation
> guarantee that it's an appropriate instruction for the transformation?
> I think in both cases, we could skip fetching the section contents if
> we have this relocation.
>
> 3. This piece of can_convert_mov_to_lea has me a bit puzzled:
>
> if (parameters->options().shared()
> && (gsym->visibility() == elfcpp::STV_DEFAULT
> || gsym->visibility() == elfcpp::STV_PROTECTED)
> && !parameters->options().Bsymbolic())
> return false;
This code seems to cover the one extra case about a protected symbol
in a shared object. A protected symbol defined in a shared object is
not pre-emptible and is_preemptible() will return false. However,
this code suggests the transformation is invalid in that case and I
dont understand why. If this code is being overly conservative, then
this check can be removed and the test cases deleted. Nevertheless,
this code can be simplified to:
if (parameters->options().shared()
&& gsym->visibility() == elfcpp::STV_PROTECTED)
return false;
>
> We've already tested for is_preemptible, which should take care of all
> of these cases. If I remove this code, however, I get a couple of
> failures that I need to investigate. At the very least, I suspect this
> part of the logic can be simplified. I want to investigate this
> further before committing the whole thing.
>
> -cary