This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Re: Allow pie links to create PLT entries


On Thu, Jan 29, 2015 at 3:13 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Jan 29, 2015 at 2:17 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> On Thu, Jan 29, 2015 at 12:17 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Thu, Jan 29, 2015 at 12:08 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> On Thu, Jan 29, 2015 at 11:48 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Thu, Jan 29, 2015 at 11:00 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>>     Here is a simple example that fails to link with -pie but which
>>>>>> should work just fine without having to use -fPIE.
>>>>>>
>>>>>> foo.cc
>>>>>> ======
>>>>>> int extern_func();
>>>>>> int main()
>>>>>> {
>>>>>>   extern_func();
>>>>>>   return 0;
>>>>>> }
>>>>>>
>>>>>> bar.cc
>>>>>> =====
>>>>>> int extern_func()
>>>>>> {
>>>>>>   return 1;
>>>>>> }
>>>>>>
>>>>>> $ g++ -fPIC -shared bar.cc -o libbar.so
>>>>>> $ g++ foo.cc -lbar -pie
>>>>>>
>>>>>> ld: error: foo.o: requires dynamic R_X86_64_PC32 reloc against
>>>>>> '_Z11extern_funcv' which may overflow at runtime; recompile with -fPIC
>>>>>>
>>>>>> It fails because the linker disallows creating a PLT for
>>>>>> R_X86_64_PC32 reloc when it is perfectly fine to do so.  Note that I
>>>>>> could have recompiled foo.cc with -fPIE or -fPIC but I still think
>>>>>> this can be allowed.  With support for copy relocations in pie in gold
>>>>>> and with this support, there are fewer cases where we would need to
>>>>>> use -fPIE to get working pie links.  This would help us link non-PIE
>>>>>> objects into pie executables.
>>>>>
>>>>> You can't do it for x86 since EBX isn't set up for calling via PLT.
>>>>> For x86-64, there should be little difference between PIE
>>>>> and non-PIE code.
>>>>
>>>> True but that little difference is sometimes causing non-trivial
>>>> performance penalties. With copy relocation support for PIE added
>>>> recently, one big difference causing non-trivial performance penalty
>>>> went away.  However, there are still differences in the way global
>>>> arrays are accessed.  For instance,
>>>>
>>>> uint32 a[] = {1, 2, 3, 4};
>>>>
>>>> a[i] can be accessed with one insn without -fPIE, whereas with -fPIE,
>>>> we need two. One extra to get the 64-bit address of a.
>>>>
>>>> Without -fPIE:
>>>>
>>>> movslq   0x1655(%rip),%rax  # 401b80 <i>
>>>> mov    0x401b30(,%rax,4),%esi # a[i]
>
> If you link it with -pie, you will have TEXTREL in executable.
> Do you want relocations in text sections in PIE?

I have been told TEXTRELs are not preferred though I never understood why.

Just to make sure I understand, are you saying that the absolute
address in the case of -pie will be a text relocation?   I think that
is not true because this mov instruction

mov    0x401b30(,%rax,4),%esi

does not allow a 64-bit absolute value, which is needed for -pie.  What
I was instead suggesting is to make that PC-relative like:

mov    0xabcd(%rip,%rax,4),%esi

which would not need a text relocation.  However, I do not think such
an insn is supported yet, though it would be useful.
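
For anyone who wants to reproduce the codegen difference, here is a
minimal sketch.  The function name load_element is hypothetical and not
part of the original example, the index is passed as a parameter rather
than read from a global, and I am using std::uint32_t in place of uint32.

```cpp
#include <cstdint>

// Global array, mirroring `a` in the example above.
std::uint32_t a[] = {1, 2, 3, 4};

// External linkage keeps this function out of line, so the load of
// a[idx] is easy to find in the generated assembly.
std::uint32_t load_element(int idx) {
  return a[idx];
}
```

Compiling this with "g++ -O2 -S", once with -fno-pie and once with
-fPIE, and comparing the body of load_element should show the
absolute-address form versus the lea plus indexed load, though the
exact instructions will depend on the compiler version.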

Thanks
Sri


>
>>>> With -fPIE:
>>>>
>>>> movslq 0x16c5(%rip),%rdx        # <i>
>>>> lea    0x166e(%rip),%rax      # <&a>
>>>> mov    (%rax,%rdx,4),%esi   # a[i]
>>>>
>>>> I wish we could use just one insn to do the last two in the -fPIE
>>>> case, using PC-relative addressing like:
>>>> mov  0x166e(%rip, %rdx, 4), %esi
>>>
>>> Can you improve GCC codegen for this?
>>
>> I didn't find an instruction similar to that which I could use.  Is there one?
>>
>>> I implemented an
>>> optimization in ld to convert
>>>
>>>    mov foo@GOTPCREL(%rip), %reg
>>>    to
>>>    lea foo(%rip), %reg
>>>
>>> for the locally defined symbol, foo.  It improves PIE performance
>>> by as much as 10%.  You may want to implement it in gold.  See
>>> elf_x86_64_convert_mov_to_lea for details.
>>
>> Wow, this is cool! But, with copy relocations support for PIE, I think
>> this should be fixed since the compiler can safely assume that the
>> global is defined in the executable no matter what.  Do you have an
>> example where foo@GOTPCREL is still used for globals?
>>
>> foo.cc
>> ---------
>> #include <cstdio>
>> extern int a;
>> int main()
>> {
>>   printf("%p", &a);
>> }
>>
>> Before copy relocation support for PIE was checked in to GCC:
>>
>> foo.s
>> ------
>>
>> ....
>> movq a@GOTPCREL(%rip), %rax
>> .....
>>
>> and after copy relocation support:
>>
>> foo.s
>> ------
>>
>> .......
>> leaq a(%rip), %rsi
>> ......
>>
>> Did I miss something?
>>
>>
>
> If you don't have GOTPCREL relocations against locally
> defined symbols, this optimization won't apply.
>
> --
> H.J.

