This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: Allow pie links to create PLT entries


On Thu, Jan 29, 2015 at 3:31 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Jan 29, 2015 at 3:13 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, Jan 29, 2015 at 2:17 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> On Thu, Jan 29, 2015 at 12:17 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Thu, Jan 29, 2015 at 12:08 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>> On Thu, Jan 29, 2015 at 11:48 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>>> On Thu, Jan 29, 2015 at 11:00 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>     Here is a simple example that fails to link with -pie but which
>>>>>>> should work just fine without having to use -fPIE.
>>>>>>>
>>>>>>> foo.cc
>>>>>>> ======
>>>>>>> int extern_func();
>>>>>>> int main()
>>>>>>> {
>>>>>>>   extern_func();
>>>>>>>   return 0;
>>>>>>> }
>>>>>>>
>>>>>>> bar.cc
>>>>>>> =====
>>>>>>> int extern_func()
>>>>>>> {
>>>>>>>   return 1;
>>>>>>> }
>>>>>>>
>>>>>>> $ g++ -fPIC -shared bar.cc -o libbar.so
>>>>>>> $ g++ foo.cc -lbar -pie
>>>>>>>
>>>>>>> ld: error: foo.o: requires dynamic R_X86_64_PC32 reloc against
>>>>>>> '_Z11extern_funcv' which may overflow at runtime; recompile with -fPIC
>>>>>>>
>>>>>>> It fails because the linker disallows creating a PLT for
>>>>>>> R_X86_64_PC32 reloc when it is perfectly fine to do so.  Note that I
>>>>>>> could have recompiled foo.cc with -fPIE or -fPIC but I still think
>>>>>>> this can be allowed.  With support for copy relocations in PIE in gold
>>>>>>> and with this support, there are fewer cases where we would need
>>>>>>> -fPIE to get working PIE links.  This would help us link non-PIE
>>>>>>> objects into PIE executables.
>>>>>>
>>>>>> You can't do it for x86 since EBX isn't set up for calling via the PLT.
>>>>>> For x86-64, there should be little difference between PIE
>>>>>> and non-PIE code.
>>>>>
>>>>> True, but that little difference sometimes causes non-trivial
>>>>> performance penalties.  With copy relocation support for PIE added
>>>>> recently, one big difference causing a non-trivial performance penalty
>>>>> went away.  However, there are still differences in the way global
>>>>> arrays are accessed.  For instance,
>>>>>
>>>>> uint32 a[] = {1, 2, 3, 4};
>>>>>
>>>>> a[i] can be accessed with one insn without -fPIE, whereas with -fPIE
>>>>> we need two: one extra to get the 64-bit address of a.
>>>>>
>>>>> Without -fPIE:
>>>>>
>>>>> movslq   0x1655(%rip),%rax  # 401b80 <i>
>>>>> mov    0x401b30(,%rax,4),%esi # a[i]
>>
>> If you link it with -pie, you will have TEXTREL in the executable.
>> Do you want relocations in text sections in PIE?
>>
>>>>> With -fPIE:
>>>>>
>>>>> movslq 0x16c5(%rip),%rdx        # <i>
>>>>> lea    0x166e(%rip),%rax      # <&a>
>>>>> mov    (%rax,%rdx,4),%esi   # a[i]
>>>>>
>>>>> I wish we could use just one insn to do the last two in the -fPIE
>>>>> case, using PC-relative addressing like:
>>>>> mov  0x166e(%rip, %rdx, 4), %esi
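For reference, a source fragment of roughly the following shape is the kind of code the two sequences above come from. This is only a sketch: the names `a` and `i` mirror the disassembly comments, `load_element` is a hypothetical wrapper, and the exact instructions depend on the compiler version and flags. The combined form wished for above is not encodable on x86-64, because RIP-relative addressing (ModRM mod=00, r/m=101) takes only a 32-bit displacement and cannot be paired with an index register.

```cpp
#include <cassert>
#include <cstdint>

// Assumed source shape; `a` and `i` mirror the names in the disassembly.
std::uint32_t a[] = {1, 2, 3, 4};
int i = 2;

// Hypothetical wrapper around the access under discussion.
// Without -fPIE the absolute address of `a` can serve as the displacement
// of a single indexed load; with -fPIE the address of `a` is typically
// formed first with a RIP-relative lea, then indexed.
std::uint32_t load_element() {
    return a[i];
}
```

Compiling this once with and once without -fPIE and inspecting the output of `objdump -d` should show the difference, assuming a toolchain comparable to the one used in the thread.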
>>>>
>>>> Can you improve GCC codegen for this?
>>>
>>> I didn't find an instruction similar to that which I could use.  Is there one?
>>>
>>>> I implemented an
>>>> optimization in ld to convert
>>>>
>>>>    mov foo@GOTPCREL(%rip), %reg
>>>>    to
>>>>    lea foo(%rip), %reg
>>>>
>>>> for the locally defined symbol, foo.  It improves PIE performance
>>>> by as much as 10%.  You may want to implement it in gold.  See
>>>> elf_x86_64_convert_mov_to_lea for details.
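The byte-level effect of the conversion described above can be sketched as follows. This is an illustration under stated assumptions, not the code in elf_x86_64_convert_mov_to_lea: the helper name `convert_mov_to_lea` and its parameters are hypothetical, and a real linker would also have to confirm that the symbol is locally defined and switch the relocation from R_X86_64_GOTPCREL to a PC-relative one. The sketch relies only on the instruction encoding: the load form of mov uses opcode 0x8b, lea uses 0x8d, and both share the same ModRM byte and 32-bit displacement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the real ld code). Rewrites
//   mov foo@GOTPCREL(%rip), %reg   (opcode 0x8b)
// into
//   lea foo(%rip), %reg            (opcode 0x8d)
// `reloc_offset` is the offset of the 32-bit displacement field, so the
// opcode sits two bytes before it and the ModRM byte one byte before it.
bool convert_mov_to_lea(std::vector<std::uint8_t>& code,
                        std::size_t reloc_offset) {
    if (reloc_offset < 2 || reloc_offset + 4 > code.size()) {
        return false;  // not enough room for opcode, ModRM and disp32
    }
    std::uint8_t& opcode = code[reloc_offset - 2];
    const std::uint8_t modrm = code[reloc_offset - 1];
    // mod == 00 and r/m == 101 selects RIP-relative addressing.
    if (opcode != 0x8b || (modrm & 0xc7) != 0x05) {
        return false;  // not a RIP-relative mov load; leave untouched
    }
    opcode = 0x8d;  // same operands, but compute the address instead of loading
    return true;
}
```

For example, the bytes 48 8b 05 followed by a disp32 (mov into %rax) become 48 8d 05 followed by the same displacement field; the register encoded in the ModRM and REX bytes is left as it was.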
>>>
>>> Wow, this is cool! But, with copy relocations support for PIE, I think
>>> this should be fixed since the compiler can safely assume that the
>>> global is defined in the executable no matter what.  Do you have an
>>> example where foo@GOTPCREL is still used for globals?
>>>
>>> foo.cc
>>> ---------
>>> #include <stdio.h>
>>> extern int a;
>>> int main()
>>> {
>>>   printf("%p", (void *) &a);
>>> }
>>>
>>> Before the copy relocation support for PIE was checked in to GCC:
>>>
>>> foo.s
>>> ------
>>>
>>> ....
>>> movq a@GOTPCREL(%rip), %rax
>>> .....
>>>
>>> and after copy relocation support:
>>>
>>> foo.s
>>> ------
>>>
>>> .......
>>> leaq a(%rip), %rsi
>>> ......
>>>
>>> Did I miss something?
>>>
>>>
>>
>> If you don't have GOTPCREL relocations against locally
>> defined symbols, this optimization won't apply.
>
> The same libstdc++.so.6.0.21 from GCC 5 today on Linux/x86-64.
> With ld.bfd:
>
> [hjl@gnu-6 src]$ readelf -r /tmp/libstdc++.so.6.0.21 |wc -l
> 4659
> [hjl@gnu-6 src]$
>
> with ld.gold:
>
> [hjl@gnu-6 src]$ readelf -r .libs/libstdc++.so.6.0.21 |wc -l
> 5516
> [hjl@gnu-6 src]$
>

BTW, my GOTPCREL optimization seems to be triggered 64 times in
the libstdc++.so build.


H.J.

