This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: Relocations to use when eliding plts


On Fri, May 29, 2015 at 10:59 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, May 29, 2015 at 8:38 AM, Richard Henderson <rth@twiddle.net> wrote:
>> On 05/28/2015 01:36 PM, Rich Felker wrote:
>>> On Thu, May 28, 2015 at 09:40:57PM +0200, Jakub Jelinek wrote:
>>>> On Thu, May 28, 2015 at 03:29:02PM -0400, Rich Felker wrote:
>>>>>> You're not missing anything.  But do you want the performance of a
>>>>>> library to depend on how the main executable is compiled?
>>>>>
>>>>> Not directly. But I'd rather be in that situation than have
>>>>> pessimizations in library codegen to avoid it. I'm worried about cases
>>>>> where code both loads the address of a function and calls it, such as
>>>>> this (stupid) example:
>>>>>
>>>>>     a((void *)a);
>>>>
>>>> That can be handled by using just one GOT slot, the non-.got.plt one;
>>>> only if there are only relocations that guarantee that address equality is
>>>> not important it would use the faster (*_JUMP_SLOT?) relocations.
>>>
>>> How far would this extend, e.g. in the case of LTO or compiling the
>>> whole library at once?
>>
>> It depends on how difficult that becomes, I suppose.  It's certainly something
>> that we can look for during LTO.
>>
>> I did in fact mention this exact point in the original message:
>>
>>> This does leave open other optimization questions, mostly around weak
>>> functions.  Consider constructs like
>>>
>>>       if (foo) foo();
>>>
>>> Do we, within the compiler, try to CSE GOTPCREL and GOTPLTPCREL, accepting the
>>> possibility (not certainty) of jump-to-jump but definitely avoiding a separate
>>> load insn and the latency implied by that?
>>
>> As a last resort the two can always be unified at static link time, so that
>> only one got slot is created, and only one runtime relocation exists.  At which
>> point we'd still have two loads in the insn stream.  But barring preemption,
>> the second load will be from cache and cost a single cycle.
>>
>> So which is less likely, this double-use of a function pointer, or a non-PIE
>> executable?
>
> Can you try hjl/no-plt branch in GCC git mirror with -fno-plt?
> I got
>
> [hjl@gnu-6 pr18458]$ make
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -O2 -g -fno-plt   -c -o
> main.o main.c
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -O2 -g -fno-plt -fpic
> -c -o a.o a.c
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -O2 -g -fno-plt
> -Wl,-z,now -shared -o a.so a.o
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -O2 -g -fno-plt -fpic
> -c -o b.o b.c
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -O2 -g -fno-plt
> -Wl,-z,now -shared -o b.so b.o a.so
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc -Wl,-rpath=. -Wl,-z,now
> -o main main.o a.so b.so
> ./main
> PASS
> [hjl@gnu-6 pr18458]$ readelf -r main
>
> Relocation section '.rela.dyn' at offset 0x4b0 contains 4 entries:
>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
> 000000600a20  000200000006 R_X86_64_GLOB_DAT 0000000000000000 b + 0
> 000000600a28  000500000006 R_X86_64_GLOB_DAT 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0
> 000000600a30  000600000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
> 000000600a38  000800000006 R_X86_64_GLOB_DAT 0000000000000000 a + 0
> [hjl@gnu-6 pr18458]$ gdb main
> GNU gdb (GDB) Fedora 7.7.1-21.fc20
> Copyright (C) 2014 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from main...done.
> (gdb) r
> Starting program: /export/home/hjl/bugs/binutils/pr18458/main
> PASS
> [Inferior 1 (process 10663) exited normally]
> Missing separate debuginfos, use: debuginfo-install glibc-2.18-19.2.fc20.x86_64
> (gdb) b b
> Breakpoint 1 at 0x7ffff7bf75f0: file b.c, line 5.
> (gdb) r
> Starting program: /export/home/hjl/bugs/binutils/pr18458/main
>
> Breakpoint 1, b () at b.c:5
> 5  a();
> (gdb) si
> a () at a.c:5
> 5  printf("PASS\n");
> (gdb)
>
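
For reference, the two constructs discussed in the quoted thread (a function
that is both called and has its address taken, and a call guarded by a weak
symbol check) can be sketched in C roughly as below.  This is only an
illustration, not the pr18458 test case: maybe_hook, call_a_with_own_address
and call_hook_if_present are made-up names, and the weak attribute is a
GCC/Clang extension, assumed here to target ELF.

```c
#include <stddef.h>

/* Hypothetical weak reference: nothing in this file defines it, so on ELF
   its address is expected to be null unless another object supplies a
   definition. */
extern void maybe_hook(void) __attribute__((weak));

/* Receives its own address, as in the thread's `a((void *)a)` example.
   Converting a function pointer to void * is a POSIX/GNU extension rather
   than strict ISO C. */
int a(void *self)
{
    /* Address equality: the pointer passed in must compare equal to the
       address of `a` as seen from here. */
    return self == (void *)a;
}

/* Both loads the address of `a` and calls it in one expression. */
int call_a_with_own_address(void)
{
    return a((void *)a);
}

/* The `if (foo) foo();` pattern: one use tests the address, the other
   calls through it.  Returns 1 if the hook was present and called. */
int call_hook_if_present(void)
{
    if (maybe_hook) {
        maybe_hook();
        return 1;
    }
    return 0;
}
```

In both functions the same symbol is needed once as an address and once as a
call target, which is the case where sharing a single GOT slot between the
two uses comes up.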

I built GCC with -fno-plt, using the hjl/no-plt GCC branch and the binutils
users/hjl/relax branch.  I got:

[hjl@gnu-mic-2 gcc]$ objdump -dw cc1plus | grep addr32 | wc -l
204864
[hjl@gnu-mic-2 gcc]$ objdump -dw cc1plus | grep jmpq | grep %rip | wc -l
877
[hjl@gnu-mic-2 gcc]$ objdump -dw cc1plus | grep callq | grep %rip | wc -l
20099
[hjl@gnu-mic-2 gcc]$

Relocation section '.rela.plt' at offset 0x199c68 contains 50 entries:

Those 50 .rela.plt entries come from archives that weren't compiled with -fno-plt.

Without -fno-plt:

gnu-13:pts/19[5]> objdump -dw cc1plus | grep callq | grep %rip | wc -l
2083
gnu-13:pts/19[6]> objdump -dw cc1plus | grep jmpq | grep %rip | wc -l
603
gnu-13:pts/19[7]>  objdump -dw cc1plus | grep addr32 | wc -l
0

Relocation section '.rela.plt' at offset 0x196f90 contains 514 entries:

-- 
H.J.

