Re: Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section
- To: Roland McGrath <roland@hack.frob.com>
- From: "Rahul Chaudhry via gnu-gabi" <gnu-gabi@sourceware.org>
- Date: Tue, 12 Dec 2017 16:53:31 -0800
- Cc: Sriraman Tallam <tmsriram@google.com>, Florian Weimer <fw@deneb.enyo.de>, Rahul Chaudhry via gnu-gabi <gnu-gabi@sourceware.org>, Suprateeka R Hegde <hegdesmailbox@gmail.com>, Florian Weimer <fweimer@redhat.com>, David Edelsohn <dje.gcc@gmail.com>, Rafael Avila de Espindola <rafael.espindola@gmail.com>, Binutils Development <binutils@sourceware.org>, Alan Modra <amodra@gmail.com>, Cary Coutant <ccoutant@gmail.com>, Xinliang David Li <davidxl@google.com>, Sterling Augustine <saugustine@google.com>, Paul Pluzhnikov <ppluzhnikov@google.com>, Ian Lance Taylor <iant@google.com>, "H.J. Lu" <hjl.tools@gmail.com>, Luis Lozano <llozano@google.com>, Peter Collingbourne <pcc@google.com>, Rui Ueyama <ruiu@google.com>, llvm-dev@lists.llvm.org
- Reply-to: Rahul Chaudhry <rahulchaudhry@google.com>
On Mon, Dec 11, 2017 at 6:14 PM, Roland McGrath <roland@hack.frob.com> wrote:
>
> On Mon, Dec 11, 2017 at 3:50 PM Rahul Chaudhry via gnu-gabi <gnu-gabi@sourceware.org> wrote:
>>
>> A simple combination of delta-encoding and run-length-encoding is one of the
>> first schemes we experimented with (32-bit entries with 24-bit 'delta' and an
>> 8-bit 'count'). This gave really good results, but as Sri mentions, we observed
>> several cases where the relative relocations were not on consecutive offsets.
>> There were common cases where the relocations applied to alternate words, and
>> that totally wrecked the scheme (a bunch of entries with delta==16 and
>> count==1).
>
>
> For the same issue in a different context, I recently implemented a scheme using run-length-encoding but using a variable stride. So for a run of alternate words, you still get a single entry, but with stride 16 instead of 8. In my application, most cases of strides > 8 are a run of only 2 or 3 but there are a few cases of dozens or hundreds with a stride of 16. My case is a solution tailored to exactly one application (a kernel), so there is a closed sample set that's all that matters and the trade-off between simplicity of the analysis and compactness of the results is different than the general case you're addressing (my "analysis" consists of a few lines of AWK). But I wonder if it might be worthwhile to study the effect a variable-stride RLE scheme or adding the variable-stride ability into your hybrid scheme has on your sample applications.
>
> Since we're talking about specifying a new ABI that will be serving us for many years to come and will be hard to change once deployed, it seems worth spending quite a bit of effort up front to come to the most compact scheme that's feasible.

I agree. Can you share more details of the encoding scheme that you found
useful (size of each entry, number of bits used for stride/count, etc.)?

I just ran some experiments with an encoding that uses 32-bit entries: 16 bits
for delta, 8 bits for stride, and 8 bits for count. Here are the numbers, inlined
with those from the previous schemes for comparison:

1. Chrome browser (x86_64, built as PIE):
   605159 relocation entries (24 bytes each) in '.rela.dyn'
   594542 are R_X86_64_RELATIVE relocations (98.25%)
   14269008 bytes (13.61MB) used by the R_X86_64_RELATIVE relocations in '.rela.dyn'
   385420 bytes (0.37MB) using delta+count encoding
   232540 bytes (0.22MB) using delta+stride+count encoding
   109256 bytes (0.10MB) using jump+bitmap encoding

2. Go net/http test binary (x86_64, 'go test -buildmode=pie -c net/http'):
   83810 relocation entries (24 bytes each) in '.rela.dyn'
   83804 are R_X86_64_RELATIVE relocations (99.99%)
   2011296 bytes (1.92MB) used by the R_X86_64_RELATIVE relocations in '.rela.dyn'
   204476 bytes (0.20MB) using delta+count encoding
   132568 bytes (0.13MB) using delta+stride+count encoding
   43744 bytes (0.04MB) using jump+bitmap encoding

3. Vim binary in /usr/bin on my workstation (Ubuntu, x86_64):
   6680 relocation entries (24 bytes each) in '.rela.dyn'
   6272 are R_X86_64_RELATIVE relocations (93.89%)
   150528 bytes (0.14MB) used by the R_X86_64_RELATIVE relocations in '.rela.dyn'
   14388 bytes (0.01MB) using delta+count encoding
   7000 bytes (0.01MB) using delta+stride+count encoding
   1992 bytes (0.00MB) using jump+bitmap encoding
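
The MB figures round away most of the difference for the smaller binaries, so it can help to look at the reduction factors instead. The following is only a sketch: the byte counts are copied from the three lists above, and the names `SIZES` and `ratios` are made up for illustration.

```python
# Sizes in bytes, copied from the measurements above:
# (R_X86_64_RELATIVE entries in .rela.dyn, delta+count,
#  delta+stride+count, jump+bitmap)
SIZES = {
    "Chrome": (14269008, 385420, 232540, 109256),
    "Go net/http": (2011296, 204476, 132568, 43744),
    "Vim": (150528, 14388, 7000, 1992),
}


def ratios(name):
    """Return how many times smaller each encoding is than the original."""
    original, *encoded = SIZES[name]
    return [original / size for size in encoded]


for name in SIZES:
    dc, dsc, jb = ratios(name)
    print(f"{name:12s} delta+count {dc:6.1f}x  "
          f"delta+stride+count {dsc:6.1f}x  jump+bitmap {jb:6.1f}x")
```

In all three binaries the jump+bitmap column comes out with the largest reduction factor.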

delta+count encoding is using 32-bit entries:
   24-bit delta: number of bytes since last offset.
   8-bit count: number of relocations to apply (consecutive words).

delta+stride+count encoding is using 32-bit entries:
   16-bit delta: number of bytes since last offset.
   8-bit stride: stride (in bytes) for applying 'count' relocations.
   8-bit count: number of relocations to apply (using 'stride').

jump+bitmap encoding is using 64-bit entries:
   8-bit jump: number of words since last offset.
   56-bit bitmap: bitmap for which words to apply relocations to.
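
To make the two more elaborate formats concrete, here is a minimal decoding sketch in Python. It is not the code used for the experiments. The bit positions (delta or jump in the high bits, count or bitmap in the low bits), the choice to measure delta/jump from the base of the previous entry, the least-significant-bit-first bitmap order, and the 8-byte word size are all assumptions, chosen to be consistent with the example entries quoted later in this message. The function names are hypothetical.

```python
WORD = 8          # bytes per relocated word on x86_64 (assumed)
BITMAP_BITS = 56  # width of the bitmap field in a jump+bitmap entry


def decode_stride(entries, start=0):
    """Expand 32-bit delta+stride+count entries into byte offsets."""
    offsets, base = [], start
    for entry in entries:
        delta = entry >> 16           # assumed layout: bits 31..16
        stride = (entry >> 8) & 0xFF  # assumed layout: bits 15..8
        count = entry & 0xFF          # assumed layout: bits 7..0
        base += delta                 # assumed: delta is relative to the previous base
        offsets.extend(base + i * stride for i in range(count))
    return offsets


def decode_bitmap(entries, start=0):
    """Expand 64-bit jump+bitmap entries into byte offsets."""
    offsets, base = [], start
    for entry in entries:
        jump = entry >> BITMAP_BITS                # assumed layout: top 8 bits
        bitmap = entry & ((1 << BITMAP_BITS) - 1)  # assumed layout: low 56 bits
        base += jump * WORD                        # jump is counted in words
        offsets.extend(base + i * WORD
                       for i in range(BITMAP_BITS) if (bitmap >> i) & 1)
    return offsets


# Hypothetical entries: three relocations on alternate words, starting at offset 8.
print(decode_stride([0x00081003]))
print(decode_bitmap([0x0100000000000015]))
```

Under these assumptions both calls expand to the same three offsets (8, 24 and 40), using one 4-byte entry in the first format and one 8-byte entry in the second.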

While adding a 'stride' field is definitely an improvement over simple
delta+count encoding, it doesn't compare well against the bitmap-based
encoding.

I took a look inside the encoding for the Vim binary. There are some instances
in the bitmap-based encoding like
   [0x3855555555555555 0x3855555555555555 0x3855555555555555 ...]
that encode sequences of relocations applying to alternate words. The
stride-based encoding works very well on these and turns them into the much
more compact
   [0x0ff010ff 0x0ff010ff 0x0ff010ff ...]
using stride==0x10 and count==0xff.

However, for the vast majority of cases, the stride-based encoding ends up with
count <= 2, and that kills it in the end.
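
The trade-off can be put in terms of bytes spent per relocation. The sketch below assumes the count sits in the low 8 bits of a 32-bit stride entry and the bitmap in the low 56 bits of a 64-bit entry, which matches the two example entries above. The helper names and the count==2 entry are hypothetical.

```python
def stride_cost(entry):
    """Bytes per relocation for one 32-bit delta+stride+count entry."""
    count = entry & 0xFF                 # assumed: count in the low 8 bits
    return 4 / count


def bitmap_cost(entry):
    """Bytes per relocation for one 64-bit jump+bitmap entry."""
    bitmap = entry & ((1 << 56) - 1)     # assumed: bitmap in the low 56 bits
    return 8 / bin(bitmap).count("1")    # one relocation per set bit


# The alternate-word runs quoted above: the stride entry is far denser.
print(stride_cost(0x0ff010ff), bitmap_cost(0x3855555555555555))

# A hypothetical stride entry with count == 2 (delta 0x18, stride 8).
print(stride_cost(0x00180802))
```

A stride entry with count <= 2 costs at least 2 bytes per relocation, so under these assumptions a bitmap entry does better as soon as it covers more than four relocations.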

I could try something more complex with 16-bit entries, but that can only give
a 2x improvement at best, so it still won't be better than the bitmap approach.

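
The 2x bound can be checked against the measurements reported earlier in this message. The sketch below reads "2x at best" as keeping the same number of entries while halving each from 32 to 16 bits, which is an assumption. The byte counts are the delta+stride+count and jump+bitmap sizes from the three binaries, and the names are illustrative.

```python
# (delta+stride+count bytes, jump+bitmap bytes) from the measurements above.
RESULTS = {
    "Chrome": (232540, 109256),
    "Go net/http": (132568, 43744),
    "Vim": (7000, 1992),
}


def best_case_16bit(stride_bytes):
    """Optimistic size if every 32-bit entry shrank to 16 bits (assumed bound)."""
    return stride_bytes // 2


for name, (stride_bytes, bitmap_bytes) in RESULTS.items():
    bound = best_case_16bit(stride_bytes)
    print(f"{name:12s} best-case 16-bit: {bound:7d}  jump+bitmap: {bitmap_bytes:7d}")
```

Even with this optimistic halving, the stride-based sizes stay above the jump+bitmap sizes for all three binaries.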
Thanks,
Rahul
> --
>
>
> Thanks,
> Roland