Re: RFC: Audit external function called indirectly via GOT



On 03/21/2018 11:16 AM, Cary Coutant wrote:
>> Auditing of external function calls and their return values relies on
>> lazy binding with PLT.  When external functions are called indirectly
>> via GOT without using PLT, auditing stops working.
> 
> Could you give a little background here? Why does it stop working?
> What does auditing rely on? I didn't find anything about this in the
> psABI document.

To be specific, we are talking about the Solaris LD_AUDIT support that is
implemented in the GNU dynamic loader, ld.so. This has been very useful
to developers, particularly those working on schemes that alter lookup
paths or binding rules, and to those who use these hooks for other kinds
of auditing. There were a lot of Solaris LD_AUDIT users, and now there are
a lot of users of the same feature in the GNU tools.
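
To make the hooks concrete, here is a minimal sketch of an audit library
written against the rtld-audit interface that glibc declares in <link.h>.
It only logs symbol bindings. The file and library names in the usage note
are placeholders, and this is an illustration rather than a recommended
audit library.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* <link.h> exposes the la_* interface as a GNU extension */
#endif
#include <assert.h>
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/* Handshake with the dynamic loader: accept whatever interface version
   the loader offers. Returning 0 would make the loader ignore us. */
unsigned int la_version(unsigned int version)
{
    return version;
}

/* Called for each loaded object. Asking for BINDTO and BINDFROM requests
   symbol-binding callbacks for references to and from this object. */
unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    (void) map;
    (void) lmid;
    (void) cookie;
    return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

/* Called when a symbol is bound. The return value is the address the
   caller will actually use, so returning st_value leaves the binding as is. */
uintptr_t la_symbind64(Elf64_Sym *sym, unsigned int ndx,
                       uintptr_t *refcook, uintptr_t *defcook,
                       unsigned int *flags, const char *symname)
{
    (void) ndx;
    (void) refcook;
    (void) defcook;
    (void) flags;
    assert(sym != NULL);
    fprintf(stderr, "audit: bound %s\n", symname);
    return sym->st_value;
}
```

Assuming the file is saved as audit.c, something like
"gcc -shared -fPIC audit.c -o libaudit.so" followed by
"LD_AUDIT=./libaudit.so ./prog" should load it.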

The problem comes when you build with -fno-plt, or elide a PLT slot for
any other reason: there is no longer a place for the LD_AUDIT
infrastructure to hook into.

In the case of x86, the -fno-plt generated code calls directly through
the GOT slot, with no PLT stub in between. The GOT is RO after relocation
(relro), so most tooling expects that it cannot be changed. Therefore it's
not entirely kosher to reuse the GOT for this purpose, though you could do
that. In fact, on x86 the GLOB_DAT reloc and GOT entry look an awful lot
like a function descriptor and a call through that function descriptor
(for arches that have non-code PLTs).

By keeping the generation of the PLT slot, but not using it, you can go back
and re-use that PLT entry for auditing. If you are RELRO, you are going
to pay a performance cost for turning on auditing: you will be forced to
go through the PLT call sequence every time, enter the loader, find your
already computed resolution in the loader's cache, and continue. If you are
non-RELRO, you can finalize the binding in the PLT.

Again, all of this is to support LD_AUDIT, which traditionally used PLT
entries, and I'd like to keep this developer tooling working even in the
presence of optimized binaries.

>> Here is a proposal to support auditing of external function called
>> indirectly via GOT:
>>
>> 1. Add optional dynamic tags:
>>
>>  #define DT_GNU_PLT     0x6ffffef4  /* Address of PLT section  */
>>  #define DT_GNU_PLTSZ   0x6ffffdf1  /* Size of PLT section  */
>>  #define DT_GNU_PLTENT  0x6ffffdf2  /* Size of one PLT entry  */
>>  #define DT_GNU_PLT0SZ  0x6ffffdf3  /* Size of the first PLT entry  */
>>  #define DT_GNU_PLTGOTSZ 0x6ffffdf4 /* Size of PLTGOT section  */
>>
>> and update DT_FLAGS_1 with:
>>
>>  #define DF_1_JMPRELIGN 0x10000000  /* DT_JMPREL can be ignored  */
>>
>> 2. Linker creates PLT entries for auditing external function calls via
>> GOT and sets DT_GNU_PLT, DT_GNU_PLTSZ, DT_GNU_PLTENT, DT_GNU_PLT0SZ and
>> DT_GNU_PLTGOTSZ.  If PLT isn't required for lazy binding, set the
>> DF_1_JMPRELIGN bit in DT_FLAGS_1.
>> 3. When auditing is enabled at run-time, dynamic linker resolves GLOB_DAT
>> relocation to its corresponding PLT entry by finding JUMP_SLOT relocation
>> against the same function and use its PLT slot as the function address.
>> On x86, the first PLT entry and the 3 GOT slots are reserved.  GOT slot
>> is (JUMP_SLOT relocation offset - DT_PLTGOT) / size of GOT entry.  PLT
>> offset is (GOT slot - 3) * DT_GNU_PLTENT + DT_GNU_PLT0SZ.  PLT address
>> is DT_GNU_PLT + PLT offset.  DT_GNU_PLT, DT_GNU_PLTSZ, DT_PLTGOT and
>> DT_GNU_PLTGOTSZ can be used to check if GOT and PLT offsets are within
>> range.
>> 4. If DF_1_JMPRELIGN is set, dynamic linker can ignore DT_JMPREL when
>> lazy binding is disabled.
>>
>> Any comments?
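
For reference, the address arithmetic in step 3 can be sketched as below.
The struct and function names are made up for illustration, the DT_GNU_*
fields are the tags proposed above and not existing ABI, and all addresses
are assumed to share the same load bias.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values a loader would read from the dynamic section (hypothetical type). */
struct plt_layout {
    uint64_t pltgot;      /* DT_PLTGOT */
    uint64_t pltgot_size; /* proposed DT_GNU_PLTGOTSZ */
    uint64_t plt;         /* proposed DT_GNU_PLT */
    uint64_t plt_size;    /* proposed DT_GNU_PLTSZ */
    uint64_t plt_entsize; /* proposed DT_GNU_PLTENT */
    uint64_t plt0_size;   /* proposed DT_GNU_PLT0SZ */
    uint64_t got_entsize; /* 8 on x86-64, 4 on i386 */
};

/* On x86 the first three PLTGOT slots are reserved. */
enum { RESERVED_GOT_SLOTS = 3 };

/* Map a JUMP_SLOT relocation offset to the address of its PLT entry.
   Returns false when the GOT or PLT offset falls out of range. */
static bool plt_entry_for_jump_slot(const struct plt_layout *l,
                                    uint64_t r_offset, uint64_t *plt_addr)
{
    assert(l->got_entsize != 0 && l->plt_entsize != 0);

    /* The relocation must target a slot inside the PLTGOT. */
    if (r_offset < l->pltgot || r_offset - l->pltgot >= l->pltgot_size)
        return false;

    uint64_t got_slot = (r_offset - l->pltgot) / l->got_entsize;
    if (got_slot < RESERVED_GOT_SLOTS)
        return false;

    /* Skip PLT0, then index by the number of non-reserved slots. */
    uint64_t plt_off = (got_slot - RESERVED_GOT_SLOTS) * l->plt_entsize
                       + l->plt0_size;

    /* The whole PLT entry must lie inside the PLT section. */
    if (plt_off > l->plt_size || l->plt_size - plt_off < l->plt_entsize)
        return false;

    *plt_addr = l->plt + plt_off;
    return true;
}
```

With 16-byte PLT entries, a 16-byte PLT0 and 8-byte GOT entries, GOT slot 3
maps to DT_GNU_PLT + 16 and GOT slot 5 to DT_GNU_PLT + 48.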
> 
> Maybe a little more background would help me understand this better,
> but I don't see why the GOT slots aren't being (or couldn't be)
> statically relocated to point to the PLT slots. If the linker does
> that, all the dynamic loader has to do is ignore the JMPREL
> relocations at startup, and let lazy binding happen. I don't see why
> it would need to go through this complicated matching process.

What does "statically relocated" mean?

It appears you are implying the GOT slots for these function calls could
be statically relocated to their respective PLT entries.

This is possible, but you have another problem.

You have a list of GLOB_DAT relocs to process, some of which would overwrite
the statically relocated entries. How do you figure out which ones these are
and avoid processing them when LD_AUDIT is enabled?
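
One possible test, sketched here for x86-64, is the matching from step 3 of
the proposal: treat a GLOB_DAT reloc as "PLT-backed" when a JUMP_SLOT reloc
exists against the same symbol index. The function name is made up, the
linear scan is only for illustration, and it assumes the linker keeps the
JUMP_SLOT relocs for such symbols.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Return the JUMP_SLOT relocation that refers to the same symbol as the
   given GLOB_DAT relocation, or NULL if there is none (for example, a
   data symbol, or a relocation that is not GLOB_DAT at all). */
static const Elf64_Rela *find_jump_slot(const Elf64_Rela *jmprel,
                                        size_t count,
                                        const Elf64_Rela *glob_dat)
{
    assert(jmprel != NULL || count == 0);

    if (ELF64_R_TYPE(glob_dat->r_info) != R_X86_64_GLOB_DAT)
        return NULL;

    Elf64_Xword sym = ELF64_R_SYM(glob_dat->r_info);

    /* Linear scan of the DT_JMPREL table for a matching symbol index. */
    for (size_t i = 0; i < count; i++) {
        if (ELF64_R_TYPE(jmprel[i].r_info) == R_X86_64_JUMP_SLOT
            && ELF64_R_SYM(jmprel[i].r_info) == sym)
            return &jmprel[i];
    }
    return NULL;
}
```

A loader doing this for every GLOB_DAT reloc would pay a startup cost that
grows with the product of the two table sizes unless it builds an index
first.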
 
> (One trivial comment on your choice of naming: I can't see "JMPRELIGN"
> without reading it as a misspelled "jump re-align"! Maybe "IGN_JMPREL"
> would be better for human readers.)

Agreed.

Cheers,
Carlos.