Re: Alternative libio vtable hardening approach


On 06/01/2016 05:12 AM, Florian Weimer wrote:
>> (b) The table is always PC-relative addressable from ROP gadgets
>>     within the library so finding it is easy.
>>     - Hiding the pointers with mangling doesn't make this harder.
> 
> I don't get the "scanning" and "ROP" parts. :)

Assume you want to remap the table as read-write.
Assume you want to do so via a ROP gadget.
Assume the gadget can do PC-relative + offset loads.
Assume a gadget that can remap a region of memory as read-write.

An attacker must find the table before they can modify it.
Any rebuild of the glibc binaries changes the offset in the PC+offset
equation the attacker uses to find the table, the goal being to remap
the table as read-write so the pointers can be modified. If you can't
find the table, your only alternative is to scan likely regions of
memory and try to locate its start or end, e.g. by looking for
JUMP_INIT_DUMMY holes at fixed intervals. If you could obscure that
signature, finding the table would become hard.

I think it is probably easier to just shotgun the handful of likely
offsets if you can use mprotect to change the protections of the
table. And if you can already call mprotect, you probably have enough
control to attack other things, so hiding the table doesn't make a
big difference?
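
To make the "shotgun the offsets" point concrete, the attacker-side
step being discussed boils down to something like the sketch below.
The leaked PC value, the table offset, and the table size are all
placeholders for illustration; the point is just the PC+offset
equation and the final mprotect call:

#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch only: compute the table address from a known PC-relative
   offset inside libc and flip its protection back to read-write.
   Every rebuild of the library changes TABLE_OFFSET, which is why
   the offset has to be rediscovered (or scanned for) per build.  */
static int
make_table_writable (uintptr_t leaked_libc_pc, intptr_t table_offset,
                     size_t table_size)
{
  uintptr_t table = leaked_libc_pc + table_offset;
  uintptr_t page = table & ~((uintptr_t) getpagesize () - 1);
  size_t len = (table + table_size) - page;

  return mprotect ((void *) page, len, PROT_READ | PROT_WRITE);
}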

>> (c) Mangling degrades performance.
> 
> Not just that, it makes compatibility so much harder. (We would have
> to unmangle vtables if needed.)

Good point.

>>> I'm only worried about the flag that turns off the check.
>>
>> I don't see any other way to provide backwards compatibility than
>> to disable the check with a global.
>>
>> We could encrypt the value of the global to make it harder to unset?
>>
>> Mike suggests this too.
> 
> Yes, it's a good idea.

Awesome.
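
For the record, here is a minimal sketch of the "encrypt the global"
idea, in the spirit of the existing pointer guard. All of the names
below are hypothetical, and the cookie would presumably come from the
same source of randomness as the pointer-mangling guard:

#include <stdint.h>

/* Hypothetical names.  The flag is never stored as a plain 0/1, so
   an attacker who can write arbitrary words still has to know the
   per-process cookie to construct the value that disables the
   check.  */
static uintptr_t __vtable_check_cookie;    /* filled in early at startup */
static uintptr_t __vtable_check_disabled;  /* mangled flag */

static inline void
set_vtable_check_disabled (int disabled)
{
  __vtable_check_disabled = (uintptr_t) disabled ^ __vtable_check_cookie;
}

static inline int
vtable_check_disabled (void)
{
  return (int) (__vtable_check_disabled ^ __vtable_check_cookie);
}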

>>>> What we really want is a set of internal initializers that are run during
>>>> static dlopen / dlmopen that allow the library to copy key state into the
>>>> other namespace.
>>>
>>> We don't need to pass across any actual data, or affect relocation, so
>>> it's easier. malloc in libc.so does the following to detect whether
>>> it's running across a linking boundary:
>>>
>>> #ifdef SHARED
>>>   /* In case this libc copy is in a non-default namespace, never use brk.
>>>      Likewise if dlopened from statically linked program.  */
>>>   Dl_info di;
>>>   struct link_map *l;
>>>
>>>   if (_dl_open_hook != NULL
>>>       || (_dl_addr (ptmalloc_init, &di, &l, NULL) != 0
>>>           && l->l_ns != LM_ID_BASE))
>>>     __morecore = __failing_morecore;
>>> #endif
>>
>> Ah, OK, that answers my question. You lookup your own link map and then
>> determine you're not in the default namespace and reset the variables.
>> That's a sufficient solution. Thanks.
> 
> I also found this, which seems to cover the same thing:
> 
> /* Set nonzero if we have to be prepared for more than one libc being
>    used in the process.  Safe assumption if initializer never runs.  */
> int __libc_multiple_libcs attribute_hidden = 1;
> 
> The malloc code doesn't use it, maybe because it came from out-of-tree ptmalloc.
> 
> Would csu/init-first.c be the right place to perform the
> multiple-libc check and set the flag? Does it run for both static and
> dynamic binaries?

No, I would use csu/libc-start.c, since whether static or dynamic you
always call __libc_start_main.

I think we might have already shown that this method doesn't work in
some of the cases you want to cover, but I wanted to respond to the
question here with a clear answer.
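
Going back to the _dl_addr approach for a moment, spelled out for the
vtable check it would look roughly like the sketch below. The flag
and function names are made up; the _dl_open_hook / _dl_addr / l_ns
test is lifted from the malloc fragment you quoted above:

#ifdef SHARED
/* Hypothetical flag consulted by the vtable check.  */
static int __libio_vtable_check_secondary_ns;

static void
libio_check_secondary_namespace (void)
{
  Dl_info di;
  struct link_map *l;

  /* dlopened from a statically linked program, or dlmopened into a
     non-default namespace: the caller's vtables won't match this
     copy's tables, so remember that and relax the check.  */
  if (_dl_open_hook != NULL
      || (_dl_addr (libio_check_secondary_namespace, &di, &l, NULL) != 0
          && l->l_ns != LM_ID_BASE))
    __libio_vtable_check_secondary_ns = 1;
}
#endif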

>> Again, my thoughts were along the lines of some kind of callback
>> registration process where the old structure pointers are registered
>> along with a callback.
> 
> Based on our experience with the fork handlers, I think explicit
> calls across subsystems are the way to go. It's the best way to
> express ordering requirements.

Would this mean that we have to use "init on first use" for
everything in the new dynamic namespace? Without the registered
callback we would have to make the call across subsystems right when
we first need the state.
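
Just so we're talking about the same thing, by "init on first use" I
mean the usual lazy pattern below (names are made up); each consumer
makes the cross-subsystem call at the moment it first needs the state
rather than at dlmopen time:

#include <pthread.h>

static pthread_once_t subsys_once = PTHREAD_ONCE_INIT;
static int subsys_state;

static void
subsys_do_init (void)
{
  /* The explicit call across subsystems happens here, exactly when
     the state is first needed.  */
  subsys_state = 1;  /* placeholder for the real initialization */
}

static int
subsys_get_state (void)
{
  pthread_once (&subsys_once, subsys_do_init);
  return subsys_state;
}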

>>>      a4c:       48 8b 83 d8 00 00 00    mov    0xd8(%rbx),%rax
>>>      a53:       48 29 f2                sub    %rsi,%rdx
>>>      a56:       48 81 78 38 00 00 00    cmpq $_IO_file_xsputn_fast,0x38(%rax)
>>>      a5e:       0f 85 e6 11 00 00       jne    1c4a
>>>      a64:       e8 00 00 00 00          callq  _IO_file_xsputn_fast
>>> ...
>>>     1c4a:       e8 00 00 00 00          callq  _IO_file_xsputn_slow
>>>     1c4f:       e9 15 ee ff ff          jmpq   a69
>>>
>>> This expansion has a much more reasonable size, but only makes sense if
>>> the call site is monomorphic in practice.
>>
>> I don't follow. The slow path can still use the vtables as required by
>> legacy binaries and the call site there can be polymorphic? Maybe I've
>> misunderstood the suggestion.
> 
> It's just a matter of performance. The comparison against the
> expected pointer is worthwhile if, in almost all cases, there is a
> match and you hit the fast path. If it's just 50/50 or something like
> that (as it currently would be because fprintf and sprintf use
> different xsputn implementations), it does not improve performance.
> Calling the slow path directly would be preferable.

OK.
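
For anyone skimming the thread, the C shape of that expansion is
roughly the following. The struct and member names are stand-ins;
only _IO_file_xsputn_fast and _IO_file_xsputn_slow come from the
quoted disassembly:

#include <stddef.h>

/* Illustrative stand-ins for the real libio types.  */
struct my_file;
struct my_jump_table
{
  size_t (*xsputn) (struct my_file *, const void *, size_t);
};
struct my_file
{
  const struct my_jump_table *vtable;  /* the 0xd8(%rbx) load above */
};

size_t _IO_file_xsputn_fast (struct my_file *, const void *, size_t);
size_t _IO_file_xsputn_slow (struct my_file *, const void *, size_t);

static size_t
sputn (struct my_file *fp, const void *data, size_t n)
{
  /* cmpq $_IO_file_xsputn_fast, 0x38(%rax): compare the vtable slot
     against the expected implementation.  */
  if (fp->vtable->xsputn == _IO_file_xsputn_fast)
    return _IO_file_xsputn_fast (fp, data, n);   /* fast, direct call */

  /* Mismatch: out-of-line slow path, which can still do the full
     vtable dispatch for legacy binaries.  */
  return _IO_file_xsputn_slow (fp, data, n);
}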

-- 
Cheers,
Carlos.

