This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/4] New string function explicit_bzero (from OpenBSD).



On 19/08/2016 10:00, Zack Weinberg wrote:
> On 08/19/2016 08:43 AM, Adhemerval Zanella wrote:
>> On 18/08/2016 17:53, Torvald Riegel wrote:
>>> On Thu, 2016-08-18 at 20:31 +0200, Florian Weimer wrote:
>>>> On 08/17/2016 07:19 PM, Zack Weinberg wrote:
>>>>> +#ifdef __USE_MISC
>>>>> +/* As bzero, but the compiler will not delete a call to this
>>>>> +   function, even if S is dead after the call.  Note: this function
>>>>> +   has its own implementation file and should not be slurped into
>>>>> +   string-inlines.o.  */
>>>>> +__extern_inline void
>>>>> +explicit_bzero (void *__s, size_t __n)
>>>>> +{
>>>>> +  memset (__s, '\0', __n);
>>>>> +  __glibc_read_memory (__s, __n);
>>>>> +}
>>>>> +#endif
>>>>
>>>> __extern_inline can expand to nothing at all, and you would get multiple 
>>>> definitions of explicit_bzero this way.
> 
> OK, what should I be using instead?

Fortify functions uses '__fortify_function' which defines both 
__extern_always_inline and __attribute_artificial__.  However fortify is guarded
by _FORTIFY_SOURCE, which requires GCC 4.1 (so __extern_always_inline will be
defined anyway).

Stdio also defines:

#ifndef __extern_inline
# define __STDIO_INLINE inline
#else
# define __STDIO_INLINE __extern_inline
#endif


#ifdef __USE_EXTERN_INLINES
...
#endif

But I think this will also only add unnecessary complexity on the patch.

Also, another issue with gnu_inline is when building for C++ it might get also
multiple definitions [1].

That's why I think simpler solution would be just to remove this inline
optimization.


[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41194


> 
>>>> I don't think we want explicit_bzero to be inlined, it's useful to have 
>>>> this name in the executable.  Furthermore, we might want to add 
>>>> additional state clearing later, so an implementation in libc.so.6 seems 
>>>> desirable anyway.
> 
> Oddly enough, inlining the memset leads to better security properties
> for generated code.  Remember the "key that lives in the vector
> registers until it has to be copied onto the stack to be erased" scenario?
> 
> ---
> #include <emmintrin.h>
> #include <string.h>
> 
> extern void explicit_bzero (void *ptr, size_t size);
> 
> extern __m128 get_key ();
> extern void encrypt_with_key (__m128 key, void *data, size_t size);
> 
> void encrypt (void *data, size_t size)
> {
>   __m128 key = get_key ();
>   encrypt_with_key (key, data, size);
>   explicit_bzero (&key, sizeof key);
> }
> ---
> 
> (Assume an ABI where __m128 actually gets passed in registers.)
> 
> If you expose the memset to inlining,  the key *won't* be copied into
> the stack and then immediately erased; the compiler will just dutifully
> clear 16 bytes of the stack and then pass them to __glibc_read_memory.
> It still won't be erased from the vector registers, but that can only be
> fixed with compiler support.
> 
> (It also leads to significantly more compact code -- see my experiment
> with OpenSSL from last year -- but that's not as important.)

I also think compact code should not be a blocker here as well.  However I
do not have a better solution for this specific solution without compiler
support.

> 
>>> I suppose we just want a compiler barrier here though, and don't need a
>>> memory barrier in the sense of something that constrains HW reordering.
> 
> That is correct; however, the "memory barrier" I was talking about was
> 
>     asm volatile ("" ::: "memory");
> 
> which is also only a compiler barrier.
> 
>> I would also suggest to avoid adding this inline optimization, it just add
>> another exported symbol by glibc with little performance benefit. 
> 
> The *other* reason why I want to keep the inline optimization is that it
> means I don't need to add __explicit_bzero_chk (which, because of
> ifuncs, means futzing with *every sysdeps definition of memset*!).  So
> either way there's going to be a new exported symbol.
> 
> zw

I do not think this is really required, you can still add a default
debug/explicit_bzero_chk.c which will call explicit_bzero/memset.  Each
port already defines how internal memset is called (either using
internal ifunc or redirecting to a default symbol) so I see no need
to add arch specific implementations.

This does not prevent though to later we add arch-specific optimized
version (which might be ifunc).


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]