This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] explicit_bzero final


On 12/13/2016 06:04 PM, Zack Weinberg wrote:
On Tue, Dec 13, 2016 at 5:46 PM, Jeff Law <law@redhat.com> wrote:
On 12/13/2016 08:53 AM, Zack Weinberg wrote:

It's probably good to loop in some GCC people, but off the top of my
head, I think it should be sufficient to have it treat a call to
'__glibc_read_memory' as triggering the same special behavior as a
call to 'explicit_bzero'.  Of course that would mean
__glibc_read_memory can't be used for other cases where we need a
"those writes are not dead" optimization fence, so maybe I should
rename it __explicit_bzero_fence or something like that.

It's going to be hard for GCC to recognize the memset_chk+read_memory as an
explict_bzero.  It'll have to infer it from the code pattern, which is
inherently prone to misses.

Maybe I don't understand what you have in mind, but I don't think you
have to infer anything from the code pattern.  Just treat
__explicit_bzero_fence the same as explicit_bzero itself.  The
arguments are the same and it won't be used for anything else.

I have no idea how hard this would be to implement in present-day GCC,
but the way I imagine explicit_bzero support working is: there's an
__attribute__ for variables (let's call it "sensitive") that triggers
a forward taint analysis: every data object transitively data or
control dependent on the "sensitive" variable also becomes
"sensitive".  This has no effect on data objects whose lifetime
extends beyond the function in which they were declared, but if they
don't, they are erased when their lifetime ends.
The concept of tainting variables and tracking that property seems distinct from making sure that the compiler properly handles explicit_bzero. And just to be clear, I see that property as having value.




Then, the effect of the explicit_bzero and __explicit_bzero_fence
builtins just is to tag the variable whose address they receive with
the "sensitive" attribute; the explicit_bzero builtin also behaves as
__builtin_memset(s, 0, n) (this is in case the sensitive variable is
allocated on the heap or otherwise survives -- it still needs to be
erased when the programmer said it should be erased).  This might need
some sort of _backward_ data flow analysis to figure out which
source-code variable corresponds to the SSA name received by the
builtin, I dunno.
Interesting. I hadn't thought about using these builtins to seed the dataflow analysis.



There's probably some wacky corner cases I haven't thought of, but I
don't see why in-glibc expansion of explicit_bzero to memset+fence
should be a problem.
My worry is ensuring that the compiler doesn't ever remove these stores as dead. And the easiest way I see to do that is to recognize the explicit_bzero explicitly. Otherwise I have to walk the use->def chain to pair the fence with the bzero and then tag the bzero. While that ought to work, it just seems easier to recognize the bzero from the start as special.



I'd also worry that this style definition may be prone to goofs.  The keys
being that if LTO were to see inside __glibc_read_memory and realize that
the reads are dead and removes the reads.

Right now, LTO into glibc will have all kinds of other negative
consequences (Joseph can explain in more detail) and we don't even try
to support it.  I've carefully documented in the source code that, if
we ever do try to support LTO, __glibc_read_memory (/
__explicit_bzero_fence) must remain opaque to the compiler.
Understood, WRT LTO today. I'm not worried about today, I'm worried about what happens in the future when the assumptions we're making today about what the compiler can and can not do may no longer hold.
Jeff

zw



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]