This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] ifunc suck, use ufunc.


On Mon, May 25, 2015 at 04:21:57PM -0300, Adhemerval Zanella wrote:
> 
> 
> On 25-05-2015 16:01, Ondřej Bílka wrote:
> > On Mon, May 25, 2015 at 03:11:35PM -0300, Adhemerval Zanella wrote:
> >>
> >>
> >> On 25-05-2015 10:49, Ondřej Bílka wrote:
> >>> On Mon, May 25, 2015 at 09:17:48AM -0300, Adhemerval Zanella wrote:
> >>>>
> >>>> Although the reason for such a mechanism seems reasonable, I do not see a
> >>>> good approach to adding more architecture-specific behaviour to GLIBC.
> >>>> IFUNC is already not really available on all platforms and it has its own
> >>>> idiosyncrasies for some ports (like hwcap init order).
> >>>>
> >>> While it isn't wise to add another mechanism, it is necessary. If you
> >>> have a program where in 50% of call sites implementation A is better while
> >>> in the other half B is better, how would you resolve that? No matter how you
> >>> select, you would get bad performance (and resolving the selection from a
> >>> backtrace would likely cost more than it saves.)
> >>>
> >>> Also you would lose performance from not being able to use precomputed
> >>> tables.
> >>>
> >>> You could use ifunc at the cost of losing per-call-site information.
> >>> However, the ifuncs would get ugly, not the simple "if you have feature X,
> >>> select the implementation using X", in order from newest to oldest.
> >>>
> >>> At the start of the program you would need to read a file and
> >>> select the function according to the profile from that file. You should
> >>> probably do the same with ufuncs but these provide some stack.
> >>>
> >>> The aim would be to replace ifuncs with ufuncs or some variant. For that they
> >>> would need to coexist for some time until everything is converted.
> >>>
> >>> One side benefit is that you won't need to add ifunc for new architectures,
> >>> just use ufuncs. One problem would be cost: as mentioned before, without
> >>> a hardware atomic write it could be too costly. One possibility that I
> >>> mentioned was to force double compilation on these archs so you could do
> >>> the resolving for a given cpu and then distribute separate binaries with each
> >>> cpu's data.
> >>
> >> I understood the problem, but I find your initial idea troublesome. As pointed
> >> out by Szabolcs, there are security issues with adding the function pointer
> >> information to read-write segments (which recent changes to binutils seem
> >> to avoid with copy segments and relro), it potentially has issues with debugging
> >> and profiling tools, it relies on memory semantics that are too x86_64-centric,
> >> among others.
> >>
> > As I wrote before, that's a solved problem. Use a PTR_MANGLE/DEMANGLE equivalent, which
> > harms performance a bit. Also, what is the problem with debugging? Right now
> > stepping into a string function sends you to random assembly, so how
> > would adding ufunc be different?
> 
> My understanding is that PTR_MANGLE/DEMANGLE is somewhat of a hack for the problem
> where you must have a RW segment where function pointers reside.  Recent binutils
> options aim to *avoid* exactly this, since if you know the memory
> where the seed resides you can read it and defeat the mangle/demangle (although
> it will require a more clever threat).  It also defeats another optimization with
> relro, which is to mark the page as RO and thus allow more sharing.
>
First question: how do you make the page holding the ifunc results sharable?
Isn't it the same situation?

My main problem is how to make these easy to use. If we mandate that
the user, instead of just writing make, does

UFUNC_PROFILE=1 make; run program; make 

then I can offer a better solution. The main problem is a cpp limitation:
I can't create a counter in a macro. The second limitation is collecting
profiler data.

With these two conditions I don't need lazy resolution. Just put all
of these in one table aligned to a page boundary and write a constructor that
resolves all ufuncs in the dso; when it finishes, it mprotects the table to be
read-only.
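
A minimal sketch of that table-plus-constructor scheme, to make the idea concrete. Everything named here is hypothetical (ufunc_table, ufunc_resolve_all, the two length routines), and the even/odd choice merely stands in for a decision driven by profile data. The table is padded and aligned to 64 KiB on the assumption that this is a multiple of the page size on the target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

typedef size_t (*len_fn)(const char *);

#define UFUNC_SLOTS 2            /* hypothetical: one slot per call site */
#define UFUNC_TABLE_BYTES 65536  /* assumed to be a multiple of the page size */

/* The table owns its pages entirely, so sealing it cannot
   affect any unrelated data. */
static union {
    len_fn slot[UFUNC_SLOTS];
    char pad[UFUNC_TABLE_BYTES];
} ufunc_table __attribute__((aligned(UFUNC_TABLE_BYTES)));

static_assert(sizeof ufunc_table == UFUNC_TABLE_BYTES,
              "table must fill exactly the protected range");

/* Two interchangeable implementations (illustrative only). */
static size_t len_bytewise(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

static size_t len_libc(const char *s)
{
    return strlen(s);
}

/* Set to 1 once the table has been made read-only. */
static int ufunc_table_sealed;

/* Runs before main(): resolve every slot eagerly, then seal the table. */
__attribute__((constructor))
static void ufunc_resolve_all(void)
{
    for (size_t i = 0; i < UFUNC_SLOTS; i++) {
        /* Stand-in for a profile-driven choice per call site. */
        ufunc_table.slot[i] = (i % 2 == 0) ? len_bytewise : len_libc;
    }
    if (mprotect(&ufunc_table, sizeof ufunc_table, PROT_READ) == 0)
        ufunc_table_sealed = 1;
}
```

A call site would then call through its own slot, e.g. ufunc_table.slot[0]("abc"); any later write to the table faults instead of silently redirecting calls.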

It would be a bit ugly, as I need to use a macro like
 (__LINE__ == 324) ? 0 :
 (__LINE__ == 333) ? 1 : 
to find the index and verify that the table is correct.
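
One way such a line-to-index macro could look, as a sketch only. The names UFUNC_INDEX, UFUNC_HERE and UFUNC_SITE_INDEX are made up for illustration, and the line list is assumed to be generated by the profiling build. The #line directives exist purely to pin the line numbers in this example; real code would rely on the regenerated list instead.

```c
#include <assert.h>

/* Hypothetical generated map: call-site line number -> table index.
   Unknown lines map to -1. */
#define UFUNC_INDEX(line)   \
    ((line) == 324 ? 0 :    \
     (line) == 333 ? 1 :    \
     -1)

/* Index for the line on which the macro is expanded. */
#define UFUNC_HERE() UFUNC_INDEX(__LINE__)

/* Same index, but compilation fails (negative array size) when the
   current line is missing from the map. */
#define UFUNC_SITE_INDEX() \
    ((int)(sizeof(char[UFUNC_HERE() >= 0 ? 1 : -1]) * 0) + UFUNC_HERE())

/* Pin the line numbers so the example is deterministic. */
#line 324
static int site_a(void) { return UFUNC_SITE_INDEX(); }
#line 333
static int site_b(void) { return UFUNC_SITE_INDEX(); }
```

Because the whole expression is an integer constant, the compiler folds it away; the cost is that the map goes stale whenever a call site moves to another line.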
  

 
> The problem with debugging, imho, is that now you will have to teach debuggers and
> profilers that some string functions for some specific arch do not follow any
> ELF-defined way.
>
I still don't see it. Unless you start using assembly tricks, gdb handles
function pointers just fine. With assembly you need to stepi to get into
these. And how is it different from debugging with FORCE_INLINES, where
these are replaced by inefficient inline assembly?

Profiling is a relatively easy but technical problem. You only need to
check something like

  if (function != dlsym_real("function"))
    f->fn = function_wrapper;

where function_wrapper will properly call function. dlsym alone wouldn't
be enough due to plt indirection. We would need to provide a variant that
returns the actual pointer. This could be done.

