This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[RFC] ifunc suck, use ufunc.


Hi, the header cleanup reminded me of a project in my backlog.
In short, the ifunc approach is inadequate for optimization.
There are numerous problems; the main one is lack of granularity.
A different variant could be optimal in different applications, and even
different call sites in the same application could differ.

Then there is the problem that selection happens too early. The optimal
variant can only be determined by runtime profiling, and with ifunc you
cannot change the selection afterwards.

A simple example of the problem: if, say, strncpy is called once per program
lifetime, you waste a lot of cycles fetching the implementation into cache.

A better approach would be to start with a small implementation and switch
to a bigger one once it has been used at least, say, 20 times.
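
To make the switching idea concrete, here is a minimal sketch of one way it
could look. The counter field, the names memset_small, memset_big and
site_is_hot_after, and the UFUNC_WARMUP_CALLS macro are hypothetical and only
for illustration; the "big" variant simply forwards to the libc memset as a
stand-in for an optimized implementation, and the counter update is not
thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UFUNC_WARMUP_CALLS 20   /* threshold taken from the text above */

struct memset_ufunc
{
  void *(*fn) (void *, int, size_t, struct memset_ufunc *);
  unsigned calls;               /* hypothetical per-call-site counter */
};

/* Large variant: stands in for an optimized implementation.  */
static void *
memset_big (void *s, int c, size_t n, struct memset_ufunc *u)
{
  (void) u;
  return memset (s, c, n);
}

/* Small variant: a byte loop with a tiny code footprint.  It counts
   calls and installs the large variant once the call site is hot.  */
static void *
memset_small (void *s, int c, size_t n, struct memset_ufunc *u)
{
  unsigned char *p = s;
  for (size_t i = 0; i < n; i++)
    p[i] = (unsigned char) c;
  if (++u->calls >= UFUNC_WARMUP_CALLS)
    u->fn = memset_big;
  return s;
}

/* Simulate one call site being used NCALLS times.  Returns 1 if the
   site has switched to the large variant afterwards, 0 otherwise.  */
int
site_is_hot_after (unsigned ncalls)
{
  struct memset_ufunc site = { .fn = memset_small, .calls = 0 };
  char buf[16];
  for (unsigned i = 0; i < ncalls; i++)
    {
      site.fn (buf, 'x', sizeof buf, &site);
      assert (buf[0] == 'x' && buf[15] == 'x');
    }
  return site.fn == memset_big;
}
```

Since the counter lives in the per-call-site structure, a site that runs
once never pulls the large variant into cache, while a hot loop pays for the
small variant only during its first 20 calls.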

So if we accept that ifunc is a problem, not a solution, the solution is
surprisingly simple:

Do the resolution in userspace. It would also be faster, as it saves
cycles on a useless plt indirection.

This is my plan for string functions. The essence is to use the pattern
shown at the end of this mail, with actual resolution instead of
dlsym(RTLD_DEFAULT,"memset").

A main benefit would be interlibrary constant folding. Why waste cycles
reinitializing a constant when you can just save it in the ufunc structure?
The resolver could then precompute tables to improve speed.
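
As an illustration of the precomputed-table idea, here is a rough sketch for
a strspn-like function. It assumes the accept string is a constant at the
call site, so the resolver can build a 256-bit membership bitmap once and
store it in the 32-byte data area. The names strspn_ufunc, strspn_table,
strspn_resolver and count_leading_digits are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct strspn_ufunc
{
  size_t (*fn) (const char *, const char *, struct strspn_ufunc *);
  unsigned char data[32];       /* 256-bit membership bitmap */
};

/* Fast path: ignores ACCEPT and consults the bitmap stored in U.
   The NUL byte is never in the bitmap, so the loop stops at the end
   of S without a separate check.  */
static size_t
strspn_table (const char *s, const char *accept, struct strspn_ufunc *u)
{
  (void) accept;
  size_t i = 0;
  for (;;)
    {
      unsigned char ch = (unsigned char) s[i];
      if (!(u->data[ch >> 3] & (1u << (ch & 7))))
        break;
      i++;
    }
  return i;
}

/* First call from a site: build the bitmap once, then install the
   fast path.  Assumes ACCEPT never changes for this call site.  */
static size_t
strspn_resolver (const char *s, const char *accept, struct strspn_ufunc *u)
{
  assert (accept != NULL);
  memset (u->data, 0, sizeof u->data);
  for (const unsigned char *a = (const unsigned char *) accept;
       *a != '\0'; a++)
    u->data[*a >> 3] |= (unsigned char) (1u << (*a & 7));
  u->fn = strspn_table;
  return u->fn (s, accept, u);
}

/* Example call site with a constant accept set.  */
size_t
count_leading_digits (const char *s)
{
  static struct strspn_ufunc site = { .fn = strspn_resolver };
  return site.fn (s, "0123456789", &site);
}
```

A plain strspn has to rebuild or rescan the accept set on every call; here
that work happens once per call site and later calls only pay for the
table lookups.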

As for interposing these, you would need to interpose the resolver.

gcc support is not needed, but we could gain something from an alternate
calling convention, as passing the resolver struct is common and it could be
preserved across loops with tail calls.

A future direction could be to replace the plt and linker with ufunc. It
would require adding a function name string pointer to the structure and
first calling a generic resolver that selects the specific resolver.
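
A rough sketch of how such a two-stage lookup might be arranged follows.
Everything here is hypothetical: the struct ufunc layout, the resolvers
table, ufunc_generic_resolve and my_strlen. To stay within standard C it
uses an explicit NULL check at the call site instead of having the generic
resolver be the initial call target, which would need the register-passing
assumption from the example at the end of this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn) (void);

struct ufunc
{
  generic_fn fn;        /* current target; cast to the real type on use */
  const char *name;     /* function name used by the generic resolver */
};

typedef size_t (*strlen_ufn) (const char *, struct ufunc *);

/* Stand-in for a selected strlen variant.  */
static size_t
strlen_bytewise (const char *s, struct ufunc *u)
{
  (void) u;
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  return n;
}

/* Specific resolver: picks a variant and installs it in the site.  */
static size_t
strlen_resolver (const char *s, struct ufunc *u)
{
  u->fn = (generic_fn) strlen_bytewise;
  return strlen_bytewise (s, u);
}

/* Name -> specific resolver table consulted by the generic resolver.  */
static const struct
{
  const char *name;
  generic_fn resolver;
} resolvers[] = {
  { "strlen", (generic_fn) strlen_resolver },
};

/* Generic resolver: looks the name up and installs the specific
   resolver.  Returns NULL if the name is unknown.  */
generic_fn
ufunc_generic_resolve (struct ufunc *u)
{
  for (size_t i = 0; i < sizeof resolvers / sizeof resolvers[0]; i++)
    if (strcmp (resolvers[i].name, u->name) == 0)
      {
        u->fn = resolvers[i].resolver;
        return u->fn;
      }
  return NULL;
}

/* Example call site.  */
size_t
my_strlen (const char *s)
{
  static struct ufunc site = { .fn = NULL, .name = "strlen" };
  if (site.fn == NULL)
    ufunc_generic_resolve (&site);
  assert (site.fn != NULL);
  return ((strlen_ufn) site.fn) (s, &site);
}
```

The first call goes generic resolver, then specific resolver, then the
chosen variant; later calls from the same site go straight to the variant.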

Comments?

A simplified example is here (for an arch that passes four arguments in registers):

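#define _GNU_SOURCE             /* for RTLD_DEFAULT */
#include <dlfcn.h>
#include <stddef.h>

struct memset_ufunc
{
  void *(*fn) (void *, int, size_t, struct memset_ufunc *);
  /* Scratch space the resolver may fill with precomputed data.  */
  char __attribute__ ((aligned (32))) data[32];
};

void *memset_resolver (void *, int, size_t, struct memset_ufunc *);

/* Each call site gets its own static ufunc, initially pointing at the
   resolver.  */
# define memset(s, c, n) \
   (__extension__({ \
    static struct memset_ufunc __resolve = {.fn = memset_resolver}; \
    __resolve.fn (s, c, n, &__resolve); \
    }))

void *
memset_resolver (void *s, int c, size_t n, struct memset_ufunc *resolve)
{
  /* Placeholder for the actual resolution.  The libc memset takes three
     arguments; the fourth is passed in a register and ignored.  */
  resolve->fn = (void *(*) (void *, int, size_t, struct memset_ufunc *))
    dlsym (RTLD_DEFAULT, "memset");
  return resolve->fn (s, c, n, resolve);
}

int
main (void)
{
  char u[32];
  memset (u, 1, 32);
  return 0;
}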

