This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 1/N] x86_64 vectorization support: vectorized math functions addition to Glibc


On Thu, Sep 11, 2014 at 10:33:41PM -0700, Andi Kleen wrote:
> Rich Felker <dalias@libc.org> writes:
> >
> > This really seems like something the compiler should be doing --
> > translating parallelizable calls to the standard math functions into
> > calls to special simd versions (
> 
> Of course gcc already supports that. Even in two different flavours.
> 
> Not sure why the patch doesn't implement one of those ABIs though.
> 
>      -mveclibabi=type
>            Specifies the ABI type to use for vectorizing intrinsics
>            using an external library.
>            Supported values for type are svml for the Intel short vector
>            math library and acml for
>            the AMD math core library.  To use this option, both
>            -ftree-vectorize and
>            -funsafe-math-optimizations have to be enabled, and an SVML
>            or ACML ABI-compatible
>            library must be specified at link time.
> 
>            GCC currently emits calls to "vmldExp2", "vmldLn2",

That is a problem when one wants to support users who have SVML, users
who have ACML, and users who have neither: package maintainers do not
want to build three versions of the same package.

What about detecting at runtime which library is present? With ifunc one
could use logic like

void *svml, *acml;
void *vec_exp;   /* stays NULL when neither library is found */

void *
function_ifunc (void)
{
  if (!(svml = dlopen ("svml.so", RTLD_LAZY)))
    {
      if (!(acml = dlopen ("acml.so", RTLD_LAZY)))
        return function;
      vec_exp = dlsym (acml, "__vrd2_exp");
      return function;
    }
  else
    {
      vec_exp = dlsym (svml, "vmldExp2");
      return function;
    }
}

Then the vectorized loop could look like

if (size < 4 || !vec_exp)
  goto simple_loop;
else
  goto vector_loop;
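
To make the lookup half of this concrete, here is a minimal sketch, not
a tested proposal. The helper name lookup_first and the idea of passing
parallel arrays of library and symbol names are my own assumptions for
illustration; only dlopen, dlsym and dlclose are real interfaces. It
returns the first symbol it can resolve, or NULL so that the caller can
take the simple loop.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: try libs[i] / syms[i] pairs in order and return
   the address of the first symbol that resolves, or NULL if none does.  */
static void *
lookup_first (const char *const libs[], const char *const syms[], size_t n)
{
  assert (libs != NULL && syms != NULL);
  for (size_t i = 0; i < n; i++)
    {
      void *handle = dlopen (libs[i], RTLD_LAZY | RTLD_LOCAL);
      if (handle == NULL)
        continue;               /* this library is not installed */
      void *sym = dlsym (handle, syms[i]);
      if (sym != NULL)
        return sym;             /* handle stays open so sym stays valid */
      dlclose (handle);         /* library present but symbol missing */
    }
  return NULL;                  /* caller falls back to the simple loop */
}
```

The caller would store the result in vec_exp once and test it before
entering the vector loop. On older glibc this needs -ldl at link time.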

That would also preserve compatibility and allow adding AVX versions,
selected only when the processor supports them.
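
For the processor check, a sketch along these lines might do, assuming
GCC (or a compatible compiler) on x86. The enum vec_level and both
function names are made up for illustration; __builtin_cpu_init and
__builtin_cpu_supports are the GCC built-ins. On other targets the
sketch simply reports no vector support.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of what the running CPU can use.  */
enum vec_level { VEC_NONE, VEC_SSE2, VEC_AVX };

static enum vec_level
detect_vec_level (void)
{
#if (defined (__x86_64__) || defined (__i386__)) && defined (__GNUC__)
  /* Required when this runs before constructors, e.g. from an ifunc
     resolver; harmless otherwise.  */
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("avx"))
    return VEC_AVX;
  if (__builtin_cpu_supports ("sse2"))
    return VEC_SSE2;
#endif
  return VEC_NONE;
}

/* Number of doubles handled per call at each level.  */
static size_t
vec_width_doubles (enum vec_level level)
{
  assert (level <= VEC_AVX);
  switch (level)
    {
    case VEC_AVX:  return 4;    /* 256-bit registers */
    case VEC_SSE2: return 2;    /* 128-bit registers */
    default:       return 1;    /* scalar fallback */
    }
}
```

The resolver could then pick the widest variant whose symbol was also
found at runtime, and keep the scalar path for everything else.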

