This is the mail archive of the mailing list for the glibc project.

Subject: single-precision "expf" super slow on x86-64??

Hi, I've run into the following very odd behavior:

On my Debian x86-64 system, the single-precision "expf" function seems
to be about six times _slower_ than the double-precision "exp".

Moreover, "expf" seems to be far slower than other "slow" math
functions (sin, cos, etc) while "exp" seems to be roughly the same
speed as them.

I first noticed this in a real program, which seemed oddly slow, and
got it to speed up significantly by using exp instead of expf (even
though the values being manipulated are all single-precision, so I
don't need exp).

I've duplicated this in the attached test program; here's the output:

   $ gcc-4.7 -o exp-test -O2 -march=native exp-test.c -lm

   $ ./exp-test
   fisum = 4.99891e+06, disum = 4.99891e+06
   fosum = 1.71808e+07, dosum = 1.71808e+07
   float (expf) user time: 2.94819 sec
   double (exp) user time: 0.520032 sec

The first two lines of output just verify that the inputs and results
are the same for the float and double calculations.  The last two lines
show the user CPU times (via getrusage), showing that "expf" takes about
six times as long as "exp"...
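
For reference, the arithmetic behind those two timing lines can be sketched as a small helper that turns two getrusage snapshots into elapsed user seconds. The name user_seconds is hypothetical and does not appear in the attached program; this is only an illustration of the calculation.

```c
#include <sys/resource.h>

/* Hypothetical helper (not part of the attached program): user CPU
   seconds elapsed between two getrusage() snapshots.  The microsecond
   difference may be negative; adding it to the whole-second difference
   still gives the right total.  */
static double
user_seconds (const struct rusage *start, const struct rusage *end)
{
  return (double) (end->ru_utime.tv_sec - start->ru_utime.tv_sec)
         + (end->ru_utime.tv_usec - start->ru_utime.tv_usec) / 1.e6;
}
```

Called as user_seconds (&start_float_ru, &end_float_ru), it would give the figure printed on the "float (expf)" line.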

The glibc version is 2.13, the compiler is gcc 4.7 (compiling for
x86-64), and the CPU is an AMD Phenom.

Anybody have any idea what's going on?  This behavior seems very odd.

[As a workaround, I could modify my program to just use "exp", even on
single-precision values, but this seems a fragile hack, and detecting
this odd situation with autoconf seems ... annoying...]



#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <sys/resource.h>

#define NITERS 10000000

float finps[NITERS], foutps[NITERS];
double dinps[NITERS], doutps[NITERS];

int main (void)
{
  struct rusage start_float_ru, end_float_ru;
  struct rusage start_double_ru, end_double_ru;
  int i;

  /* Use the same random inputs for both precisions.  */
  for (i = 0; i < NITERS; i++)
    dinps[i] = drand48() + 1e-6;
  for (i = 0; i < NITERS; i++)
    finps[i] = (float) (dinps[i]);

  /* Time the single-precision loop.  */
  getrusage (RUSAGE_SELF, &start_float_ru);
  for (i = 0; i < NITERS; i++)
    foutps[i] = expf (finps[i]);
  getrusage (RUSAGE_SELF, &end_float_ru);

  /* Time the double-precision loop.  */
  getrusage (RUSAGE_SELF, &start_double_ru);
  for (i = 0; i < NITERS; i++)
    doutps[i] = exp (dinps[i]);
  getrusage (RUSAGE_SELF, &end_double_ru);

  /* Sum inputs and outputs so the results can be compared, and so the
     compiler cannot discard the loops above.  */
  double fisum = 0;
  for (i = 0; i < NITERS; i++)
    fisum += finps[i];

  double disum = 0;
  for (i = 0; i < NITERS; i++)
    disum += dinps[i];

  double fosum = 0;
  for (i = 0; i < NITERS; i++)
    fosum += foutps[i];

  double dosum = 0;
  for (i = 0; i < NITERS; i++)
    dosum += doutps[i];

  printf ("fisum = %g, disum = %g\n", fisum, disum);
  printf ("fosum = %g, dosum = %g\n", fosum, dosum);

  /* Elapsed user time = whole seconds + microseconds / 1e6.  */
  printf ("float (expf) user time: %g sec\n",
          ((end_float_ru.ru_utime.tv_sec
            - start_float_ru.ru_utime.tv_sec)
           + (end_float_ru.ru_utime.tv_usec
              - start_float_ru.ru_utime.tv_usec) / 1.e6));
  printf ("double (exp) user time: %g sec\n",
          ((end_double_ru.ru_utime.tv_sec
            - start_double_ru.ru_utime.tv_sec)
           + (end_double_ru.ru_utime.tv_usec
              - start_double_ru.ru_utime.tv_usec) / 1.e6));

  return 0;
}

Corporation, n. An ingenious device for obtaining individual profit without
individual responsibility.
