This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Optimized with SSE2 sinf and cosf for x86_32


This patch proposes hand-optimized, high-performance SSE2 versions of
sinf and cosf with excellent precision.

Performance on the main path, [-10000; 10000], is more than 26X better.

Speedups on other important intervals are listed here as ratios of
cycle counts, measured on random inputs from each interval.

   (random)             Ist. Bulld.   Atom   Neh.    AVX
   cosf    |x|<0.78     1.9    2.72   1.65   1.89   1.79  times
   cosf    |x|<1.57     1.55   1.84   1.75   1.70   1.55  times
   cosf    |x|<2.35     1.64   2.08   1.78   1.75   1.66  times
   cosf    |x|<3.14     1.97   2.86   1.97   1.87   2.12  times
   cosf    |x|<3.92     2.15   3.50   2.08   2.01   2.33  times
   cosf    |x|<4.71     2.29   3.89   2.15   2.07   2.43  times
   cosf    |x|<5.49     2.39   4.70   2.21   2.06   2.52  times
   cosf    |x|<6.28     2.47   4.62   2.25   2.14   2.58  times
   cosf    |x|<7.06     2.54   4.63   2.28   2.16   2.64  times
   cosf    |x|<7.85     2.43   4.48   2.27   2.10   2.63  times
   cosf    |x|<8.63     2.30   4.47   2.23   2.04   2.56  times
   cosf    |x|<9.42     2.21   4.18   2.20   1.99   2.51  times
   cosf    |x|<100      2.53   5.43   2.28   2.34   2.01  times
   cosf    |x|<1000    19.82  20.50  19.88  17.96  18.37  times
   cosf    |x|<10000   25.98  29.78  24.95  23.63  23.52  times
   cosf    |x|<1e10    18.92  28.74  20.97  16.16  18.78  times


   sinf    |x|<0.78     1.39   1.75   1.31   1.30   1.28  times
   sinf    |x|<1.57     1.47   1.78   1.65   1.62   1.67  times
   sinf    |x|<2.35     1.64   2.10   1.77   1.79   1.77  times
   sinf    |x|<3.14     1.94   2.85   1.95   1.88   2.09  times
   sinf    |x|<3.92     2.12   3.38   2.04   1.91   2.30  times
   sinf    |x|<4.71     2.31   3.95   2.14   1.96   2.42  times
   sinf    |x|<5.49     2.66   4.57   2.21   2.15   2.51  times
   sinf    |x|<6.28     2.53   4.67   2.24   2.17   2.56  times
   sinf    |x|<7.06     2.52   4.54   2.23   2.11   2.62  times
   sinf    |x|<7.85     2.43   4.54   2.24   2.08   2.59  times
   sinf    |x|<8.63     2.33   4.62   2.21   2.09   2.53  times
   sinf    |x|<9.42     2.27   4.28   2.17   1.96   2.51  times
   sinf    |x|<100      2.52   5.32   2.26   2.34   2.01  times
   sinf    |x|<1000    20.12  20.42  19.89  18.24  18.48  times
   sinf    |x|<10000   26.26  26.73  25.00  23.11  23.79  times
   sinf    |x|<1e10    18.76  28.73  20.90  16.09  18.49  times



The new sinf/cosf pass our proprietary test system, which tests many
intervals with different steps and checks special values (from ISO C)
and corner cases. GLIBC's "make check" passes as well.

Our test system observed errors of more than 1e4 ulp for |x|>1e4 with
the current GLIBC. The new asm versions provided here have a maximum
error of 0.500121 ulp for sinf and 0.500573 ulp for cosf.


ChangeLog:

2012-06-22  Liubov Dmitrieva  <liubov.dmitrieva@gmail.com>

	* sysdeps/i386/i686/fpu/multiarch/Makefile: Update.
	(sysdep_routines): Add s_sinf-sse2, s_cosf-sse2.

	* sysdeps/i386/i686/fpu/multiarch/s_sinf-sse2.S: New file.
	* sysdeps/i386/i686/fpu/multiarch/s_cosf-sse2.S: New file.
	* sysdeps/i386/i686/fpu/multiarch/s_sinf.c: New file.
	* sysdeps/i386/i686/fpu/multiarch/s_cosf.c: New file.
	* sysdeps/ieee754/flt-32/s_sinf.c: Update.
	(SINF): Add macro for using routine as __sinf_ia32.
	* sysdeps/ieee754/flt-32/s_cosf.c: Update.
	(COSF): Add macro for using routine as __cosf_ia32.

	* sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S: Fix copyright.
	* sysdeps/i386/i686/fpu/multiarch/e_expf.c: Fix copyright.


--
Liubov Dmitrieva

Software Engineer
Intel Corporation

Attachment: sinf_cosf_x86_32.patch
Description: Binary data

