This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf


This patchset adds optimized ASIMD versions of sinf/cosf
for AArch64. The algorithm and code flow are based on the SSE versions
of the same functions in sysdeps/x86_64/fpu.

The ASIMD versions are used only if the CPU supports the ASIMD feature.
The patchset uses ifuncs and HWCAP to detect the ASIMD capability.
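
As a rough illustration of the HWCAP-based selection described above, the
following is a minimal sketch and not the code from this patchset. It uses a
lazily initialised function pointer instead of a real ifunc resolver, and the
names my_sinf, select_sinf, sinf_generic and sinf_asimd are hypothetical.
Both stand-in implementations are the same low-order polynomial so that the
example stays self-contained; in glibc they would be the existing C sinf and
the new assembly routine.

```c
#include <stddef.h>

#if defined(__aarch64__) && defined(__linux__)
# include <sys/auxv.h>            /* getauxval, AT_HWCAP */
# ifndef HWCAP_ASIMD
#  define HWCAP_ASIMD (1UL << 1)  /* bit value from the Linux arm64 hwcap header */
# endif
#endif

typedef float (*sinf_fn) (float);

/* Hypothetical stand-in for the existing generic implementation.
   A 5th-order Taylor polynomial, adequate only for small |x|.  */
float sinf_generic (float x)
{
  float x2 = x * x;
  return x * (1.0f - x2 / 6.0f + x2 * x2 / 120.0f);
}

/* Hypothetical stand-in for the ASIMD implementation; same maths here.  */
float sinf_asimd (float x)
{
  return sinf_generic (x);
}

/* Returns nonzero if the kernel reports ASIMD support via AT_HWCAP.
   On anything other than AArch64 Linux this sketch reports 0.  */
int cpu_has_asimd (void)
{
#if defined(__aarch64__) && defined(__linux__)
  return (getauxval (AT_HWCAP) & HWCAP_ASIMD) != 0;
#else
  return 0;
#endif
}

/* Plays the role of an ifunc resolver: pick one implementation.  */
sinf_fn select_sinf (void)
{
  return cpu_has_asimd () ? sinf_asimd : sinf_generic;
}

/* Public entry point: resolve once on first use, then dispatch.
   The lazy initialisation is not synchronised; concurrent first calls
   would all store the same pointer value.  */
float my_sinf (float x)
{
  static sinf_fn impl;
  if (impl == NULL)
    impl = select_sinf ();
  return impl (x);
}
```

With a real ifunc the selection happens once at symbol resolution time in the
dynamic linker, so callers pay no per-call check; the function pointer above
only approximates that behaviour.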

The patchset was tested by running "make check" for the math sub-directory
on a ThunderX88 platform running Linux 4.4.0-45-generic.

The tables below show the approximate speedups observed over the
existing implementation on different AArch64 platforms for
different input values.

  SINF
  ---------------------------------------------------------
  Input           ThunderX88      ThunderX99      CortexA57
  ---------------------------------------------------------
  0.0              1.88x           1.18x           1.17x
  2.0^-28          1.33x           1.12x           1.03x
  2.0^-6           1.48x           1.28x           1.27x
  0.6*Pi/4         0.94x           1.14x           1.21x
  13*Pi/8          1.41x           2.00x           2.16x
  17*Pi/8          1.45x           1.93x           2.23x
  1000*Pi/4       19.68x          37.46x          27.99x
  2.0^51          12.00x          13.58x          13.49x
  Inf              1.04x           1.05x           1.12x
  NaN              0.95x           0.87x           0.82x
  ---------------------------------------------------------

  COSF
  ---------------------------------------------------------
  Input           ThunderX88      ThunderX99      CortexA57
  ---------------------------------------------------------
  0.0              1.25x           1.14x           1.17x
  2.0^-28          1.24x           1.14x           1.13x
  2.0^-6           1.38x           1.38x           1.85x
  0.6*Pi/4         1.15x           1.38x           1.69x
  13*Pi/8          1.65x           1.94x           2.18x
  17*Pi/8          1.49x           2.05x           2.09x
  1000*Pi/4       18.98x          38.39x          27.52x
  2.0^51          11.35x          13.74x          13.47x
  Inf              0.99x           1.02x           1.16x
  NaN              0.88x           0.86x           0.87x
  ---------------------------------------------------------
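
For readers who want to reproduce per-input figures like those in the tables
above, the following is a minimal timing sketch. It is not the harness used
to produce these numbers. It assumes a speedup is the time per call of the
existing implementation divided by that of the new one, so that values below
1.00x mean a slowdown. The names time_per_call, speedup and identity_fn are
hypothetical.

```c
#include <stddef.h>
#include <time.h>

typedef float (*unary_fn) (float);

/* Hypothetical placeholder; pass the old and new sinf/cosf instead.  */
float identity_fn (float x)
{
  return x;
}

/* Average CPU time in seconds for one call of fn (x), over iters calls.
   The volatile input and sink stop the compiler from hoisting or
   deleting the call inside the loop.  */
double time_per_call (unary_fn fn, float x, long iters)
{
  volatile float in = x;
  volatile float sink = 0.0f;
  clock_t start = clock ();
  for (long i = 0; i < iters; i++)
    sink = fn (in);
  clock_t end = clock ();
  (void) sink;
  return (double) (end - start) / CLOCKS_PER_SEC / (double) iters;
}

/* Assumed definition: old time over new time.  */
double speedup (double t_old, double t_new)
{
  return t_old / t_new;
}
```

A run would call time_per_call once per implementation for each input in the
table (for example 0.0f or 0x1p-28f) with a large iteration count, then pass
the two results to speedup. Note that clock () measures CPU time at a coarse
resolution, so small iteration counts can yield a measured time of zero.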

Ashwin Sekhar T K (2):
  aarch64: Add optimized ASIMD version of sinf
  aarch64: Add optimized ASIMD version of cosf

 sysdeps/aarch64/fpu/multiarch/Makefile       |   3 +
 sysdeps/aarch64/fpu/multiarch/s_cosf-asimd.S | 367 +++++++++++++++++++++++++
 sysdeps/aarch64/fpu/multiarch/s_cosf.c       |  31 +++
 sysdeps/aarch64/fpu/multiarch/s_sinf-asimd.S | 382 +++++++++++++++++++++++++++
 sysdeps/aarch64/fpu/multiarch/s_sinf.c       |  31 +++
 5 files changed, 814 insertions(+)
 create mode 100644 sysdeps/aarch64/fpu/multiarch/Makefile
 create mode 100644 sysdeps/aarch64/fpu/multiarch/s_cosf-asimd.S
 create mode 100644 sysdeps/aarch64/fpu/multiarch/s_cosf.c
 create mode 100644 sysdeps/aarch64/fpu/multiarch/s_sinf-asimd.S
 create mode 100644 sysdeps/aarch64/fpu/multiarch/s_sinf.c

-- 
2.12.2

