This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
- From: "Sekhar, Ashwin" <Ashwin dot Sekhar at cavium dot com>
- To: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, "siddhesh at gotplt dot org" <siddhesh at gotplt dot org>
- Date: Tue, 13 Jun 2017 08:39:22 +0000
- Subject: Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
- References: <20170613071707.43396-1-ashwin.sekhar@caviumnetworks.com> <dc4030b1-54ac-3e80-32ed-5a11153d37be@gotplt.org>
On Tue, 2017-06-13 at 14:00 +0530, Siddhesh Poyarekar wrote:
> On Tuesday 13 June 2017 12:47 PM, Ashwin Sekhar T K wrote:
> >
> > The following are the approximate speedups observed over the
> > existing implementation on different AArch64 platforms for
> > different input values.
> >
> > SINF
> > ---------------------------------------------------------
> > Input         ThunderX88    ThunderX99    CortexA57
> > ---------------------------------------------------------
> > 0.0           1.88x         1.18x         1.17x
> > 2.0^-28       1.33x         1.12x         1.03x
> > 2.0^-6        1.48x         1.28x         1.27x
> > 0.6*Pi/4      0.94x         1.14x         1.21x
> > 13*Pi/8       1.41x         2.00x         2.16x
> > 17*Pi/8       1.45x         1.93x         2.23x
> > 1000*Pi/4     19.68x        37.46x        27.99x
> > 2.0^51        12.00x        13.58x        13.49x
> > Inf           1.04x         1.05x         1.12x
> > Nan           0.95x         0.87x         0.82x
> > ---------------------------------------------------------
> >
> > COSF
> > ---------------------------------------------------------
> > Input         ThunderX88    ThunderX99    CortexA57
> > ---------------------------------------------------------
> > 0.0           1.25x         1.14x         1.17x
> > 2.0^-28       1.24x         1.14x         1.13x
> > 2.0^-6        1.38x         1.38x         1.85x
> > 0.6*Pi/4      1.15x         1.38x         1.69x
> > 13*Pi/8       1.65x         1.94x         2.18x
> > 17*Pi/8       1.49x         2.05x         2.09x
> > 1000*Pi/4     18.98x        38.39x        27.52x
> > 2.0^51        11.35x        13.74x        13.47x
> > Inf           0.99x         1.02x         1.16x
> > Nan           0.88x         0.86x         0.87x
> > ---------------------------------------------------------
> Thank you for doing this. Please:
>
> 1. Explain why you chose these input values as optimization targets, and
The algorithm splits the input range into intervals and uses a
different code path for each interval. The input values I chose
cover all of these code paths.
> 2. Write a microbenchmark test for glibc
Sure. I will share this via GitHub.
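As a rough illustration of what such a microbenchmark could look like,
here is a minimal standalone sketch. It is not the benchmark used for
the numbers quoted above, and it does not follow the format of glibc's
in-tree benchtests. The names bench_input, inputs, baseline,
time_per_call_ns and report are made up for this example; only the
input values are taken from the tables in this thread.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

#define PI_F 3.14159265358979323846f

/* One representative input per row of the speedup tables.  */
struct bench_input
{
  const char *name;
  float value;
};

static const struct bench_input inputs[] = {
  { "0.0",       0.0f },
  { "2.0^-28",   0x1p-28f },
  { "2.0^-6",    0x1p-6f },
  { "0.6*Pi/4",  0.6f * PI_F / 4.0f },
  { "13*Pi/8",   13.0f * PI_F / 8.0f },
  { "17*Pi/8",   17.0f * PI_F / 8.0f },
  { "1000*Pi/4", 1000.0f * PI_F / 4.0f },
  { "2.0^51",    0x1p51f },
  { "Inf",       INFINITY },
  { "Nan",       NAN },
};
#define N_INPUTS (sizeof inputs / sizeof inputs[0])

/* No-op with the same signature as sinf/cosf; timing it gives an
   estimate of the loop and call overhead.  */
static float
baseline (float x)
{
  return x;
}

/* Average wall-clock nanoseconds per call of fn (x) over iters calls.  */
static double
time_per_call_ns (float (*fn) (float), float x, long iters)
{
  struct timespec start, end;
  volatile float vx = x;      /* stops the compiler folding the call */
  volatile float sink = 0.0f; /* keeps the result live */

  assert (iters > 0);
  clock_gettime (CLOCK_MONOTONIC, &start);
  for (long i = 0; i < iters; i++)
    sink = fn (vx);
  clock_gettime (CLOCK_MONOTONIC, &end);
  (void) sink;

  double ns = (double) (end.tv_sec - start.tv_sec) * 1e9
	      + (double) (end.tv_nsec - start.tv_nsec);
  return ns / (double) iters;
}

/* Print one timing line per input for the function fn.  */
static void
report (const char *label, float (*fn) (float), long iters)
{
  for (size_t i = 0; i < N_INPUTS; i++)
    printf ("%-5s %-10s %10.2f ns/call\n", label, inputs[i].name,
	    time_per_call_ns (fn, inputs[i].value, iters));
}
```

To use it, call report ("base", baseline, iters), report ("sinf", sinf,
iters) and report ("cosf", cosf, iters) from main, link with -lm, and
run the same binary against the old and the new libm. The ratio of the
two ns/call figures for an input gives a speedup of the kind listed in
the tables.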
Thanks
Ashwin
>
> Thanks,
> Siddhesh
>
> >
> >
> > Ashwin Sekhar T K (2):
> > aarch64: Add optimized ASIMD version of sinf
> > aarch64: Add optimized ASIMD version of cosf
> >
> > sysdeps/aarch64/fpu/multiarch/Makefile       |   3 +
> > sysdeps/aarch64/fpu/multiarch/s_cosf-asimd.S | 367 +++++++++++++++++++++++++
> > sysdeps/aarch64/fpu/multiarch/s_cosf.c       |  31 +++
> > sysdeps/aarch64/fpu/multiarch/s_sinf-asimd.S | 382 +++++++++++++++++++++++++++
> > sysdeps/aarch64/fpu/multiarch/s_sinf.c       |  31 +++
> > 5 files changed, 814 insertions(+)
> > create mode 100644 sysdeps/aarch64/fpu/multiarch/Makefile
> > create mode 100644 sysdeps/aarch64/fpu/multiarch/s_cosf-asimd.S
> > create mode 100644 sysdeps/aarch64/fpu/multiarch/s_cosf.c
> > create mode 100644 sysdeps/aarch64/fpu/multiarch/s_sinf-asimd.S
> > create mode 100644 sysdeps/aarch64/fpu/multiarch/s_sinf.c
> >