This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf

From: "Sekhar, Ashwin" <Ashwin dot Sekhar at cavium dot com>
To: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, "siddhesh at gotplt dot org" <siddhesh at gotplt dot org>
Date: Tue, 13 Jun 2017 08:39:22 +0000
Subject: Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
References: <20170613071707.43396-1-ashwin.sekhar@caviumnetworks.com> <dc4030b1-54ac-3e80-32ed-5a11153d37be@gotplt.org>

On Tue, 2017-06-13 at 14:00 +0530, Siddhesh Poyarekar wrote:
> On Tuesday 13 June 2017 12:47 PM, Ashwin Sekhar T K wrote:
> > 
> > The following are the approximate speedups observed over the
> > existing implementation on different AArch64 platforms for
> > different input values.
> > 
> >   SINF
> >   ---------------------------------------------------------
> >   Input           ThunderX88      ThunderX99      CortexA57
> >   ---------------------------------------------------------
> >   0.0              1.88x           1.18x           1.17x
> >   2.0^-28          1.33x           1.12x           1.03x
> >   2.0^-6           1.48x           1.28x           1.27x
> >   0.6*Pi/4         0.94x           1.14x           1.21x
> >   13*Pi/8          1.41x           2.00x           2.16x
> >   17*Pi/8          1.45x           1.93x           2.23x
> >   1000*Pi/4       19.68x          37.46x          27.99x
> >   2.0^51          12.00x          13.58x          13.49x
> >   Inf              1.04x           1.05x           1.12x
> >   Nan              0.95x           0.87x           0.82x
> >   ---------------------------------------------------------
> > 
> >   COSF
> >   ---------------------------------------------------------
> >   Input           ThunderX88      ThunderX99      CortexA57
> >   ---------------------------------------------------------
> >   0.0              1.25x           1.14x           1.17x
> >   2.0^-28          1.24x           1.14x           1.13x
> >   2.0^-6           1.38x           1.38x           1.85x
> >   0.6*Pi/4         1.15x           1.38x           1.69x
> >   13*Pi/8          1.65x           1.94x           2.18x
> >   17*Pi/8          1.49x           2.05x           2.09x
> >   1000*Pi/4       18.98x          38.39x          27.52x
> >   2.0^51          11.35x          13.74x          13.47x
> >   Inf              0.99x           1.02x           1.16x
> >   Nan              0.88x           0.86x           0.87x
> >   ---------------------------------------------------------
> Thank you for doing this.  Please:
> 
> 1. Explain why you chose these input values as optimization targets and
The algorithm splits the input range into intervals and uses a
different code path for each interval. The input values I chose
cover all of these code paths.
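
As a rough illustration only (this is not code from the patch), the ten
inputs in the tables above could be written down in C as follows. The
function name fill_bench_inputs and the constant NUM_BENCH_INPUTS are
hypothetical, and reading "2.0^-28" as 2 to the power -28 is an
assumption about the table notation. The interval boundaries used by
the assembly are not stated in this mail, so the sketch makes no claim
about them.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical name: number of sample inputs listed in the tables.  */
enum { NUM_BENCH_INPUTS = 10 };

/* Fill OUT with the sample inputs from the SINF/COSF tables, in table
   order, and return how many were written.  Illustration only.  */
size_t
fill_bench_inputs (float out[NUM_BENCH_INPUTS])
{
  const float pi = 3.14159265358979323846f;

  out[0] = 0.0f;                 /* 0.0 */
  out[1] = 0x1p-28f;             /* 2.0^-28 */
  out[2] = 0x1p-6f;              /* 2.0^-6 */
  out[3] = 0.6f * pi / 4.0f;     /* 0.6*Pi/4 */
  out[4] = 13.0f * pi / 8.0f;    /* 13*Pi/8 */
  out[5] = 17.0f * pi / 8.0f;    /* 17*Pi/8 */
  out[6] = 1000.0f * pi / 4.0f;  /* 1000*Pi/4 */
  out[7] = 0x1p51f;              /* 2.0^51 */
  out[8] = INFINITY;             /* Inf */
  out[9] = NAN;                  /* NaN */
  return NUM_BENCH_INPUTS;
}
```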

> 2. Write a microbenchmark test for glibc
Sure. I will share this via GitHub.
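
A minimal sketch of what such a timing loop might look like, assuming
the standard clock() is precise enough for a first comparison. It does
not use the glibc benchtests framework, and the names time_calls_ns and
baseline_call are hypothetical. The function under test is passed as a
pointer so the same loop can time the old and the new implementation.

```c
#include <time.h>

/* Hypothetical helper: a do-nothing callee used to estimate the loop
   and call overhead, which can be subtracted from a measurement.  */
float
baseline_call (float x)
{
  return x;
}

/* Hypothetical helper: return the average CPU time in nanoseconds of
   one call to FN (X), averaged over ITERS calls (ITERS must be > 0).
   The volatile variables keep the compiler from hoisting or dropping
   the calls.  */
double
time_calls_ns (float (*fn) (float), float x, long iters)
{
  volatile float in = x;
  volatile float sink = 0.0f;
  clock_t start = clock ();

  for (long i = 0; i < iters; i++)
    sink = fn (in);

  clock_t end = clock ();
  (void) sink;
  return (double) (end - start) * 1e9 / CLOCKS_PER_SEC / (double) iters;
}
```

To time sinf itself, one would include <math.h>, call
time_calls_ns (sinf, x, iters) for each input of interest, and link
with -lm. Subtracting time_calls_ns (baseline_call, x, iters) gives a
rough per-call cost without the loop overhead.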

Thanks 
Ashwin

> 
> Thanks,
> Siddhesh
> 
> > 
> > 
> > Ashwin Sekhar T K (2):
> >   aarch64: Add optimized ASIMD version of sinf
> >   aarch64: Add optimized ASIMD version of cosf
> > 
> >  sysdeps/aarch64/fpu/multiarch/Makefile       |   3 +
> >  sysdeps/aarch64/fpu/multiarch/s_cosf-asimd.S | 367 +++++++++++++++++++++++++
> >  sysdeps/aarch64/fpu/multiarch/s_cosf.c       |  31 +++
> >  sysdeps/aarch64/fpu/multiarch/s_sinf-asimd.S | 382 +++++++++++++++++++++++++++
> >  sysdeps/aarch64/fpu/multiarch/s_sinf.c       |  31 +++
> >  5 files changed, 814 insertions(+)
> >  create mode 100644 sysdeps/aarch64/fpu/multiarch/Makefile
> >  create mode 100644 sysdeps/aarch64/fpu/multiarch/s_cosf-asimd.S
> >  create mode 100644 sysdeps/aarch64/fpu/multiarch/s_cosf.c
> >  create mode 100644 sysdeps/aarch64/fpu/multiarch/s_sinf-asimd.S
> >  create mode 100644 sysdeps/aarch64/fpu/multiarch/s_sinf.c
> >

Follow-Ups:
- Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
  - From: Sekhar, Ashwin
- Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
  - From: Siddhesh Poyarekar

References:
- [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
  - From: Ashwin Sekhar T K
- Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
  - From: Siddhesh Poyarekar
