This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH] x86-64: Add sinf with FMA

From: "H.J. Lu" <hjl dot tools at gmail dot com>
To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
Cc: Joseph Myers <joseph at codesourcery dot com>, GNU C Library <libc-alpha at sourceware dot org>
Date: Tue, 5 Dec 2017 08:56:35 -0800
Subject: Re: [PATCH] x86-64: Add sinf with FMA
Authentication-results: sourceware.org; auth=none
References: <20171204180905.GA31592@gmail.com> <3c53189f-818f-0473-9ccd-1c0ecf40ab1c@linaro.org> <CAMe9rOr5D6PVdpJq1QGRRy99dRCNZQEYHKcjXHKBsTF5Hu1W4A@mail.gmail.com> <alpine.DEB.2.20.1712041856270.16372@digraph.polyomino.org.uk> <CAMe9rOqueisVwBqojRfbibcpKU=tOLQYVfvgrWcG30tSFmfd=Q@mail.gmail.com> <alpine.DEB.2.20.1712042048360.1303@digraph.polyomino.org.uk> <CAMe9rOqddBsGz3Fwr4Vtk094P5Ywyz6P7+a9veUaNL6FbLA5eQ@mail.gmail.com> <a505c43a-1a44-34fc-f36b-243e328b34af@linaro.org>

On Tue, Dec 5, 2017 at 5:47 AM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:

> And with a simple modification to avoid int to fp conversion:
>
> ---
> diff --git a/sysdeps/ieee754/flt-32/s_sinf.c b/sysdeps/ieee754/flt-32/s_sinf.c
> index 40d3d19..a2fd3cf 100644
> --- a/sysdeps/ieee754/flt-32/s_sinf.c
> +++ b/sysdeps/ieee754/flt-32/s_sinf.c
> @@ -75,7 +75,7 @@ static const double invpio4_table[] = {
>    0x1.0e4107cp-169
>  };
>
> -static const int ones[] = { +1, -1 };
> +static const double ones[] = { 1.0, -1.0 };
>
>  /* Compute the sine value using Chebyshev polynomials where
>     THETA is the range reduced absolute value of the input
> @@ -92,7 +92,7 @@ reduced (const double theta, const unsigned long int n,
>    const double theta2 = theta * theta;
>    /* We are operating on |x|, so we need to add back the original
>       signbit for sinf.  */
> -  int sign;
> +  double sign;
>    /* Determine positive or negative primary interval.  */
>    sign = ones[((n >> 2) & 1) ^ signbit];
>    /* Are we in the primary interval of sin or cos?  */
> ---
>
> I get:
>
>   "sinf": {
>    "": {
>     "duration": 4.0015e+10,
>     "iterations": 1.4535e+09,
>     "max": 640.456,
>     "min": 11.437,
>     "mean": 27.5301
>    }
>   }
>
> That is an improvement of roughly 3% on mean and 11.5% on min. I think we
> can improve it even more by avoiding the int to fp conversion needed to get
> the sign right and trying to operate with sign as a double argument.
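
For readers following the thread, here is a minimal sketch of the idea in the quoted diff. It is not the glibc source. The names apply_sign_int, apply_sign_double, ones_int, ones_dbl and sign_bit are hypothetical and only for illustration; the index expression is the one shown in the patch. With an int table the selected sign has to be converted to double before the multiply. With a double table the value is loaded directly in floating-point form.

```c
#include <assert.h>

/* Hypothetical stand-ins for the table before and after the patch.  */
static const int ones_int[] = { +1, -1 };
static const double ones_dbl[] = { 1.0, -1.0 };

/* Before: the sign is an int, so the multiply below needs an
   int -> double conversion first.  SIGN_BIT must be 0 or 1.  */
double
apply_sign_int (double value, unsigned long int n, int sign_bit)
{
  assert (sign_bit == 0 || sign_bit == 1);
  int sign = ones_int[((n >> 2) & 1) ^ sign_bit];
  return sign * value;
}

/* After: the sign is already a double, so no conversion is needed.
   The index expression is unchanged.  */
double
apply_sign_double (double value, unsigned long int n, int sign_bit)
{
  assert (sign_bit == 0 || sign_bit == 1);
  double sign = ones_dbl[((n >> 2) & 1) ^ sign_bit];
  return sign * value;
}
```

Both functions return the same value for every valid input; only the type of the table entry differs. Whether the conversion shows up in timings depends on the compiler and the CPU.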

I tried it on Skylake with the current master.  Before:

  "sinf": {
   "": {
    "duration": 3.4044e+10,
    "iterations": 1.9942e+09,
    "max": 141.106,
    "min": 7.704,
    "mean": 17.0715
   }
  }

After:

  "sinf": {
   "": {
    "duration": 3.40665e+10,
    "iterations": 2.03199e+09,
    "max": 95.994,
    "min": 7.704,
    "mean": 16.765
   }
  }

Generic is now faster than asm, which gives:

  "sinf": {
   "": {
    "duration": 3.40417e+10,
    "iterations": 1.87792e+09,
    "max": 138.868,
    "min": 8.546,
    "mean": 18.1273
   }
  }
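
As a rough aid for comparing the figures above, the relative differences can be worked out with a small helper. percent_change is a hypothetical name and is not part of the glibc benchmark tooling. Using the Skylake means quoted above, the generic mean after the change (16.765) comes out about 1.8% lower than before (17.0715) and about 7.5% lower than the asm mean (18.1273).

```c
/* Relative change from BASE to VALUE in percent.  A negative result
   means VALUE is smaller than BASE.  BASE must be nonzero.  */
double
percent_change (double base, double value)
{
  return (value - base) / base * 100.0;
}
```

For example, percent_change (17.0715, 16.765) compares the generic mean before and after the change, and percent_change (18.1273, 16.765) compares the asm mean with the new generic mean.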

Can you submit your patch?

Thanks.

-- 
H.J.
