This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
- From: "Sekhar, Ashwin" <Ashwin dot Sekhar at cavium dot com>
- To: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, "szabolcs dot nagy at arm dot com" <szabolcs dot nagy at arm dot com>
- Cc: "nd at arm dot com" <nd at arm dot com>
- Date: Tue, 13 Jun 2017 12:56:31 +0000
- Subject: Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
- References: <20170613071707.43396-1-ashwin.sekhar@caviumnetworks.com> <593FC77A.6050609@arm.com>
> > SINF
> > ---------------------------------------------------------
> > Input        ThunderX88   ThunderX99   CortexA57
> > ---------------------------------------------------------
> > 0.0          1.88x        1.18x        1.17x
> > 2.0^-28      1.33x        1.12x        1.03x
> > 2.0^-6       1.48x        1.28x        1.27x
> > 0.6*Pi/4     0.94x        1.14x        1.21x
> > 13*Pi/8      1.41x        2.00x        2.16x
> > 17*Pi/8      1.45x        1.93x        2.23x
> based on these numbers my current c implementation is faster,
> but it will take time to polish that for submission.
Are these going to be aarch64-specific C implementations or changes to the
generic code?
Also, could you please let me know when you plan to submit your patches?
I also don't want to duplicate effort. But if you don't plan to
submit your changes in the near future, I guess I will go ahead,
address the other comments, and work on submitting a v2 patch.
Thanks
Ashwin
>
> >
> > 1000*Pi/4    19.68x       37.46x       27.99x
> > 2.0^51       12.00x       13.58x       13.49x
> this is a bug in the current generic code that it falls back
> to slow argument reduction even though single precision arg
> reduction can be done in a few cycles over the entire range,
>
> i think the x86_64 sse code could still be simpler and faster
> (not that it matters much as these are rare cases).
>
> >
> > Inf          1.04x        1.05x        1.12x
> > Nan          0.95x        0.87x        0.82x
> > ---------------------------------------------------------