This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: RFC: Proposal for implementing libmvec on aarch64
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: "Ellcey, Steve" <Steve dot Ellcey at cavium dot com>
- Cc: nd <nd at arm dot com>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>
- Date: Tue, 6 Mar 2018 23:21:04 +0000
- Subject: Re: RFC: Proposal for implementing libmvec on aarch64
Hi Steve,
I don't believe modifying the existing math functions makes sense. This is especially true
if they still use mp (multi-precision) logic, as that is being actively removed: I have a
set of patches for sin/cos/sincos in testing. Besides that, it's not going to work in
general since much of the code still ends up serialized. As you've noticed, vectorizing
only the polynomials gives no gain, as they take a small part of the total time.
So you need a specialized implementation which avoids branches and tables, and
uses much larger polynomials (i.e. throughput-optimized for the common cases but with
significantly higher latency).
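
To make the shape of such an implementation concrete, here is a minimal sketch
in C of a table-free expf that a compiler could auto-vectorize. It is not glibc
code: the names expf_sketch and expf4_sketch, the clamp limits and the degree-6
Taylor polynomial are all assumptions chosen for illustration. A real
implementation would use tuned minimax coefficients and would handle NaN,
overflow and subnormal results, which this sketch does not.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch (not glibc code): exp(x) = 2^k * exp(r), |r| <= ~ln2/2.
   No lookup table and no data-dependent control flow beyond two selects.
   NaN inputs are NOT handled. */
static inline float expf_sketch(float x)
{
    const float inv_ln2 = 1.44269504f;
    const float ln2_hi  = 0.693145751953125f; /* high part, low bits zero */
    const float ln2_lo  = 1.42860682e-6f;     /* ln2 - ln2_hi */

    /* Clamp so that 2^k below is always a normal float. Written as selects,
       which compilers typically lower to min/max instructions. */
    x = x < -87.0f ? -87.0f : x;
    x = x >  88.0f ?  88.0f : x;

    /* k = floor(x/ln2 + 0.5); the +128 bias keeps the cast operand positive
       so truncation behaves like floor. */
    float t = x * inv_ln2;
    int32_t k = (int32_t)(t + 128.5f) - 128;
    float kf = (float)k;

    /* Two-step reduction: kf * ln2_hi is exact for |k| <= 127. */
    float r = (x - kf * ln2_hi) - kf * ln2_lo;

    /* Degree-6 Taylor polynomial for exp(r), evaluated with Horner's rule. */
    float p = 1.0f / 720.0f;
    p = p * r + 1.0f / 120.0f;
    p = p * r + 1.0f / 24.0f;
    p = p * r + 1.0f / 6.0f;
    p = p * r + 0.5f;
    p = p * r + 1.0f;
    p = p * r + 1.0f;

    /* Build 2^k directly from the exponent bits (k + 127 is in [1, 254]). */
    uint32_t bits = (uint32_t)(k + 127) << 23;
    float scale;
    memcpy(&scale, &bits, sizeof scale);
    return p * scale;
}

/* Four independent lanes; the loop body has no branches or table lookups,
   so it is a candidate for auto-vectorization. */
void expf4_sketch(const float in[4], float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = expf_sketch(in[i]);
}
```

Every lane does the full polynomial regardless of the input, which is where the
higher latency comes from; the hoped-for gain is purely in throughput.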
Even then, achieving a speedup will be difficult. Basically the cost/benefit has changed
radically after Szabolcs' new scalar float math functions. They are now so much more
efficient that even calling the scalar expf 4 times is better than a vectorized expf...
Note that a vector ABI is not as simple as a name-mangling scheme; you also need to add
support for a different calling standard.
Wilco