This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] Optimized generic expf and exp2f
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, Szabolcs Nagy <Szabolcs dot Nagy at arm dot com>, Joseph Myers <joseph at codesourcery dot com>, "Arjan van de Ven" <arjan at linux dot intel dot com>
- Cc: nd <nd at arm dot com>
- Date: Wed, 6 Sep 2017 12:36:42 +0000
- Subject: Re: [PATCH] Optimized generic expf and exp2f
- References: <DB6PR0801MB2053A12C3A4F7032C0107E1683960@DB6PR0801MB2053.eurprd08.prod.outlook.com>
Szabolcs Nagy wrote:
> On 05/09/17 21:58, Joseph Myers wrote:
> > On Tue, 5 Sep 2017, Arjan van de Ven wrote:
>
>>> you mentioned x86 data.. is that based on current git after
>>> the recent optimizations (on a cpu with fma)?
>
>> Really you need to compare with both the fma and non-fma versions (and
>> compare the C version built both with and without fma, since one
>> possibility would be that the C version can replace the x86_64 ones but
>> should be built twice, with and without fma, to achieve that replacement).
My machine has AVX2 and FMA, and when building the new generic expf
with -mavx2 -mfma I get:
expf reciprocal-throughput: 1.5x faster
expf latency: 1.4x faster
I verified that FMA was used in both cases.
Wilco