This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] Inline C99 math functions
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>, libc-alpha at sourceware dot org
- Date: Thu, 09 Jul 2015 11:35:41 -0400
- Subject: Re: [PATCH] Inline C99 math functions
- Authentication-results: sourceware.org; auth=none
- References: <alpine dot DEB dot 2 dot 10 dot 1506151431490 dot 26683 at digraph dot polyomino dot org dot uk> <001701d0a789$f2ab86f0$d80294d0$ at com> <20150615185201 dot GA3023 at domone> <alpine dot DEB dot 2 dot 10 dot 1506152127340 dot 9772 at digraph dot polyomino dot org dot uk> <20150616050045 dot GA8021 at domone> <55801706 dot 4010109 at linaro dot org> <20150616134331 dot GA7016 at domone> <55807EFB dot 8090702 at redhat dot com> <20150703084056 dot GD32307 at domone> <55969C28 dot 5060803 at redhat dot com> <20150709150244 dot GB18030 at domone>
On 07/09/2015 11:02 AM, Ondřej Bílka wrote:
>> I never said it was your responsibility. It is the responsibility of the person
>> submitting the patch to provide an objective description of how they verified
>> the performance gains. It has become standard practice to recommend the author
>> of the patch contribute a microbenchmark, but it need not be a microbenchmark.
>> The author does need to provide sufficient performance justification including
>> the methods used to measure the performance gains to ensure the results can
>> be reproduced.
>>
> Correct. There are plenty of things here that could go wrong, so I need
> to point that out.
Agreed. I would like to see us create a "Measuring Performance" wiki page
with what the community expects. I think you would have a lot to contribute
to such a document.
> The only benchmark whose result I would be certain of would be to take
> one of the numerical applications mentioned before where is* is the
> bottleneck, and measure performance before and after.
That is a good recommendation. When I worked in HPC the common phrase was
"the only benchmark that matters is your application." :-)
In glibc this is harder because we are looking for average-case performance
across all applications. In some ways the optimization space will get both
harder and easier as Siddhesh works on tunables. Applications will be able
to adjust runtime behaviour for performance. However, untuned general performance
will still be just as difficult to optimize for. And then we get to talk about
auto-tuning algorithms, and tunings for given workloads.
>> In this case if you believe catan can be used to test the performance in this
>> patch, please suggest this to Wilco. However, I will not accept performance
>> patches without justification for the performance gains.
>>
> I already wrote about that, to get a more precise answer.
Thanks.
>>> Carlos, you talk a lot about deciding objectively, but when I ask you
>>> to, it is never done. So I will ask you again to decide based on my
>>> previous benchmark. There, sometimes the builtin is 20% faster and
>>> sometimes the current inline is 20% faster. How do you imagine that
>>> experts would decide solely on that, instead of telling you that it is
>>> inconclusive and you need to do real-world measurements, or that the
>>> benchmark is flawed because of X?
>>
>> My apologies if I failed to do something you requested.
>>
>> I'm not interested in abstract examples, I'm interested in the patch being
>> submitted by ARM. I will continue this discussion on the downstream thread
>> that includes the microbenchmark written by Wilco.
>>
>> In general I expect there are certainly classes of performance problems
>> that have inconclusive results. In which case I will reject that patch until
>> you tell me how you measure the performance gains and what you expect the
>> performance gains to be on average.
>>
> The problem is that most functions have inconclusive results, as we use
> mined cases where possible and must make assumptions about the input
> distribution to get a speedup. The simplest example is that in string
> functions we assume that 64 bytes only rarely cross a page boundary,
> and the list of these assumptions goes on and on. Then when a
> simplistic microbenchmark violates one of these assumptions, what
> matters is whether the change gives a real speedup in programs, not
> what the microbenchmark shows.
I don't follow. Could you expand on your thought there?
Cheers,
Carlos.