This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Fix powerpc fmax, fmin sNaN handling (bug 20947) [committed]
On 16/12/2016 17:40, Steven Munroe wrote:
> On Fri, 2016-12-16 at 16:52 -0200, Adhemerval Zanella wrote:
>>
>> On 16/12/2016 15:41, Joseph Myers wrote:
>>> On Fri, 16 Dec 2016, Steven Munroe wrote:
>>>
>>>>> This patch fixes the powerpc versions of these functions (shared by
>>>>> float and double, 32-bit and 64-bit). The structure of those versions
>>>>> is that all ordered cases are already handled before anything dealing
>>>>> with the case where the arguments are unordered; thus, this patch
>>>>> causes no change to the code executed in the common case (neither
>>>>> argument a NaN).
>>>>>
>>>> This patch, as is, will cause a sever performance regression for
>>>> power6/7/8 and should be fixed or reverted before 2.25 freezes.
>>>
>>> A general assumption in libm functions is that NaN inputs are an uncommon
>>> case and performance for NaN inputs is not of much importance; the code
>>> for NaN inputs should just aim for correctness and simplicity. I think
>>> people calling fmax or fmin generally want the maximum or minimum of two
>>> finite numbers.
>>>
>>> As I said, there is no change to the code executed for the normal case of
>>> non-NaN inputs, so I see no reason for performance in the common case to
>>> be affected. Of course performance improvements for the NaN case are
>>> still welcome (and that sort of small local fix would seem perfectly
>>> suitable to go in during the freeze if architecture maintainers think it
>>> appropriate).
>>>
>>> If the presence of this code for the NaN case somehow affects the
>>> performance of non-NaN case, please let me know; it wasn't meant to do so.
>>>
>>
>> I am running some tests on a powerpc64 box and using a fmax/fmin benchtests
>> with simple real, infinity and {q,s}NaN inputs and the default C
>> implementation is showing better latency in all cases than the assembly
>> ones for powerpc.
>>
>> Also considering that for powerpc64le usually it is target for power8,
>> it allows the compiler to select the best instruction combination for
>> builtin nan/inf/etc, so it allows more rooms for improvement with better
>> compiler support.
>>
>> So my idea for this specific issues would be just to remove the powerpc
>> assembly implementation (I am preparing a patch for it).
>>
>
> Thanks Adhemerval, but which compiler provided these results?
>
> Older compilers might not do as well, in which case backing porting your
> suggested fix to older distros and older GCC would be ill advised.
I used ubuntu 16 default one (gcc 5.4) for this experiment.