Bug 16510 - [i386] x87 version of floor(double) is used in 2.17 for i386 even if -mfpmath=sse is specified
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: math
Version: 2.19
Importance: P2 normal
Target Milestone: 2.19
Assignee: H.J. Lu
 
Reported: 2014-01-29 09:25 UTC by Igor S. Zamyatin
Modified: 2014-06-13 08:49 UTC

Target: i386
fweimer: security-


Description Igor S. Zamyatin 2014-01-29 09:25:33 UTC
glibc 2.17 seems to be the first version with the issue (I didn't check 2.16, though). With 2.15, the SSE version of floor(double) is used.

Testcase

#include <math.h>
#define N 1000
#define TYPE float
int main(int argc, char* argv[])
{
  TYPE uu = 0.5;
  int i;
 
  for (i = 0; i < N; i++)
    uu += floor(uu + (TYPE)argc);
 
  return (int)uu;
 
}

Compiler options:
-m32 -O2 -mfpmath=sse -ffast-math -march=corei7

Snippet from objdump for 2.15:

 80482d0:       0f 28 c8                movaps %xmm0,%xmm1
 80482d3:       f3 0f 5a c0             cvtss2sd %xmm0,%xmm0
 80482d7:       f3 0f 58 ca             addss  %xmm2,%xmm1
 80482db:       f3 0f 5a c9             cvtss2sd %xmm1,%xmm1
 80482df:       66 0f 3a 0b c9 01       roundsd $0x1,%xmm1,%xmm1  <-- sse floor
 80482e5:       f2 0f 58 c1             addsd  %xmm1,%xmm0
 80482e9:       83 e8 01                sub    $0x1,%eax
 80482ec:       f2 0f 5a c0             cvtsd2ss %xmm0,%xmm0
 80482f0:       75 de                   jne    80482d0 <main+0x20>
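
For readers unfamiliar with the encoding: the `$0x1` immediate on the `roundsd` line is what makes it a floor. Bits 1:0 of the immediate select the rounding mode (`01` is round toward negative infinity), and bit 2, when set, tells the instruction to ignore those bits and use MXCSR.RC instead. A minimal sketch that decodes the immediate follows; the enum and function names are hypothetical, chosen only for illustration.

```c
/* Rounding modes encoded in bits 1:0 of the ROUNDSD immediate. */
enum round_mode {
    ROUND_NEAREST = 0,  /* round to nearest even */
    ROUND_DOWN    = 1,  /* toward -infinity, i.e. floor */
    ROUND_UP      = 2,  /* toward +infinity, i.e. ceil */
    ROUND_TRUNC   = 3   /* toward zero */
};

/* Hypothetical helper: rounding mode selected by the low two bits. */
enum round_mode roundsd_imm_mode(unsigned imm8)
{
    return (enum round_mode)(imm8 & 0x3u);
}

/* Hypothetical helper: nonzero if bit 2 is set, meaning the
   instruction uses MXCSR.RC rather than bits 1:0 of the immediate. */
int roundsd_imm_uses_mxcsr(unsigned imm8)
{
    return (int)((imm8 >> 2) & 1u);
}
```

For the immediate in the listing, `roundsd_imm_mode(0x1)` yields `ROUND_DOWN` and `roundsd_imm_uses_mxcsr(0x1)` yields 0, so the instruction rounds down regardless of the current MXCSR setting.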

2.17:

 8048300:       0f 28 c8                movaps %xmm0,%xmm1
 8048303:       66 0f ef db             pxor   %xmm3,%xmm3
 8048307:       f3 0f 58 ca             addss  %xmm2,%xmm1
 804830b:       f3 0f 5a d9             cvtss2sd %xmm1,%xmm3
 804830f:       f2 0f 11 5d e0          movsd  %xmm3,-0x20(%ebp)
 8048314:       dd 45 e0                fldl   -0x20(%ebp)        <---
 8048317:       d9 7d f4                fnstcw -0xc(%ebp)
 804831a:       0f b7 55 f4             movzwl -0xc(%ebp),%edx
 804831e:       81 e2 ff f3 00 00       and    $0xf3ff,%edx
 8048324:       81 ca 00 04 00 00       or     $0x400,%edx        floor
 804832a:       66 89 55 f6             mov    %dx,-0xa(%ebp)
 804832e:       d9 6d f6                fldcw  -0xa(%ebp)
 8048331:       d9 fc                   frndint                  
 8048333:       d9 6d f4                fldcw  -0xc(%ebp)          --->
 8048336:       f3 0f 5a c0             cvtss2sd %xmm0,%xmm0
 804833a:       83 e8 01                sub    $0x1,%eax
 804833d:       dd 5d e0                fstpl  -0x20(%ebp)
 8048340:       f2 0f 58 45 e0          addsd  -0x20(%ebp),%xmm0
 8048345:       f2 0f 5a c0             cvtsd2ss %xmm0,%xmm0
 8048349:       75 b5                   jne    8048300 <main+0x30>
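
The x87 sequence marked in the 2.17 listing reaches the same result the long way round: it saves the control word (`fnstcw`), clears the rounding-control field in bits 10-11 (`and $0xf3ff`), sets that field to `01`, round down (`or $0x400`), loads the modified word (`fldcw`), rounds with `frndint`, and then restores the original word. A small sketch of just the control-word arithmetic follows; the macro and function names are hypothetical.

```c
/* x87 control word: the rounding-control (RC) field is bits 10-11. */
#define X87_RC_MASK 0x0c00u  /* both RC bits */
#define X87_RC_DOWN 0x0400u  /* RC = 01: round toward -infinity */

/* Hypothetical helper mirroring the and/or pair in the listing:
   clear RC, then select round-down, leaving all other bits intact. */
unsigned short x87_cw_round_down(unsigned short cw)
{
    return (unsigned short)((cw & ~X87_RC_MASK) | X87_RC_DOWN);
}
```

Starting from the x87 default control word 0x037f, this gives 0x077f. The two `fldcw` instructions per loop iteration, plus the store and reload through the stack to move the value between SSE and x87 registers, are what the single `roundsd` in the 2.15 code avoids.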

The compiler doesn't generate a builtin in this case, so floor from the math.h file is used.

The issue probably occurred after one of these commits:

commit 25f1282ae5072ccf586f041356ddde02f069c4ff
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Jun 15 13:56:26 2012 -0700
 
    Use i386 bits/mathinline.h for i386 and x86_64
 
commit ed1825f858842b102f735b129ca1e569e2247809
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Jun 15 06:52:47 2012 -0700
 
    Disable x87 inline functions for x86-64

The issue was found while investigating a performance degradation in spec2K/177.mesa that appeared after migrating to a newer version of Fedora.
Comment 1 H.J. Lu 2014-01-29 14:05:12 UTC
On which machine did you compile the test, i686 or x86-64?
Comment 2 Igor S. Zamyatin 2014-01-29 14:28:10 UTC
Both machines (with 2.17 and 2.15) are x86_64
Comment 3 H.J. Lu 2014-01-29 14:49:41 UTC
Please try it with glibc 2.15 on i686.  I believe you will also
get x87 instructions just like with glibc 2.17.  But it is still
a regression on x86-64.
Comment 4 H.J. Lu 2014-01-29 16:05:40 UTC
A patch is posted at

https://sourceware.org/ml/libc-alpha/2014-01/msg00606.html
Comment 5 H.J. Lu 2014-01-29 21:22:55 UTC
Fixed in 2.19, 2.18.1, 2.17.1 and 2.16.1.