This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: PATCH: optimized sincosf with SSE for x86_64 and x86_32
The accuracy of sincosf, as measured by our internal test system, which
comprehensively checks different intervals and extreme cases, is:
Function  PASS/FAIL  Max ULP error  Rounding
x86_32 (current)
SINCOSF PASS 2.905172 Near
SINCOSF PASS 2.905172 +Inf
SINCOSF PASS 3.905172 -Inf
SINCOSF PASS 3.905172 Zero
x86_32 (with new patch applied)
SINCOSF PASS 0.500573 Near
SINCOSF PASS 1.000000 +Inf
SINCOSF PASS 1.000000 -Inf
SINCOSF PASS 1.000000 Zero
x86_64 (current)
SINCOSF PASS 2.905172 Near
SINCOSF PASS 3.544283 +Inf
SINCOSF PASS 3.905172 -Inf
SINCOSF PASS 3.905172 Zero
x86_64 (with new patch applied)
SINCOSF PASS 0.500573 Near
SINCOSF PASS 1.000000 +Inf
SINCOSF PASS 1.000000 -Inf
SINCOSF PASS 1.000000 Zero
So for the SSE versions the error never exceeds 1 ulp.
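For readers unfamiliar with the "Max Ulp error" column, here is a minimal sketch of how such a figure can be measured. It is not the internal test system mentioned above, whose details are not given here. It assumes that double-precision math.sin is accurate enough to serve as the reference for single-precision results, and the helper names (to_f32, ulp_f32, ulp_error) are hypothetical.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def ulp_f32(x: float) -> float:
    """Spacing between adjacent binary32 values in the binade containing x."""
    x = abs(x)
    if x < 2.0 ** -126:          # zero or subnormal: spacing is constant
        return 2.0 ** -149
    _, e = math.frexp(x)         # x = m * 2**e with 0.5 <= m < 1
    return 2.0 ** (e - 24)       # binary32 has a 24-bit significand


def ulp_error(computed: float, reference: float) -> float:
    """Distance from computed to reference, in binary32 ulps of the reference."""
    return abs(computed - reference) / ulp_f32(reference)


if __name__ == "__main__":
    worst = 0.0
    for i in range(1, 1000):
        x = to_f32(i * 0.01)
        # Hypothetical stand-in for the function under test: a real harness
        # would call the sinf/sincosf implementation being checked instead.
        result = to_f32(math.sin(x))
        worst = max(worst, ulp_error(result, math.sin(x)))
    print("max ulp error:", worst)
```

In round-to-nearest mode a correctly rounded result never exceeds 0.5 ulp by this measure, which is the scale against which the 0.500573 figure can be read.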
I would also like to clarify our performance gain (speedup factors):
x86_32 Ist. Bulld. Atom Neh. AVX /* gain in times */
sincosf |x|<0.78 1,79 1,75 1,58 1,34 1,19
sincosf |x|<1.57 1,75 1,83 1,67 1,70 1,34
sincosf |x|<2.35 1,95 2,10 1,71 1,75 1,38
sincosf |x|<3.14 2,11 2,20 1,82 1,94 1,56
sincosf |x|<3.92 2,29 2,25 1,88 2,04 1,67
sincosf |x|<4.71 2,37 2,30 1,92 2,09 1,74
sincosf |x|<5.49 2,43 2,46 1,96 2,15 1,80
sincosf |x|<6.28 2,48 2,35 1,99 2,19 1,81
sincosf |x|<7.06 2,51 2,33 1,99 2,20 1,84
sincosf |x|<7.85 2,36 2,28 1,99 2,17 1,81
sincosf |x|<8.63 2,37 2,35 1,98 2,11 1,77
sincosf |x|<9.42 2,25 2,10 1,96 2,05 1,65
sincosf |x|<100 2,07 1,93 2,01 1,88 1,48
sincosf |x|<1000 15,29 19,48 13,78 12,26 11,38
sincosf |x|<10000 19,31 22,62 17,10 15,35 14,39
sincosf |x|<1e10 18,19 28,12 15,93 14,99 15,30
x86_64 Ist. Bulld. Atom Neh. AVX /* gain in times */
sincosf |x|<0.78 2,06 2,55 1,23 2,04 2,10
sincosf |x|<1.57 1,80 1,87 1,33 1,82 1,74
sincosf |x|<2.35 1,82 2,16 1,37 1,85 1,79
sincosf |x|<3.14 1,96 2,20 1,45 1,91 1,98
sincosf |x|<3.92 2,05 2,20 1,50 1,98 2,11
sincosf |x|<4.71 2,13 2,29 1,54 2,09 2,13
sincosf |x|<5.49 2,23 2,52 1,58 2,13 2,16
sincosf |x|<6.28 2,29 2,29 1,60 2,18 2,19
sincosf |x|<7.06 2,22 2,16 1,60 2,19 2,23
sincosf |x|<7.85 2,18 2,08 1,61 2,13 2,22
sincosf |x|<8.63 1,99 2,30 1,59 2,08 2,19
sincosf |x|<9.42 1,91 1,85 1,57 1,99 2,17
sincosf |x|<100 1,99 1,88 1,58 2,45 2,08
sincosf |x|<1000 14,05 14,78 13,01 15,69 18,24
sincosf |x|<10000 19,22 18,59 16,05 19,17 22,74
sincosf |x|<1e10 14,21 14,52 14,43 12,45 14,72
--
Liubov Dmitrieva
Intel Corporation
2012/9/9 Rich Felker <dalias@aerifal.cx>:
> On Sat, Sep 08, 2012 at 11:07:11PM +0400, Dmitrieva Liubov wrote:
>> I propose a big patch with the following new features and fixes:
>>
>> 1. SSE optimized sincosf for x86_64 with excellent performance results.
>> 2. SSE optimized sincosf for x86_32 with excellent performance results.
>> 3. Some minor fixes in already merged sinf and cosf (unwind info and
>> the description of special cases in the comment).
>> 4. Fix latent bugs we have already discussed regarding the incorrect
>> use of the sincosf routine for subnormal arguments.
>>
>> The patch is attached.
>
> Is this doing correct range reduction? Considering how many times the
> range reduction has been broken by optimizations, it would be nice if
> the comments document that it's being done correctly...
>
> Rich
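To illustrate why the range-reduction question matters for the large-argument rows in the tables, here is a small sketch. It does not show how the patch reduces its arguments, which is not described in this message. It only contrasts a naive reduction modulo pi/2 carried out with single-precision operations against a reference computed with exact rational arithmetic and a 50-digit value of pi. The function names are hypothetical, and the single-precision arithmetic is emulated by rounding each double-precision result to binary32.

```python
import math
import struct
from fractions import Fraction

# 50 decimal digits of pi, ample for binary32 inputs of moderate size.
PI_50 = Fraction("3.14159265358979323846264338327950288419716939937510")


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def reduce_naive_f32(x: float) -> float:
    """Reduce x modulo pi/2 using only binary32 values and operations."""
    x = to_f32(x)
    half_pi = to_f32(math.pi / 2)          # pi/2 rounded to binary32
    k = round(to_f32(x / half_pi))         # nearest multiple of pi/2
    prod = to_f32(k * half_pi)             # product rounded to binary32
    return to_f32(x - prod)


def reduce_reference(x: float) -> float:
    """Reduce x modulo pi/2 exactly, rounding only the final result."""
    half_pi = PI_50 / 2
    xf = Fraction(to_f32(x))
    k = round(xf / half_pi)
    return float(xf - k * half_pi)


if __name__ == "__main__":
    x = 10000.0
    print("naive    :", reduce_naive_f32(x))
    print("reference:", reduce_reference(x))
```

Already at x = 10000 the naive result is off by far more than 1 ulp of the reduced argument, because the rounding error in pi/2 is multiplied by k and the product k * pi/2 is rounded again. A reduction that holds the accuracy figures reported above for large |x| has to carry extra precision in this step.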