Summary: | powerpc rint() function is buggy in the rounding toward -inf and +inf modes | ||
---|---|---|---|
Product: | glibc | Reporter: | Vincent Lefèvre <vincent-srcware> |
Component: | math | Assignee: | Andreas Jaeger <aj> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | glibc-bugs, sjmunroe |
Priority: | P2 | Flags: | fweimer:
security-
|
Version: | unspecified | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: | ||
Bug Depends on: | |||
Bug Blocks: | 724 | ||
Attachments: |
[PATCH] for rounding functions
[PATCH] Updated with test cases |
Description
Vincent Lefèvre
2004-12-08 16:33:10 UTC
Some additional information... Here's a part of the output of my program (on a PowerPC machine): ay:~/wd/src/fp> ./nearestint 4.5 1 [...] Rounding toward -oo -4.5 -3.5 -2.5 -1.5 -0.5 0.5 1.5 2.5 3.5 4.5 casttoint -4 -3 -2 -1 0 0 1 2 3 4 trunc -4.0 -3.0 -2.0 -1.0 -0.0 0.0 1.0 2.0 3.0 4.0 floor -5.0 -4.0 -3.0 -2.0 -1.0 0.0 1.0 2.0 3.0 4.0 ceil -4.0 -3.0 -2.0 -1.0 -0.0 1.0 2.0 3.0 4.0 5.0 round -5.0 -4.0 -3.0 -2.0 -1.0 1.0 2.0 3.0 4.0 5.0 nearbyint -5.0 -4.0 -3.0 -2.0 -1.0 0.0 1.0 2.0 3.0 4.0 myrint -5.0 -4.0 -3.0 -2.0 -1.0 0.0 1.0 2.0 3.0 4.0 rint -4.0 -3.0 -2.0 -1.0 0.0 -0.0 1.0 2.0 3.0 4.0 [...] "myrint" corresponds to my code. As expected, it agrees with the nearbyint glibc function (which is correct). In short, the current rint code for negative numbers doesn't work in the rounding modes toward -oo and toward +oo, because it does the first subtraction in the wrong way, so that the result is rounded in the wrong direction. Note: do not try the code on Linux/x86, or any other processor that runs in extended precision; it will not work (in part due to a gcc bug). Well, if you really want to try on such a machine, you can still use the -ffloat-store compiler switch, but the results are not guaranteed for any input. Steven, could you look at this bug, please? Fixed in cvs since June 11 2004. The C implementation mentioned above was replace With optimized asm implementations: 2004-06-11 Dwayne Grant McConnell <dgm69@us.ibm.com> * sysdeps/powerpc/fpu/s_lround.c: Removed. * sysdeps/powerpc/fpu/s_lroundf.c: Removed. * sysdeps/powerpc/powerpc32/fpu/s_ceilf.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_ceil.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_floorf.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_floor.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_lrint.c: Removed. * sysdeps/powerpc/powerpc32/fpu/s_lrint.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_lroundf.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_lround.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_rintf.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_rint.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_roundf.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_round.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_truncf.S: New file. * sysdeps/powerpc/powerpc32/fpu/s_trunc.S: New file. looks like the sysdeps/powerpc/fpu/s_rint.c is redundant and should be removed. The equivalent PPC64 changes when in April 29 2004. Anyway on SLES9 and RHEL4 we get: Rounding toward -oo -3.0 -2.0 -1.0 -0.0 0.0 1.0 2.0 3.0 casttoint -3 -2 -1 0 0 1 2 3 trunc -3.0 -2.0 -1.0 -0.0 0.0 1.0 2.0 3.0 floor -3.0 -2.0 -1.0 -0.0 0.0 1.0 2.0 3.0 ceil -3.0 -2.0 -1.0 -0.0 0.0 1.0 2.0 3.0 round -3.0 -2.0 -1.0 -0.0 0.0 1.0 2.0 3.0 nearbyint -3.0 -2.0 -1.0 -0.0 0.0 1.0 2.0 3.0 myrint -3.0 -2.0 -1.0 -0.0 0.0 1.0 2.0 3.0 rint -3.0 -2.0 -1.0 -0.0 0.0 1.0 2.0 3.0 However it look like there is another problem with fractional values that round to zero: Rounding toward -oo -1.5 -1.0 -0.5 -0.0 0.0 0.5 1.0 1.5 casttoint -1 -1 0 0 0 0 1 1 trunc -1.0 -1.0 -0.0 -0.0 0.0 0.0 1.0 1.0 floor -2.0 -1.0 -1.0 -0.0 0.0 0.0 1.0 1.0 ceil -1.0 -1.0 -0.0 -0.0 0.0 1.0 1.0 2.0 round -2.0 -1.0 -1.0 -0.0 0.0 1.0 1.0 2.0 nearbyint -2.0 -1.0 -1.0 -0.0 0.0 0.0 1.0 1.0 myrint -2.0 -1.0 -1.0 -0.0 0.0 0.0 1.0 1.0 rint -1.0 -1.0 0.0 -0.0 0.0 -0.0 1.0 1.0 I'll look into this tommorrow. Created attachment 320 [details]
[PATCH] for rounding functions
Ok, I found the problem in rint and floor and realized that it would be simpler/faster to force the sign bit to the correct value (using fasb/fnabs instructions) then to compare/branch for x == 0.0 case. The fabs/fnabs has a latency of 6 cycles, while the fcmpu/bc has a latency of 8+2. This simplies the the code and allows the ellimination of the -0.0 constant used in some of the rounding functions. Finally I noticed that some of the float (i.e. roundf, rintf, ...) functions were still using double constant values. so I decides to clean up the whole of rounding functions (ceil, floor, rint, round, trunc) for float and double. This gives a nice performance boost for POWER4, 970 (AKA G5) and POWER5 (10-135%). But I noticed anomalies in the round and trunc results. This depends on the final alignment of the code and in some cases changes the dispatch groupings resulting in branch miss-predictions. So for PPC64 I replace the ENTER macro with EALIGN (?, 4, 0) which force quadword alignment. This improved the results. I did not do this for powerpc32 because I don't have access to powerpc32 machines to test on. Vincent doe this patch resolve your concerns? Can you please also write a testcase for the things current CVS ppc rint computes incorrectly, so that we don't regress and make sure all platforms behave the same? Created attachment 321 [details]
[PATCH] Updated with test cases
This patch adds 4 fesetround() variant test cases to math/libm-test.inc so it
will cover all platforms.
Also fixed a typo in powerpc32/s_rintf.S that was causing test failures.
Re comment #6, unfortunately I can't test on my machine (I have a binary Debian package, and I don't want to try a patch + recompilation now, as I won't be able to recover if something goes really wrong). I assume that a test for the various main cases like what my nearestint.c does is sufficient. I have verified that the additional tests add to math/libm-test.inc (from patch attachment id=321) reports the error in rint with the current CVS for powerpc32 and powerpc64. These errors are resolved by applying the remainder of patch from attachment id=321. This should answer Jakub's requestion from comment #7 and address the testing requested in comments # 8,9. Can we move this to FIXED and get the patch committed to CVS? Applied to cvs. Subject: Bug 602 CVSROOT: /cvs/glibc Module name: libc Branch: glibc-2_3-branch Changes by: roland@sources.redhat.com 2005-02-16 10:07:17 Modified files: sysdeps/powerpc/powerpc32/fpu: s_ceil.S s_roundf.S s_trunc.S s_rint.S s_floorf.S s_ceilf.S s_floor.S s_truncf.S s_round.S s_rintf.S sysdeps/powerpc/powerpc64/fpu: s_trunc.S s_floorf.S s_floor.S s_ceilf.S s_truncf.S s_ceil.S s_rintf.S s_roundf.S s_round.S s_rint.S math : libm-test.inc Log message: 2004-12-22 Steven Munroe <sjmunroe@us.ibm.com> [BZ #602] * math/libm-test.inc (rint_test_tonearest): New test. (rint_test_towardzero): New test. (rint_test_downward): New test. (rint_test_upward): New test. * sysdeps/powerpc/powerpc32/fpu/s_ceil.S: Fix -0.0 case. Remove redundant const values. * sysdeps/powerpc/powerpc32/fpu/s_ceilf.S: Fix -0.0 case. Remove redundant const values. Use float const. * sysdeps/powerpc/powerpc32/fpu/s_floor.S: Fix -0.0 case. * sysdeps/powerpc/powerpc32/fpu/s_floorf.S: Fix -0.0 case. Use float const. * sysdeps/powerpc/powerpc32/fpu/s_rint.S: Fix -0.0 case. * sysdeps/powerpc/powerpc32/fpu/s_rintf.S: Fix -0.0 case. Use float const. * sysdeps/powerpc/powerpc32/fpu/s_round.S: Fix -0.0 case. Remove redundant const values. * sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Fix -0.0 case. Remove redundant const values. Use float const. * sysdeps/powerpc/powerpc32/fpu/s_trunc.S: Fix -0.0 case. Remove redundant const values. * sysdeps/powerpc/powerpc32/fpu/s_truncf.S: Fix -0.0 case. Remove redundant const values. Use float const. * sysdeps/powerpc/powerpc64/fpu/s_ceil.S: Use EALIGN for Quadword alignment. Fix -0.0 case. Remove redundant const values. * sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Use EALIGN for Quadword alignment. Fix -0.0 case. Remove redundant const values. Use float const. * sysdeps/powerpc/powerpc64/fpu/s_floor.S: Use EALIGN for Quadword alignment. Fix -0.0 case. * sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Use EALIGN for Quadword alignment. Fix -0.0 case. Use float const. * sysdeps/powerpc/powerpc64/fpu/s_rint.S: Use EALIGN for Quadword alignment. Fix -0.0 case. * sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Use EALIGN for Quadword alignment. Fix -0.0 case. Use float const. * sysdeps/powerpc/powerpc64/fpu/s_round.S: Use EALIGN for Quadword alignment. Fix -0.0 case. Remove redundant const values. * sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Use EALIGN for Quadword alignment. Fix -0.0 case. Remove redundant const values. Use float const. * sysdeps/powerpc/powerpc64/fpu/s_trunc.S: Use EALIGN for Quadword alignment. Fix -0.0 case. * sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Use EALIGN for Quadword alignment. Fix -0.0 case. Remove redundant const values. Use float const. Patches: http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc32/fpu/s_ceil.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.1&r2=1.1.4.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc32/fpu/s_roundf.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.1&r2=1.1.4.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc32/fpu/s_trunc.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.1&r2=1.1.4.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc32/fpu/s_rint.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.1&r2=1.1.4.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc32/fpu/s_floorf.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.1&r2=1.1.4.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc32/fpu/s_ceilf.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.1&r2=1.1.4.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc32/fpu/s_floor.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.1&r2=1.1.4.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc32/fpu/s_truncf.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.1&r2=1.1.4.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc32/fpu/s_round.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.1&r2=1.1.4.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc32/fpu/s_rintf.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.1&r2=1.1.4.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc64/fpu/s_trunc.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.2&r2=1.2.2.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc64/fpu/s_floorf.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.2&r2=1.2.2.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc64/fpu/s_floor.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.2&r2=1.2.2.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc64/fpu/s_ceilf.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.2&r2=1.2.2.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc64/fpu/s_truncf.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.2&r2=1.2.2.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc64/fpu/s_ceil.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.2&r2=1.2.2.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc64/fpu/s_rintf.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.2&r2=1.2.2.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc64/fpu/s_roundf.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.2&r2=1.2.2.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc64/fpu/s_round.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.2&r2=1.2.2.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/powerpc/powerpc64/fpu/s_rint.S.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.2&r2=1.2.2.1 http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/math/libm-test.inc.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.64&r2=1.64.2.1 |