This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] [BZ #2749] soft-fp fixes
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Steven Munroe <munroesj at us dot ibm dot com>, Ulrich Drepper <drepper at redhat dot com>
- Cc: libc-alpha at sources dot redhat dot com, Joe Kerian <jkerian at linux dot ibm dot com>, "Joseph S. Myers" <joseph at codesourcery dot com>, libc-ports at sources dot redhat dot com
- Date: Wed, 10 Jan 2007 16:49:25 +0100
- Subject: Re: [PATCH] [BZ #2749] soft-fp fixes
- References: <452E65FE.3030808@us.ibm.com>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Thu, Oct 12, 2006 at 10:57:50AM -0500, Steven Munroe wrote:
> 2006-10-05 Steven Munroe <sjmunroe@us.ibm.com>
> Joe Kerian <jkerian@us.us.ibm.com>
>
> [BZ #2749]
> * soft-fp/fenv_libc.h: New file.
Top level soft-fp directory doesn't sound like a good place for this,
it has nothing to common with fenv_libc.h. IMHO either stick it into
math/fenv_libc.h, or let just each port provide its own (all current
in tree arches that ever use fenv_libc.h do that, so ppc no-fpu could
do the same).
> * soft-fp/op-common.h (_FP_OVERFLOW_SEMIRAW): Always set inexact
> and overflow for infinity.
> --- libc25-cvstip-20061005/soft-fp/op-common.h 2006-04-04 03:24:47.000000000 -0500
> +++ libc24/soft-fp/op-common.h 2006-10-10 09:45:04.000000000 -0500
> @@ -99,10 +99,10 @@
> else \
> { \
> X##_e = _FP_EXPMAX_##fs - 1; \
> - FP_SET_EXCEPTION(FP_EX_OVERFLOW); \
> - FP_SET_EXCEPTION(FP_EX_INEXACT); \
> _FP_FRAC_SET_##wc(X, _FP_MAXFRAC_##wc); \
> } \
> + FP_SET_EXCEPTION(FP_EX_INEXACT); \
> + FP_SET_EXCEPTION(FP_EX_OVERFLOW); \
> } while (0)
>
> /* Check for a semi-raw value being a signaling NaN and raise the
This change is correct, fixes e.g. the following testcase on
sparc:
#include <fenv.h>
#include <stdio.h>
long double ld1 = __LDBL_MAX__ / 2;
long double ld2 = __LDBL_MAX__;
double d1 = __DBL_MAX__ / 2;
double d2 = __DBL_MAX__;
float f1 = __FLT_MAX__ / 2;
float f2 = __FLT_MAX__;
int
main (void)
{
int fe;
int ret = 0;
feclearexcept (FE_ALL_EXCEPT);
f2 += f1;
fe = fetestexcept (FE_ALL_EXCEPT);
if (fe != (FE_OVERFLOW | FE_INEXACT))
{
printf ("float test failed: %x\n", fe);
ret = 1;
}
feclearexcept (FE_ALL_EXCEPT);
d2 += d1;
fe = fetestexcept (FE_ALL_EXCEPT);
if (fe != (FE_OVERFLOW | FE_INEXACT))
{
printf ("double test failed: %x\n", fe);
ret = 1;
}
feclearexcept (FE_ALL_EXCEPT);
ld2 += ld1;
fe = fetestexcept (FE_ALL_EXCEPT);
if (fe != (FE_OVERFLOW | FE_INEXACT))
{
printf ("long double test failed: %x\n", fe);
ret = 1;
}
return ret;
}
Here both native types set these 2 flags on the overflow, long double
(soft-fp emulated) does not.
> (_FP_PACK_SEMIRAW): Update comment. Do not round if NaN.
> --- libc25-cvstip-20061005/soft-fp/op-common.h 2006-04-04 03:24:47.000000000 -0500
> +++ libc24/soft-fp/op-common.h 2006-10-10 09:45:04.000000000 -0500
> @@ -131,10 +131,12 @@
>
> /* Prepare to pack an fp value in semi-raw mode: the mantissa is
> rounded and shifted right, with the rounding possibly increasing
> - the exponent (including changing a finite value to infinity). */
> + the exponent (including changing a finite value to infinity).
> + Be sure not to round NaNs. */
> #define _FP_PACK_SEMIRAW(fs, wc, X) \
> do { \
> - _FP_ROUND(wc, X); \
> + if(X##_e != _FP_EXPMAX_##fs) \
> + _FP_ROUND(wc, X); \
> if (_FP_FRAC_HIGH_##fs(X) \
> & (_FP_OVERFLOW_##fs >> 1)) \
> { \
Can you please explain where does this make a difference?
NaNs should have the least significant 3 bits (_FP_WORKBITS) cleared
since they were shifted up during unpacking, so even when _FP_ROUND is
called on the NaN, it shouldn't affect it in any way (_FP_ROUND does
nothing if the work bits are all 0).
> * soft-fp/op-4.h (__FP_FRAC_SUB_3, __FP_FRAC_SUB_4): Correct borrow
> handling for high words.
>
> --- libc25-cvstip-20061005/soft-fp/op-4.h 2006-04-04 03:24:47.000000000 -0500
> +++ libc24/soft-fp/op-4.h 2006-10-10 09:45:04.000000000 -0500
> @@ -564,7 +564,7 @@
> r1 = x1 - y1; \
> _c2 = r1 > x1; \
> r1 -= _c1; \
> - _c2 |= r1 > _c1; \
> + _c2 |= _c1 && (y1 == x1); \
> r2 = x2 - y2 - _c2; \
> } while (0)
> #endif
> @@ -578,11 +578,11 @@
> r1 = x1 - y1; \
> _c2 = r1 > x1; \
> r1 -= _c1; \
> - _c2 |= r1 > _c1; \
> + _c2 |= _c1 && (y1 == x1); \
> r2 = x2 - y2; \
> _c3 = r2 > x2; \
> r2 -= _c2; \
> - _c3 |= r2 > _c2; \
> + _c3 |= _c2 && (y2 == x2); \
> r3 = x3 - y3 - _c3; \
> } while (0)
> #endif
This is correct fix and doesn't affect in tree architectures only
because those that use soft-fp are either 64-bit and thus use just op-2.h
and op-1.h (sparc64, alpha) or have their own __FP_FRAC_SUB_{3,4} (sparc32).
On sparc32 this can be reproduced after commenting out the arch specific
__FP_FRAC_SUB_{3,4} definitions in sysdeps/sparc/sparc32/soft-fp/sfp-machine.h
with
#include <stdio.h>
volatile long double d = 0x1.0000000000010000000100000001p+1;
volatile long double e = 0x1.0000000000000000000000000001p+1;
int
main (void)
{
d -= e;
if (d != 0x1.0p-47)
{
printf ("test failed %.28La\n", d);
return 1;
}
return 0;
}
where it prints test failed -0x1.fffffff000000000000000000000p+13.
I guess there are many ways how to fix this, the one from your patch:
r1 = x1 - y1; \
_c2 = r1 > x1; \
r1 -= _c1; \
- _c2 |= r1 > _c1; \
+ _c2 |= _c1 && (y1 == x1); \
r2 = x2 - y2 - _c2; \
(i.e. borrow on r1 -= _c1 if old r1 was zero and _c1 non-zero)
or e.g.:
r1 = x1 - y1; \
_c2 = r1 > x1; \
+ _c2 |= (r1 - _c1) > r1; \
r1 -= _c1; \
- _c2 |= r1 > _c1; \
r2 = x2 - y2 - _c2; \
or:
- r1 = x1 - y1; \
- _c2 = r1 > x1; \
- r1 -= _c1; \
- _c2 |= r1 > _c1; \
+ r2 = x1 - y1; \
+ _c2 = r2 > x1; \
+ r1 = r2 - _c1; \
+ _c2 |= r1 > r2; \
r2 = x2 - y2 - _c2; \
Jakub