This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796)

From: "Joseph S. Myers" <joseph at codesourcery dot com>
To: <libc-alpha at sourceware dot org>
Cc: David Miller <davem at davemloft dot net>, <libc-ports at sourceware dot org>
Date: Fri, 2 Nov 2012 02:14:49 +0000
Subject: Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796)
[This patch includes architecture changes, see below; most are pretty
mechanical, but the SPARC changes could perhaps do with review /
testing.]

Another instance of fma code assuming round-to-nearest is bug 14796,
the use of Dekker multiplication and Knuth addition.

Dekker multiplication definitely only works in round-to-nearest in the
case of an odd number of mantissa bits (in other rounding modes, some
of the intermediate multplications can end up multiplying two numbers
with (M+1)/2 bits, and so being inexact).  I don't have testcases for
it not working in the case of an even number of mantissa bits for
ldbl-96 - but in any case, Knuth addition cannot work reliably in
modes other than round-to-nearest (because if x and y are sufficiently
different in size, the difference between the exact value of x+y and a
not-to-nearest rounded value may not itself be exactly representable).
And while the proofs given by Boldo and Melquiond apply more or less
unchanged to the case of the desired rounding for the final result not
being round-to-nearest, they certainly don't allow for the addition
step not forming an exact low part.

So this patch makes all relevant fma implementations set
round-to-nearest for the Dekker and Knuth algorithms (even in the
absence of testcases showing wrong results from the present ldbl-96
code, it doesn't seem wise to go outside the conditions of the
proofs).  In turn this requires other changes:

* Exact zero returns from nonzero arguments previously managed to get
  the correct sign of zero through the Knuth addition being carried
  out in the rounding mode of the final computation.  Now that
  addition is carried out in round-to-nearest, getting -0 from x - x
  in round-downward mode requires new special cases.  Such exact zero
  results can only occur when the multiplication is exact (so m2 = 0)
  and m1 = -z (so a1 = 0, which implies a2 = 0).  Thus appropriate
  checks are introduced.

* Inexact exceptions from the Dekker/Knuth code must be cleared before
  the round-to-odd code (previously the feholdexcept call was located
  in a position to do that); clearing them is a good idea anyway,
  since they are irrelevant to whether the final fma is exact, and as
  a fully specified IEEE arithmetic operation, the fma function should
  be raising inexact exceptins if and only if the final fma is
  inexact.  To avoid check-localplt failures, this requires a hidden
  alias for feclearexcept, so requiring changes to each file
  implementing that function.

* As the dbl-64 code will on x86_64 only be setting the SSE rounding
  mode with libc_feholdexcept_setround, and only restoring that mode
  with libc_feupdateenv, a corresponding libc_fesetround is needed,
  now we have a function that sets the rounding mode twice to two
  different modes before restoring it for the final calculation and
  return.  This results in the fenv_private / math_private.h changes.

Tested x86_64 (multi-arch both enabled and disabled) and x86.

2012-11-02  Joseph Myers  <joseph@codesourcery.com>

	[BZ #14796]
	* sysdeps/ieee754/dbl-64/s_fma.c (__fma): Set rounding mode to
	FE_TONEAREST before applying Dekker multiplication and Knuth
	addition.  Clear inexact exceptions and check for exact zero
	results afterwards.
	* sysdeps/ieee754/ldbl-128/s_fmal.c (__fmal): Likewise.
	* sysdeps/ieee754/ldbl-96/s_fma.c (__fma): Likewise.
	* sysdeps/ieee754/ldbl-96/s_fmal.c (__fmal): Likewise.
	* math/libm-test.inc (fma_test): Add more tests.
	(fma_test_towardzero): Likewise.
	(fma_test_downward): Likewise.
	(fma_test_upward): Likewise.
	* sysdeps/generic/math_private.h (default_libc_fesetround): New
	function.
	(libc_fesetround): New macro.
	(libc_fesetroundf): Likewise.
	(libc_fesetroundl): Likewise.
	* sysdeps/i386/fpu/fenv_private.h (libc_fesetround_sse): New
	function.
	(libc_fesetround_387): Likewise.
	(libc_fesetroundf): New macro.
	(libc_fesetround): Likewise.
	(libc_fesetroundl): Likewise.
	* sysdeps/sparc/fpu/fenv_private.h (libc_fesetround): New
	function.
	(libc_fesetroundf): New macro.
	(libc_fesetround): Likewise.
	(libc_fesetroundl): Likewise.
	* include/fenv.h (feclearexcept): Add libm_hidden_proto.
	* math/fclrexcpt.c (feclearexcept): Add libm_hidden_ver.
	* sysdeps/i386/fpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_ver.
	* sysdeps/powerpc/fpu/fclrexcpt.c (feclearexcept): Likewise.
	* sysdeps/s390/fpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_def.
	* sysdeps/sh/sh4/fpu/fclrexcpt.c (feclearexcept): Likewise.
	* sysdeps/sparc/fpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_ver.
	* sysdeps/x86_64/fpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_def.

ports/ChangeLog.alpha
2012-11-02  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/alpha/fpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_ver.

ports/ChangeLog.am33
2012-11-02  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/am33/fpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_ver.

ports/ChangeLog.arm
2012-11-02  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/arm/fclrexcpt.c (feclearexcept): Add libm_hidden_ver.

ports/ChangeLog.hppa
2012-11-02  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/hppa/fpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_def.

ports/ChangeLog.ia64
2012-11-02  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/ia64/fpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_def.

ports/ChangeLog.m68k
2012-11-02  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/m68k/fpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_ver.

ports/ChangeLog.mips
2012-11-02  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/mips/fpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_def.

ports/ChangeLog.powerpc
2012-11-02  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/powerpc/nofpu/fclrexcpt.c (feclearexcept): Add
	libm_hidden_ver.

diff --git a/include/fenv.h b/include/fenv.h
index 59d4c3f..8e36fdb 100644
--- a/include/fenv.h
+++ b/include/fenv.h
@@ -19,5 +19,6 @@ libm_hidden_proto (fesetround)
 libm_hidden_proto (feholdexcept)
 libm_hidden_proto (feupdateenv)
 libm_hidden_proto (fetestexcept)
+libm_hidden_proto (feclearexcept)
 
 #endif
diff --git a/math/fclrexcpt.c b/math/fclrexcpt.c
index dcdcfbb..bbf1011 100644
--- a/math/fclrexcpt.c
+++ b/math/fclrexcpt.c
@@ -30,6 +30,7 @@ __feclearexcept (int excepts)
 strong_alias (__feclearexcept, __old_feclearexcept)
 compat_symbol (libm, __old_feclearexcept, feclearexcept, GLIBC_2_1);
 #endif
+libm_hidden_ver (__feclearexcept, feclearexcept)
 versioned_symbol (libm, __feclearexcept, feclearexcept, GLIBC_2_2);
 
 stub_warning (feclearexcept)
diff --git a/math/libm-test.inc b/math/libm-test.inc
index 258c960..48241e0 100644
--- a/math/libm-test.inc
+++ b/math/libm-test.inc
@@ -4649,6 +4649,10 @@ fma_test (void)
   TEST_fff_f (fma, 0x1p-149, -0x1p-149, 0x1p-149, 0x1p-149, UNDERFLOW_EXCEPTION);
   TEST_fff_f (fma, 0x1p-149, 0x1p-149, -0x1p-149, -0x1p-149, UNDERFLOW_EXCEPTION);
   TEST_fff_f (fma, 0x1p-149, -0x1p-149, -0x1p-149, -0x1p-149, UNDERFLOW_EXCEPTION);
+  TEST_fff_f (fma, 0x0.fffp0, 0x0.fffp0, -0x0.ffep0, 0x1p-24);
+  TEST_fff_f (fma, 0x0.fffp0, -0x0.fffp0, 0x0.ffep0, -0x1p-24);
+  TEST_fff_f (fma, -0x0.fffp0, 0x0.fffp0, 0x0.ffep0, -0x1p-24);
+  TEST_fff_f (fma, -0x0.fffp0, -0x0.fffp0, -0x0.ffep0, 0x1p-24);
 #endif
 #if defined (TEST_DOUBLE) && DBL_MANT_DIG == 53
   TEST_fff_f (fma, 0x1.7fp+13, 0x1.0000000000001p+0, 0x1.ffep-48, 0x1.7f00000000001p+13);
@@ -4695,6 +4699,10 @@ fma_test (void)
   TEST_fff_f (fma, 0x1p-1074, -0x1p-1074, 0x1p-1074, 0x1p-1074, UNDERFLOW_EXCEPTION);
   TEST_fff_f (fma, 0x1p-1074, 0x1p-1074, -0x1p-1074, -0x1p-1074, UNDERFLOW_EXCEPTION);
   TEST_fff_f (fma, 0x1p-1074, -0x1p-1074, -0x1p-1074, -0x1p-1074, UNDERFLOW_EXCEPTION);
+  TEST_fff_f (fma, 0x0.fffffffffffff8p0, 0x0.fffffffffffff8p0, -0x0.fffffffffffffp0, 0x1p-106);
+  TEST_fff_f (fma, 0x0.fffffffffffff8p0, -0x0.fffffffffffff8p0, 0x0.fffffffffffffp0, -0x1p-106);
+  TEST_fff_f (fma, -0x0.fffffffffffff8p0, 0x0.fffffffffffff8p0, 0x0.fffffffffffffp0, -0x1p-106);
+  TEST_fff_f (fma, -0x0.fffffffffffff8p0, -0x0.fffffffffffff8p0, -0x0.fffffffffffffp0, 0x1p-106);
 #endif
 #if defined (TEST_LDOUBLE) && LDBL_MANT_DIG == 64
   TEST_fff_f (fma, -0x8.03fcp+3696L, 0xf.fffffffffffffffp-6140L, 0x8.3ffffffffffffffp-2450L, -0x8.01ecp-2440L);
@@ -4727,6 +4735,10 @@ fma_test (void)
   TEST_fff_f (fma, 0x1p-16445L, -0x1p-16445L, 0x1p-16445L, 0x1p-16445L, UNDERFLOW_EXCEPTION);
   TEST_fff_f (fma, 0x1p-16445L, 0x1p-16445L, -0x1p-16445L, -0x1p-16445L, UNDERFLOW_EXCEPTION);
   TEST_fff_f (fma, 0x1p-16445L, -0x1p-16445L, -0x1p-16445L, -0x1p-16445L, UNDERFLOW_EXCEPTION);
+  TEST_fff_f (fma, 0x0.ffffffffffffffffp0L, 0x0.ffffffffffffffffp0L, -0x0.fffffffffffffffep0L, 0x1p-128L);
+  TEST_fff_f (fma, 0x0.ffffffffffffffffp0L, -0x0.ffffffffffffffffp0L, 0x0.fffffffffffffffep0L, -0x1p-128L);
+  TEST_fff_f (fma, -0x0.ffffffffffffffffp0L, 0x0.ffffffffffffffffp0L, 0x0.fffffffffffffffep0L, -0x1p-128L);
+  TEST_fff_f (fma, -0x0.ffffffffffffffffp0L, -0x0.ffffffffffffffffp0L, -0x0.fffffffffffffffep0L, 0x1p-128L);
 #endif
 #if defined (TEST_LDOUBLE) && LDBL_MANT_DIG == 113
   TEST_fff_f (fma, 0x1.bb2de33e02ccbbfa6e245a7c1f71p-2584L, -0x1.6b500daf0580d987f1bc0cadfcddp-13777L, 0x1.613cd91d9fed34b33820e5ab9d8dp-16378L, -0x1.3a79fb50eb9ce887cffa0f09bd9fp-16360L);
@@ -4766,6 +4778,10 @@ fma_test (void)
   TEST_fff_f (fma, 0x1p-16494L, -0x1p-16494L, 0x1p-16494L, 0x1p-16494L, UNDERFLOW_EXCEPTION);
   TEST_fff_f (fma, 0x1p-16494L, 0x1p-16494L, -0x1p-16494L, -0x1p-16494L, UNDERFLOW_EXCEPTION);
   TEST_fff_f (fma, 0x1p-16494L, -0x1p-16494L, -0x1p-16494L, -0x1p-16494L, UNDERFLOW_EXCEPTION);
+  TEST_fff_f (fma, 0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffffp0L, 0x1p-226L);
+  TEST_fff_f (fma, 0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffffp0L, -0x1p-226L);
+  TEST_fff_f (fma, -0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffffp0L, -0x1p-226L);
+  TEST_fff_f (fma, -0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffffp0L, 0x1p-226L);
 #endif
 
   END (fma);
@@ -4846,6 +4862,10 @@ fma_test_towardzero (void)
       TEST_fff_f (fma, 0x1p-149, -0x1p-149, 0x1p-149, plus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-149, 0x1p-149, -0x1p-149, minus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-149, -0x1p-149, -0x1p-149, -0x1p-149, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.fffp0, 0x0.fffp0, -0x0.ffep0, 0x1p-24);
+      TEST_fff_f (fma, 0x0.fffp0, -0x0.fffp0, 0x0.ffep0, -0x1p-24);
+      TEST_fff_f (fma, -0x0.fffp0, 0x0.fffp0, 0x0.ffep0, -0x1p-24);
+      TEST_fff_f (fma, -0x0.fffp0, -0x0.fffp0, -0x0.ffep0, 0x1p-24);
 #endif
 #if defined (TEST_DOUBLE) && DBL_MANT_DIG == 53
       TEST_fff_f (fma, 0x1.4p-1022, 0x1.0000000000002p-1, 0x1p-1024, 0x1.c000000000002p-1023, UNDERFLOW_EXCEPTION);
@@ -4872,6 +4892,10 @@ fma_test_towardzero (void)
       TEST_fff_f (fma, 0x1p-1074, -0x1p-1074, 0x1p-1074, plus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-1074, 0x1p-1074, -0x1p-1074, minus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-1074, -0x1p-1074, -0x1p-1074, -0x1p-1074, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.fffffffffffff8p0, 0x0.fffffffffffff8p0, -0x0.fffffffffffffp0, 0x1p-106);
+      TEST_fff_f (fma, 0x0.fffffffffffff8p0, -0x0.fffffffffffff8p0, 0x0.fffffffffffffp0, -0x1p-106);
+      TEST_fff_f (fma, -0x0.fffffffffffff8p0, 0x0.fffffffffffff8p0, 0x0.fffffffffffffp0, -0x1p-106);
+      TEST_fff_f (fma, -0x0.fffffffffffff8p0, -0x0.fffffffffffff8p0, -0x0.fffffffffffffp0, 0x1p-106);
 #endif
 #if defined (TEST_LDOUBLE) && LDBL_MANT_DIG == 64
       TEST_fff_f (fma, 0x1.4p-16382L, 0x1.0000000000000004p-1L, 0x1p-16384L, 0x1.c000000000000004p-16383L, UNDERFLOW_EXCEPTION);
@@ -4898,6 +4922,10 @@ fma_test_towardzero (void)
       TEST_fff_f (fma, 0x1p-16445L, -0x1p-16445L, 0x1p-16445L, plus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16445L, 0x1p-16445L, -0x1p-16445L, minus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16445L, -0x1p-16445L, -0x1p-16445L, -0x1p-16445L, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffp0L, 0x0.ffffffffffffffffp0L, -0x0.fffffffffffffffep0L, 0x1p-128L);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffp0L, -0x0.ffffffffffffffffp0L, 0x0.fffffffffffffffep0L, -0x1p-128L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffp0L, 0x0.ffffffffffffffffp0L, 0x0.fffffffffffffffep0L, -0x1p-128L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffp0L, -0x0.ffffffffffffffffp0L, -0x0.fffffffffffffffep0L, 0x1p-128L);
 #endif
 #if defined (TEST_LDOUBLE) && LDBL_MANT_DIG == 113
       TEST_fff_f (fma, 0x1.4p-16382L, 0x1.0000000000000000000000000002p-1L, 0x1p-16384L, 0x1.c000000000000000000000000002p-16383L, UNDERFLOW_EXCEPTION);
@@ -4924,6 +4952,10 @@ fma_test_towardzero (void)
       TEST_fff_f (fma, 0x1p-16494L, -0x1p-16494L, 0x1p-16494L, plus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16494L, 0x1p-16494L, -0x1p-16494L, minus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16494L, -0x1p-16494L, -0x1p-16494L, -0x1p-16494L, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffffp0L, 0x1p-226L);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffffp0L, -0x1p-226L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffffp0L, -0x1p-226L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffffp0L, 0x1p-226L);
 #endif
     }
 
@@ -5007,6 +5039,10 @@ fma_test_downward (void)
       TEST_fff_f (fma, 0x1p-149, -0x1p-149, 0x1p-149, plus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-149, 0x1p-149, -0x1p-149, -0x1p-149, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-149, -0x1p-149, -0x1p-149, -0x1p-148, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.fffp0, 0x0.fffp0, -0x0.ffep0, 0x1p-24);
+      TEST_fff_f (fma, 0x0.fffp0, -0x0.fffp0, 0x0.ffep0, -0x1p-24);
+      TEST_fff_f (fma, -0x0.fffp0, 0x0.fffp0, 0x0.ffep0, -0x1p-24);
+      TEST_fff_f (fma, -0x0.fffp0, -0x0.fffp0, -0x0.ffep0, 0x1p-24);
 #endif
 #if defined (TEST_DOUBLE) && DBL_MANT_DIG == 53
       TEST_fff_f (fma, 0x1.4p-1022, 0x1.0000000000002p-1, 0x1p-1024, 0x1.c000000000002p-1023, UNDERFLOW_EXCEPTION);
@@ -5033,6 +5069,10 @@ fma_test_downward (void)
       TEST_fff_f (fma, 0x1p-1074, -0x1p-1074, 0x1p-1074, plus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-1074, 0x1p-1074, -0x1p-1074, -0x1p-1074, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-1074, -0x1p-1074, -0x1p-1074, -0x1p-1073, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.fffffffffffff8p0, 0x0.fffffffffffff8p0, -0x0.fffffffffffffp0, 0x1p-106);
+      TEST_fff_f (fma, 0x0.fffffffffffff8p0, -0x0.fffffffffffff8p0, 0x0.fffffffffffffp0, -0x1p-106);
+      TEST_fff_f (fma, -0x0.fffffffffffff8p0, 0x0.fffffffffffff8p0, 0x0.fffffffffffffp0, -0x1p-106);
+      TEST_fff_f (fma, -0x0.fffffffffffff8p0, -0x0.fffffffffffff8p0, -0x0.fffffffffffffp0, 0x1p-106);
 #endif
 #if defined (TEST_LDOUBLE) && LDBL_MANT_DIG == 64
       TEST_fff_f (fma, 0x1.4p-16382L, 0x1.0000000000000004p-1L, 0x1p-16384L, 0x1.c000000000000004p-16383L, UNDERFLOW_EXCEPTION);
@@ -5059,6 +5099,10 @@ fma_test_downward (void)
       TEST_fff_f (fma, 0x1p-16445L, -0x1p-16445L, 0x1p-16445L, plus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16445L, 0x1p-16445L, -0x1p-16445L, -0x1p-16445L, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16445L, -0x1p-16445L, -0x1p-16445L, -0x1p-16444L, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffp0L, 0x0.ffffffffffffffffp0L, -0x0.fffffffffffffffep0L, 0x1p-128L);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffp0L, -0x0.ffffffffffffffffp0L, 0x0.fffffffffffffffep0L, -0x1p-128L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffp0L, 0x0.ffffffffffffffffp0L, 0x0.fffffffffffffffep0L, -0x1p-128L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffp0L, -0x0.ffffffffffffffffp0L, -0x0.fffffffffffffffep0L, 0x1p-128L);
 #endif
 #if defined (TEST_LDOUBLE) && LDBL_MANT_DIG == 113
       TEST_fff_f (fma, 0x1.4p-16382L, 0x1.0000000000000000000000000002p-1L, 0x1p-16384L, 0x1.c000000000000000000000000002p-16383L, UNDERFLOW_EXCEPTION);
@@ -5085,6 +5129,10 @@ fma_test_downward (void)
       TEST_fff_f (fma, 0x1p-16494L, -0x1p-16494L, 0x1p-16494L, plus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16494L, 0x1p-16494L, -0x1p-16494L, -0x1p-16494L, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16494L, -0x1p-16494L, -0x1p-16494L, -0x1p-16493L, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffffp0L, 0x1p-226L);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffffp0L, -0x1p-226L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffffp0L, -0x1p-226L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffffp0L, 0x1p-226L);
 #endif
     }
 
@@ -5168,6 +5216,10 @@ fma_test_upward (void)
       TEST_fff_f (fma, 0x1p-149, -0x1p-149, 0x1p-149, 0x1p-149, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-149, 0x1p-149, -0x1p-149, minus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-149, -0x1p-149, -0x1p-149, -0x1p-149, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.fffp0, 0x0.fffp0, -0x0.ffep0, 0x1p-24);
+      TEST_fff_f (fma, 0x0.fffp0, -0x0.fffp0, 0x0.ffep0, -0x1p-24);
+      TEST_fff_f (fma, -0x0.fffp0, 0x0.fffp0, 0x0.ffep0, -0x1p-24);
+      TEST_fff_f (fma, -0x0.fffp0, -0x0.fffp0, -0x0.ffep0, 0x1p-24);
 #endif
 #if defined (TEST_DOUBLE) && DBL_MANT_DIG == 53
       TEST_fff_f (fma, 0x1.4p-1022, 0x1.0000000000002p-1, 0x1p-1024, 0x1.c000000000004p-1023, UNDERFLOW_EXCEPTION);
@@ -5194,6 +5246,10 @@ fma_test_upward (void)
       TEST_fff_f (fma, 0x1p-1074, -0x1p-1074, 0x1p-1074, 0x1p-1074, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-1074, 0x1p-1074, -0x1p-1074, minus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-1074, -0x1p-1074, -0x1p-1074, -0x1p-1074, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.fffffffffffff8p0, 0x0.fffffffffffff8p0, -0x0.fffffffffffffp0, 0x1p-106);
+      TEST_fff_f (fma, 0x0.fffffffffffff8p0, -0x0.fffffffffffff8p0, 0x0.fffffffffffffp0, -0x1p-106);
+      TEST_fff_f (fma, -0x0.fffffffffffff8p0, 0x0.fffffffffffff8p0, 0x0.fffffffffffffp0, -0x1p-106);
+      TEST_fff_f (fma, -0x0.fffffffffffff8p0, -0x0.fffffffffffff8p0, -0x0.fffffffffffffp0, 0x1p-106);
 #endif
 #if defined (TEST_LDOUBLE) && LDBL_MANT_DIG == 64
       TEST_fff_f (fma, 0x1.4p-16382L, 0x1.0000000000000004p-1L, 0x1p-16384L, 0x1.c000000000000008p-16383L, UNDERFLOW_EXCEPTION);
@@ -5220,6 +5276,10 @@ fma_test_upward (void)
       TEST_fff_f (fma, 0x1p-16445L, -0x1p-16445L, 0x1p-16445L, 0x1p-16445L, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16445L, 0x1p-16445L, -0x1p-16445L, minus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16445L, -0x1p-16445L, -0x1p-16445L, -0x1p-16445L, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffp0L, 0x0.ffffffffffffffffp0L, -0x0.fffffffffffffffep0L, 0x1p-128L);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffp0L, -0x0.ffffffffffffffffp0L, 0x0.fffffffffffffffep0L, -0x1p-128L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffp0L, 0x0.ffffffffffffffffp0L, 0x0.fffffffffffffffep0L, -0x1p-128L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffp0L, -0x0.ffffffffffffffffp0L, -0x0.fffffffffffffffep0L, 0x1p-128L);
 #endif
 #if defined (TEST_LDOUBLE) && LDBL_MANT_DIG == 113
       TEST_fff_f (fma, 0x1.4p-16382L, 0x1.0000000000000000000000000002p-1L, 0x1p-16384L, 0x1.c000000000000000000000000004p-16383L, UNDERFLOW_EXCEPTION);
@@ -5246,6 +5306,10 @@ fma_test_upward (void)
       TEST_fff_f (fma, 0x1p-16494L, -0x1p-16494L, 0x1p-16494L, 0x1p-16494L, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16494L, 0x1p-16494L, -0x1p-16494L, minus_zero, UNDERFLOW_EXCEPTION);
       TEST_fff_f (fma, 0x1p-16494L, -0x1p-16494L, -0x1p-16494L, -0x1p-16494L, UNDERFLOW_EXCEPTION);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffffp0L, 0x1p-226L);
+      TEST_fff_f (fma, 0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffffp0L, -0x1p-226L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffff8p0L, 0x0.ffffffffffffffffffffffffffffp0L, -0x1p-226L);
+      TEST_fff_f (fma, -0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffff8p0L, -0x0.ffffffffffffffffffffffffffffp0L, 0x1p-226L);
 #endif
     }
 
diff --git a/ports/sysdeps/alpha/fpu/fclrexcpt.c b/ports/sysdeps/alpha/fpu/fclrexcpt.c
index 2d2bd94..8459652 100644
--- a/ports/sysdeps/alpha/fpu/fclrexcpt.c
+++ b/ports/sysdeps/alpha/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 1997,99,2000,01 Free Software Foundation, Inc.
+   Copyright (C) 1997-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Richard Henderson <rth@tamu.edu>, 1997.
 
@@ -43,4 +43,5 @@ strong_alias (__feclearexcept, __old_feclearexcept)
 compat_symbol (libm, __old_feclearexcept, feclearexcept, GLIBC_2_1);
 #endif
 
+libm_hidden_ver (__feclearexcept, feclearexcept)
 versioned_symbol (libm, __feclearexcept, feclearexcept, GLIBC_2_2);
diff --git a/ports/sysdeps/am33/fpu/fclrexcpt.c b/ports/sysdeps/am33/fpu/fclrexcpt.c
index 2b15f45..492ea38 100644
--- a/ports/sysdeps/am33/fpu/fclrexcpt.c
+++ b/ports/sysdeps/am33/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 1998, 1999, 2000, 2002, 2004 Free Software Foundation, Inc.
+   Copyright (C) 1998-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Alexandre Oliva <aoliva@redhat.com>
    based on corresponding file in the MIPS port.
@@ -48,4 +48,5 @@ __feclearexcept (int excepts)
   return 0;
 }
 
+libm_hidden_ver (__feclearexcept, feclearexcept)
 versioned_symbol (libm, __feclearexcept, feclearexcept, GLIBC_2_2);
diff --git a/ports/sysdeps/arm/fclrexcpt.c b/ports/sysdeps/arm/fclrexcpt.c
index 23435fb..fb106a5 100644
--- a/ports/sysdeps/arm/fclrexcpt.c
+++ b/ports/sysdeps/arm/fclrexcpt.c
@@ -54,4 +54,5 @@ strong_alias (__feclearexcept, __old_feclearexcept)
 compat_symbol (libm, __old_feclearexcept, feclearexcept, GLIBC_2_1);
 #endif
 
+libm_hidden_ver (__feclearexcept, feclearexcept)
 versioned_symbol (libm, __feclearexcept, feclearexcept, GLIBC_2_2);
diff --git a/ports/sysdeps/hppa/fpu/fclrexcpt.c b/ports/sysdeps/hppa/fpu/fclrexcpt.c
index 5d1e593..bbc2c70 100644
--- a/ports/sysdeps/hppa/fpu/fclrexcpt.c
+++ b/ports/sysdeps/hppa/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 2000 Free Software Foundation, Inc.
+   Copyright (C) 2000-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by David Huggins-Daines <dhd@debian.org>, 2000
 
@@ -33,3 +33,4 @@ feclearexcept (int excepts)
   /* Success.  */
   return 0;
 }
+libm_hidden_def (feclearexcept)
diff --git a/ports/sysdeps/ia64/fpu/fclrexcpt.c b/ports/sysdeps/ia64/fpu/fclrexcpt.c
index 84f8327..c099fd3 100644
--- a/ports/sysdeps/ia64/fpu/fclrexcpt.c
+++ b/ports/sysdeps/ia64/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 1997, 1999, 2000 Free Software Foundation, Inc.
+   Copyright (C) 1997-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Christian Boissat <Christian.Boissat@cern.ch>, 1999 and
                   Jes Sorensen <Jes.Sorensen@cern.ch>, 2000
@@ -36,3 +36,4 @@ feclearexcept (int excepts)
   /* success */
   return 0;
 }
+libm_hidden_def (feclearexcept)
diff --git a/ports/sysdeps/m68k/fpu/fclrexcpt.c b/ports/sysdeps/m68k/fpu/fclrexcpt.c
index ceda99c..cfc8d9e 100644
--- a/ports/sysdeps/m68k/fpu/fclrexcpt.c
+++ b/ports/sysdeps/m68k/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 1997,99,2000,01 Free Software Foundation, Inc.
+   Copyright (C) 1997-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Andreas Schwab <schwab@issan.informatik.uni-dortmund.de>
 
@@ -46,4 +46,5 @@ strong_alias (__feclearexcept, __old_feclearexcept)
 compat_symbol (libm, __old_feclearexcept, feclearexcept, GLIBC_2_1);
 #endif
 
+libm_hidden_ver (__feclearexcept, feclearexcept)
 versioned_symbol (libm, __feclearexcept, feclearexcept, GLIBC_2_2);
diff --git a/ports/sysdeps/mips/fpu/fclrexcpt.c b/ports/sysdeps/mips/fpu/fclrexcpt.c
index f97d892..f4709b4 100644
--- a/ports/sysdeps/mips/fpu/fclrexcpt.c
+++ b/ports/sysdeps/mips/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 1998, 1999, 2000, 2002 Free Software Foundation, Inc.
+   Copyright (C) 1998-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Andreas Jaeger <aj@suse.de>, 1998.
 
@@ -44,3 +44,4 @@ feclearexcept (int excepts)
   /* Success.  */
   return 0;
 }
+libm_hidden_def (feclearexcept)
diff --git a/ports/sysdeps/powerpc/nofpu/fclrexcpt.c b/ports/sysdeps/powerpc/nofpu/fclrexcpt.c
index 768fd8f..f4b9016 100644
--- a/ports/sysdeps/powerpc/nofpu/fclrexcpt.c
+++ b/ports/sysdeps/powerpc/nofpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear floating-point exceptions (soft-float edition).
-   Copyright (C) 2002 Free Software Foundation, Inc.
+   Copyright (C) 2002-2012 Free Software Foundation, Inc.
    Contributed by Aldy Hernandez <aldyh@redhat.com>, 2002.
    This file is part of the GNU C Library.
 
@@ -33,4 +33,5 @@ strong_alias (__feclearexcept, __old_feclearexcept)
 compat_symbol (libm, __old_feclearexcept, feclearexcept, GLIBC_2_1);
 #endif
 
+libm_hidden_ver (__feclearexcept, feclearexcept)
 versioned_symbol (libm, __feclearexcept, feclearexcept, GLIBC_2_2);
diff --git a/sysdeps/generic/math_private.h b/sysdeps/generic/math_private.h
index b375bc0..7661788 100644
--- a/sysdeps/generic/math_private.h
+++ b/sysdeps/generic/math_private.h
@@ -402,6 +402,22 @@ default_libc_feholdexcept (fenv_t *e)
 #endif
 
 static __always_inline void
+default_libc_fesetround (int r)
+{
+  (void) fesetround (r);
+}
+
+#ifndef libc_fesetround
+# define libc_fesetround  default_libc_fesetround
+#endif
+#ifndef libc_fesetroundf
+# define libc_fesetroundf default_libc_fesetround
+#endif
+#ifndef libc_fesetroundl
+# define libc_fesetroundl default_libc_fesetround
+#endif
+
+static __always_inline void
 default_libc_feholdexcept_setround (fenv_t *e, int r)
 {
   feholdexcept (e);
diff --git a/sysdeps/i386/fpu/fclrexcpt.c b/sysdeps/i386/fpu/fclrexcpt.c
index f24d07f..f28ef6b 100644
--- a/sysdeps/i386/fpu/fclrexcpt.c
+++ b/sysdeps/i386/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 1997,99,2000, 2001, 2003, 2004 Free Software Foundation, Inc.
+   Copyright (C) 1997-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Ulrich Drepper <drepper@cygnus.com>, 1997.
 
@@ -65,4 +65,5 @@ strong_alias (__feclearexcept, __old_feclearexcept)
 compat_symbol (libm, __old_feclearexcept, feclearexcept, GLIBC_2_1);
 #endif
 
+libm_hidden_ver (__feclearexcept, feclearexcept)
 versioned_symbol (libm, __feclearexcept, feclearexcept, GLIBC_2_2);
diff --git a/sysdeps/i386/fpu/fenv_private.h b/sysdeps/i386/fpu/fenv_private.h
index f33f57c..03f4c97 100644
--- a/sysdeps/i386/fpu/fenv_private.h
+++ b/sysdeps/i386/fpu/fenv_private.h
@@ -77,6 +77,24 @@ libc_feholdexcept_387 (fenv_t *e)
 }
 
 static __always_inline void
+libc_fesetround_sse (int r)
+{
+  unsigned int mxcsr;
+  asm (STMXCSR " %0" : "=m" (*&mxcsr));
+  mxcsr = (mxcsr & ~0x6000) | (r << 3);
+  asm volatile (LDMXCSR " %0" : : "m" (*&mxcsr));
+}
+
+static __always_inline void
+libc_fesetround_387 (int r)
+{
+  fpu_control_t cw;
+  _FPU_GETCW (cw);
+  cw = (cw & ~0xc00) | r;
+  _FPU_SETCW (cw);
+}
+
+static __always_inline void
 libc_feholdexcept_setround_sse (fenv_t *e, int r)
 {
   unsigned int mxcsr;
@@ -247,6 +265,7 @@ libc_feresetround_387 (fenv_t *e)
 
 #ifdef __SSE_MATH__
 # define libc_feholdexceptf		libc_feholdexcept_sse
+# define libc_fesetroundf		libc_fesetround_sse
 # define libc_feholdexcept_setroundf	libc_feholdexcept_setround_sse
 # define libc_fetestexceptf		libc_fetestexcept_sse
 # define libc_fesetenvf			libc_fesetenv_sse
@@ -256,6 +275,7 @@ libc_feresetround_387 (fenv_t *e)
 # define libc_feresetroundf		libc_feresetround_sse
 #else
 # define libc_feholdexceptf		libc_feholdexcept_387
+# define libc_fesetroundf		libc_fesetround_387
 # define libc_feholdexcept_setroundf	libc_feholdexcept_setround_387
 # define libc_fetestexceptf		libc_fetestexcept_387
 # define libc_fesetenvf			libc_fesetenv_387
@@ -267,6 +287,7 @@ libc_feresetround_387 (fenv_t *e)
 
 #ifdef __SSE2_MATH__
 # define libc_feholdexcept		libc_feholdexcept_sse
+# define libc_fesetround		libc_fesetround_sse
 # define libc_feholdexcept_setround	libc_feholdexcept_setround_sse
 # define libc_fetestexcept		libc_fetestexcept_sse
 # define libc_fesetenv			libc_fesetenv_sse
@@ -276,6 +297,7 @@ libc_feresetround_387 (fenv_t *e)
 # define libc_feresetround		libc_feresetround_sse
 #else
 # define libc_feholdexcept		libc_feholdexcept_387
+# define libc_fesetround		libc_fesetround_387
 # define libc_feholdexcept_setround	libc_feholdexcept_setround_387
 # define libc_fetestexcept		libc_fetestexcept_387
 # define libc_fesetenv			libc_fesetenv_387
@@ -286,6 +308,7 @@ libc_feresetround_387 (fenv_t *e)
 #endif /* __SSE2_MATH__ */
 
 #define libc_feholdexceptl		libc_feholdexcept_387
+#define libc_fesetroundl		libc_fesetround_387
 #define libc_feholdexcept_setroundl	libc_feholdexcept_setround_387
 #define libc_fetestexceptl		libc_fetestexcept_387
 #define libc_fesetenvl			libc_fesetenv_387
diff --git a/sysdeps/ieee754/dbl-64/s_fma.c b/sysdeps/ieee754/dbl-64/s_fma.c
index 68c63a1..07fe715 100644
--- a/sysdeps/ieee754/dbl-64/s_fma.c
+++ b/sysdeps/ieee754/dbl-64/s_fma.c
@@ -167,6 +167,9 @@ __fma (double x, double y, double z)
   if (__builtin_expect ((x == 0 || y == 0) && z == 0, 0))
     return x * y + z;
 
+  fenv_t env;
+  libc_feholdexcept_setround (&env, FE_TONEAREST);
+
   /* Multiplication m1 + m2 = x * y using Dekker's algorithm.  */
 #define C ((1 << (DBL_MANT_DIG + 1) / 2) + 1)
   double x1 = x * C;
@@ -185,9 +188,20 @@ __fma (double x, double y, double z)
   t1 = m1 - t1;
   t2 = z - t2;
   double a2 = t1 + t2;
+  feclearexcept (FE_INEXACT);
 
-  fenv_t env;
-  libc_feholdexcept_setround (&env, FE_TOWARDZERO);
+  /* If the result is an exact zero, ensure it has the correct
+     sign.  */
+  if (a1 == 0 && m2 == 0)
+    {
+      libc_feupdateenv (&env);
+      /* Ensure that round-to-nearest value of z + m1 is not
+	 reused.  */
+      asm volatile ("" : "=m" (z) : "m" (z));
+      return z + m1;
+    }
+
+  libc_fesetround (FE_TOWARDZERO);
 
   /* Perform m2 + a2 addition with round to odd.  */
   u.d = a2 + m2;
diff --git a/sysdeps/ieee754/ldbl-128/s_fmal.c b/sysdeps/ieee754/ldbl-128/s_fmal.c
index c6a3d71..d576403 100644
--- a/sysdeps/ieee754/ldbl-128/s_fmal.c
+++ b/sysdeps/ieee754/ldbl-128/s_fmal.c
@@ -170,6 +170,10 @@ __fmal (long double x, long double y, long double z)
   if (__builtin_expect ((x == 0 || y == 0) && z == 0, 0))
     return x * y + z;
 
+  fenv_t env;
+  feholdexcept (&env);
+  fesetround (FE_TONEAREST);
+
   /* Multiplication m1 + m2 = x * y using Dekker's algorithm.  */
 #define C ((1LL << (LDBL_MANT_DIG + 1) / 2) + 1)
   long double x1 = x * C;
@@ -188,9 +192,19 @@ __fmal (long double x, long double y, long double z)
   t1 = m1 - t1;
   t2 = z - t2;
   long double a2 = t1 + t2;
+  feclearexcept (FE_INEXACT);
+
+  /* If the result is an exact zero, ensure it has the correct
+     sign.  */
+  if (a1 == 0 && m2 == 0)
+    {
+      feupdateenv (&env);
+      /* Ensure that round-to-nearest value of z + m1 is not
+	 reused.  */
+      asm volatile ("" : "=m" (z) : "m" (z));
+      return z + m1;
+    }
 
-  fenv_t env;
-  feholdexcept (&env);
   fesetround (FE_TOWARDZERO);
   /* Perform m2 + a2 addition with round to odd.  */
   u.d = a2 + m2;
diff --git a/sysdeps/ieee754/ldbl-96/s_fma.c b/sysdeps/ieee754/ldbl-96/s_fma.c
index 001d806..bf2d794 100644
--- a/sysdeps/ieee754/ldbl-96/s_fma.c
+++ b/sysdeps/ieee754/ldbl-96/s_fma.c
@@ -42,6 +42,10 @@ __fma (double x, double y, double z)
   if (__builtin_expect ((x == 0 || y == 0) && z == 0, 0))
     return x * y + z;
 
+  fenv_t env;
+  feholdexcept (&env);
+  fesetround (FE_TONEAREST);
+
   /* Multiplication m1 + m2 = x * y using Dekker's algorithm.  */
 #define C ((1ULL << (LDBL_MANT_DIG + 1) / 2) + 1)
   long double x1 = (long double) x * C;
@@ -60,9 +64,19 @@ __fma (double x, double y, double z)
   t1 = m1 - t1;
   t2 = z - t2;
   long double a2 = t1 + t2;
+  feclearexcept (FE_INEXACT);
+
+  /* If the result is an exact zero, ensure it has the correct
+     sign.  */
+  if (a1 == 0 && m2 == 0)
+    {
+      feupdateenv (&env);
+      /* Ensure that round-to-nearest value of z + m1 is not
+	 reused.  */
+      asm volatile ("" : "=m" (z) : "m" (z));
+      return z + m1;
+    }
 
-  fenv_t env;
-  feholdexcept (&env);
   fesetround (FE_TOWARDZERO);
   /* Perform m2 + a2 addition with round to odd.  */
   a2 = a2 + m2;
diff --git a/sysdeps/ieee754/ldbl-96/s_fmal.c b/sysdeps/ieee754/ldbl-96/s_fmal.c
index b5fdcba..32e71a1 100644
--- a/sysdeps/ieee754/ldbl-96/s_fmal.c
+++ b/sysdeps/ieee754/ldbl-96/s_fmal.c
@@ -168,6 +168,10 @@ __fmal (long double x, long double y, long double z)
   if (__builtin_expect ((x == 0 || y == 0) && z == 0, 0))
     return x * y + z;
 
+  fenv_t env;
+  feholdexcept (&env);
+  fesetround (FE_TONEAREST);
+
   /* Multiplication m1 + m2 = x * y using Dekker's algorithm.  */
 #define C ((1LL << (LDBL_MANT_DIG + 1) / 2) + 1)
   long double x1 = x * C;
@@ -186,9 +190,19 @@ __fmal (long double x, long double y, long double z)
   t1 = m1 - t1;
   t2 = z - t2;
   long double a2 = t1 + t2;
+  feclearexcept (FE_INEXACT);
+
+  /* If the result is an exact zero, ensure it has the correct
+     sign.  */
+  if (a1 == 0 && m2 == 0)
+    {
+      feupdateenv (&env);
+      /* Ensure that round-to-nearest value of z + m1 is not
+	 reused.  */
+      asm volatile ("" : "=m" (z) : "m" (z));
+      return z + m1;
+    }
 
-  fenv_t env;
-  feholdexcept (&env);
   fesetround (FE_TOWARDZERO);
   /* Perform m2 + a2 addition with round to odd.  */
   u.d = a2 + m2;
diff --git a/sysdeps/powerpc/fpu/fclrexcpt.c b/sysdeps/powerpc/fpu/fclrexcpt.c
index 87376bf..e99cc58 100644
--- a/sysdeps/powerpc/fpu/fclrexcpt.c
+++ b/sysdeps/powerpc/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 1997,99,2000,01 Free Software Foundation, Inc.
+   Copyright (C) 1997-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -44,4 +44,5 @@ strong_alias (__feclearexcept, __old_feclearexcept)
 compat_symbol (libm, __old_feclearexcept, feclearexcept, GLIBC_2_1);
 #endif
 
+libm_hidden_ver (__feclearexcept, feclearexcept)
 versioned_symbol (libm, __feclearexcept, feclearexcept, GLIBC_2_2);
diff --git a/sysdeps/s390/fpu/fclrexcpt.c b/sysdeps/s390/fpu/fclrexcpt.c
index 2352d74..3e8d9bb 100644
--- a/sysdeps/s390/fpu/fclrexcpt.c
+++ b/sysdeps/s390/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 2000 Free Software Foundation, Inc.
+   Copyright (C) 2000-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -37,3 +37,4 @@ feclearexcept (int excepts)
   /* Success.  */
   return 0;
 }
+libm_hidden_def (feclearexcept)
diff --git a/sysdeps/sh/sh4/fpu/fclrexcpt.c b/sysdeps/sh/sh4/fpu/fclrexcpt.c
index b4b2ead..2d31fa6 100644
--- a/sysdeps/sh/sh4/fpu/fclrexcpt.c
+++ b/sysdeps/sh/sh4/fpu/fclrexcpt.c
@@ -39,3 +39,4 @@ feclearexcept (int excepts)
 
   return 0;
 }
+libm_hidden_def (feclearexcept)
diff --git a/sysdeps/sparc/fpu/fclrexcpt.c b/sysdeps/sparc/fpu/fclrexcpt.c
index 82b6ac4..50ec4f4 100644
--- a/sysdeps/sparc/fpu/fclrexcpt.c
+++ b/sysdeps/sparc/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 1997, 1999, 2000 Free Software Foundation, Inc.
+   Copyright (C) 1997-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -39,4 +39,5 @@ strong_alias (__feclearexcept, __old_feclearexcept)
 compat_symbol (libm, __old_feclearexcept, feclearexcept, GLIBC_2_1);
 #endif
 
+libm_hidden_ver (__feclearexcept, feclearexcept)
 versioned_symbol (libm, __feclearexcept, feclearexcept, GLIBC_2_2);
diff --git a/sysdeps/sparc/fpu/fenv_private.h b/sysdeps/sparc/fpu/fenv_private.h
index a6e8e95..8690879 100644
--- a/sysdeps/sparc/fpu/fenv_private.h
+++ b/sysdeps/sparc/fpu/fenv_private.h
@@ -14,6 +14,15 @@ libc_feholdexcept (fenv_t *e)
 }
 
 static __always_inline void
+libc_fesetround (int r)
+{
+  fenv_t etmp;
+  __fenv_stfsr(etmp);
+  etmp = (etmp & ~__FE_ROUND_MASK) | (r);
+  __fenv_ldfsr(etmp);
+}
+
+static __always_inline void
 libc_feholdexcept_setround (fenv_t *e, int r)
 {
   fenv_t etmp;
@@ -79,6 +88,7 @@ libc_feresetround (fenv_t *e)
 }
 
 #define libc_feholdexceptf		libc_feholdexcept
+#define libc_fesetroundf		libc_fesetround
 #define libc_feholdexcept_setroundf	libc_feholdexcept_setround
 #define libc_fetestexceptf		libc_fetestexcept
 #define libc_fesetenvf			libc_fesetenv
@@ -87,6 +97,7 @@ libc_feresetround (fenv_t *e)
 #define libc_feholdsetroundf		libc_feholdsetround
 #define libc_feresetroundf		libc_feresetround
 #define libc_feholdexcept		libc_feholdexcept
+#define libc_fesetround			libc_fesetround
 #define libc_feholdexcept_setround	libc_feholdexcept_setround
 #define libc_fetestexcept		libc_fetestexcept
 #define libc_fesetenv			libc_fesetenv
@@ -95,6 +106,7 @@ libc_feresetround (fenv_t *e)
 #define libc_feholdsetround		libc_feholdsetround
 #define libc_feresetround		libc_feresetround
 #define libc_feholdexceptl		libc_feholdexcept
+#define libc_fesetroundl		libc_fesetround
 #define libc_feholdexcept_setroundl	libc_feholdexcept_setround
 #define libc_fetestexceptl		libc_fetestexcept
 #define libc_fesetenvl			libc_fesetenv
diff --git a/sysdeps/x86_64/fpu/fclrexcpt.c b/sysdeps/x86_64/fpu/fclrexcpt.c
index 5ef3162..2d40313 100644
--- a/sysdeps/x86_64/fpu/fclrexcpt.c
+++ b/sysdeps/x86_64/fpu/fclrexcpt.c
@@ -1,5 +1,5 @@
 /* Clear given exceptions in current floating-point environment.
-   Copyright (C) 2001 Free Software Foundation, Inc.
+   Copyright (C) 2001-2012 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -49,3 +49,4 @@ feclearexcept (int excepts)
   /* Success.  */
   return 0;
 }
+libm_hidden_def (feclearexcept)

-- 
Joseph S. Myers
joseph@codesourcery.com
Follow-Ups:
- Re: Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796)
  - From: David Miller
- Re: Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796)
  - From: David Miller
- Re: Make fma use of Dekker and Knuth algorithms use round-to-nearest (bug 14796)
  - From: Richard Henderson
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]