There is a subtle rounding error in decimal_string->BFP conversion (e.g. strtod()) for values that are equal to, or very close to, half-way points, i.e. sufficiently-long prefixes of the exact decimal representation of a binary half-way point. An example is strtod(" 3.518437208883201171875E+013 ") which truncates to the odd 0x42c0000000000001 when in fact default IEEE rounding should have rounded up to 0x42c0000000000002 (nearest-ties-to-even). In fact, strtod only looks at the first 20 digits (well, it scans the others for validity), so that strtod(" 3.518437208883201171999E+013 " also rounds down, even though it is definitely above the half-way point. Indeed, with an extended fraction this would be JL16'+3.518437208883201171999E+013' = 42C00000 00000001 800A66E1 358F83EC (This is the output of the IBM Research experimental assembler Phantasm, for which I wrote the conversion routines ten years ago.) The first number to round up correctly is "3.518437208883201172000E+013". * (I hope a fixed-width font is used here) ....:....1....:....2....:....3 Here is the broken part of strtod.c -- the comment is simply wrong in the case of exact half-way numbers, where all digits are significant in order to determine which way to round! /* For the fractional part we need not process too many digits. One decimal digits gives us log_2(10) ~ 3.32 bits. If we now compute ceil(BITS / 3) =: N digits we should have enough bits for the result. The remaining decimal digits give us the information that more bits are following. This can be used while rounding. (One added as a safety margin.) */ if (dig_no - int_no > (MANT_DIG - bits + 2) / 3 + 1) { dig_no = int_no + (MANT_DIG - bits + 2) / 3 + 1; more_bits = 1; } else more_bits = 0; The basic idea of limiting the number of decimal digits that need to participate in conversion arithmetic is a good one, btw; it's just that the limit picked by strtod() is too small and does not take the exponent into account. As many digits as it takes to represent a binary halfway point exactly must be processed. The worst case is 752 digits for double, for numbers near Nmin/2. If the binary exponent is known at this point (I think it is, based on quick code browsing), the max can be computed. I enclose the comments from my own conversion program: * * When the normalised binary exponent B exceeds the frac- * tion length F (in bits), the length of an exact decimal * represention is .3 B digits, otherwise (e.g. negative B) * it is (F + .7 B) digits (or (F-B) + .3 B). We need to * detect exact half-way points, so we need to consider F+1 * instead of the F derived from the format. * (Here .3 and .7 are short for log10(2) and log10(5) of course.) Michel.

I just encountered something very similar. I started looking at C++ operator>>, which led me to scanf, which led me to strtof. It appears that glibc's strtof doesn't adhere to the "Recommended Practice" of section 6.4.4.2 and 7.20.1.3 of C99. I see numerous FAILs in the attached code with strtof. I haven't seen any problems with strtod, but I haven't done exhaustive tests. Platforms: Opteron, Centos, glibc-2.3.4-2, gcc-4.1.1 Pentium 4, Fedora, glibc-2.3.3-27.1, gcc-3.3.3-7 I have not tested against glibc's CVS, but even if it passes, I would suggest that something like the tests here belong in the test suite. The recommended practice for literal floating constants in 6.4.4.2 is: The translation-time conversion of floating constants should match the execution-time conversion of character strings by library functions, such as strtod, given matching inputs suitable for both conversions, the same result format, and default execution-time rounding.63) 63) The specification for the library functions recommends more accurate conversion than required for floating constants (see 7.20.1.3). The attached code checks this condition. It's *much* easier to check this than to check the more detailed requirement of 7.20.1.3. As a result, the code doesn't actually prove that glibc is in error. It's possible that gcc's handling of literal floats is in error. The recommended practice for strto* in 7.20.1.3 is: If the subject sequence has the decimal form and at most DECIMAL_DIG (defined in <float.h>) significant digits, the result should be correctly rounded. If the subject sequence D has the decimal form and more than DECIMAL_DIG significant digits, consider the two bounding, adjacent decimal strings L and U, both having DECIMAL_DIG significant digits, such that the values of L, D, and U satisfy L <= D <= U. The result should be one of the (equal or adjacent) values that would be obtained by correctly rounding L and U according to the current rounding direction, with the extra stipulation that the error with respect to D should have a correct sign for the current rounding direction. The decimal strings in the attached test case are all less than or equal to DECIMAL_DIG in length. A close look at the output suggests that glibc is in error, at least if one believes the output of printf("%.30g"). For example, for the first FAIL: The input string is: "1.00000005960464477550" The possible neighboring floats are: "1.00000011920928955078125000000" (printf("%.30g", 1 + FLT_EPSILON)) and "1.0" According to dc, the distance from the input to 1+FLT_EPSILON is: .00000005960464477528125000000 and the distance from the input to 1.0 is .00000005960464477550 so 1+FLT_EPSILON should be the "correct" properly rounded result. strtof returns 1.0. --------------------------------------- [jsalmon@river lexical_cast]$ cat badstrtod2.c #define _ISOC99_SOURCE_ #include <stdio.h> #include <stdlib.h> #include <limits.h> #include <math.h> #include <float.h> void checkf(const char *s, float f){ float cf = strtof(s, NULL); long double cld = strtold(s, NULL); if( cf != f ){ printf("FAIL strtof (\"%s\") %a is not the same as literal %a\n", s, cf, f); }else{ printf("OK strtof (\"%s\") %a is the same as literal\n", s, cf); } } void checkd(const char *s, double d){ double cd = strtod(s, NULL); if( cd != d ){ printf("FAIL strtod (\"%s\") %a is not the same as literal %a\n", s, cd, d); }else{ printf("OK strtod (\"%s\") %a is the same as literal\n", s, cd); } } int main(int argc, char **argv){ long double ld; /* Put some context in the output. These are not used for testing */ ld = 1.L + (long double)FLT_EPSILON; printf("1 + FLT_EPSILON: %La %#.30Lg\n", ld, ld); ld = 1.L + ((long double)FLT_EPSILON)/2.; printf("1 + FLT_EPSILON/2.: %La %#.30Lg\n", ld, ld); ld = 1.L + ((long double)FLT_EPSILON)/2. + LDBL_EPSILON; printf("1 + FLT_EPSILON/2. + LDBL_EPSILON: %La %#.30Lg\n", ld, ld); #define CHKF(s) checkf(#s, s##f); /* 1 + FLT_EPSILON/2. + LDBL_EPSILON */ CHKF(1.00000005960464477550); CHKF(1.0000000596046447755); CHKF(1.000000059604644776); CHKF(1.000000059604644775); CHKF(1.00000005960464478); CHKF(1.0000000596046448); CHKF(1.000000059604645); CHKF(1.00000005960464); CHKF(1.0000000596046); CHKF(1.000000059605); CHKF(1.00000005960); CHKF(1.0000000596); CHKF(1.000000060); CHKF(1.00000006); CHKF(1.0000001); CHKF(1.000000); ld = 1.L + (long double)DBL_EPSILON; printf("1 + DBL_EPSILON: %La %#.30Lg\n", ld, ld); ld = 1.L + ((long double)DBL_EPSILON)/2.; printf("1 + DBL_EPSILON/2.: %La %#.30Lg\n", ld, ld); ld = 1.L + ((long double)DBL_EPSILON)/2. + LDBL_EPSILON; printf("1 + DBL_EPSILON/2. + LDBL_EPSILON: %La %#.30Lg\n", ld, ld); #define CHKD(s) checkd(#s, s); CHKD(1.00000000000000011113); CHKD(1.00000000000000011103); CHKD(1.00000000000000011102); CHKD(1.00000000000000011101); CHKD(1.0000000000000001111); CHKD(1.000000000000000111); CHKD(1.00000000000000011); CHKD(1.0000000000000001); return 0; } [jsalmon@river lexical_cast]$ [jsalmon@river lexical_cast]$ gcc -g -std=c99 badstrtod2.c -lm -o badstrtod2 && badstrtod2 1 + FLT_EPSILON: 0x8.00001p-3 1.00000011920928955078125000000 1 + FLT_EPSILON/2.: 0x8.000008p-3 1.00000005960464477539062500000 1 + FLT_EPSILON/2. + LDBL_EPSILON: 0x8.000008000000001p-3 1.00000005960464477549904521725 FAIL strtof ("1.00000005960464477550") 0x1p+0 is not the same as literal 0x1.000002p+0 FAIL strtof ("1.0000000596046447755") 0x1p+0 is not the same as literal 0x1.000002p+0 FAIL strtof ("1.000000059604644776") 0x1p+0 is not the same as literal 0x1.000002p+0 OK strtof ("1.000000059604644775") 0x1p+0 is the same as literal FAIL strtof ("1.00000005960464478") 0x1p+0 is not the same as literal 0x1.000002p+0 FAIL strtof ("1.0000000596046448") 0x1p+0 is not the same as literal 0x1.000002p+0 FAIL strtof ("1.000000059604645") 0x1p+0 is not the same as literal 0x1.000002p+0 OK strtof ("1.00000005960464") 0x1p+0 is the same as literal OK strtof ("1.0000000596046") 0x1p+0 is the same as literal FAIL strtof ("1.000000059605") 0x1p+0 is not the same as literal 0x1.000002p+0 OK strtof ("1.00000005960") 0x1p+0 is the same as literal OK strtof ("1.0000000596") 0x1p+0 is the same as literal OK strtof ("1.000000060") 0x1.000002p+0 is the same as literal OK strtof ("1.00000006") 0x1.000002p+0 is the same as literal OK strtof ("1.0000001") 0x1.000002p+0 is the same as literal OK strtof ("1.000000") 0x1p+0 is the same as literal 1 + DBL_EPSILON: 0x8.0000000000008p-3 1.00000000000000022204460492503 1 + DBL_EPSILON/2.: 0x8.0000000000004p-3 1.00000000000000011102230246252 1 + DBL_EPSILON/2. + LDBL_EPSILON: 0x8.000000000000401p-3 1.00000000000000011113072267976 OK strtod ("1.00000000000000011113") 0x1.0000000000001p+0 is the same as literal OK strtod ("1.00000000000000011103") 0x1.0000000000001p+0 is the same as literal OK strtod ("1.00000000000000011102") 0x1p+0 is the same as literal OK strtod ("1.00000000000000011101") 0x1p+0 is the same as literal OK strtod ("1.0000000000000001111") 0x1.0000000000001p+0 is the same as literal OK strtod ("1.000000000000000111") 0x1p+0 is the same as literal OK strtod ("1.00000000000000011") 0x1p+0 is the same as literal OK strtod ("1.0000000000000001") 0x1p+0 is the same as literal [jsalmon@river lexical_cast]$

Subject: Re: Incorrect rounding in strtod() On Sun, 4 Feb 2007, john at thesalmons dot org wrote: > The attached code checks this condition. It's *much* easier to check > this than to check the more detailed requirement of 7.20.1.3. As a > result, the code doesn't actually prove that glibc is in error. It's > possible that gcc's handling of literal floats is in error. We know GCC's handling of floating-point literals doesn't always get the last bit rounded correctly, but showing this requires specially chosen examples (that I could only find for IEEE quad long double). So the cases here probably don't result from a GCC bug - but it might still be wise to write such tests so that GCC only needs parse a hex float value, so as to disentangle the glibc and GCC issues. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21718

There are two issues here: compile-time and run-time match, and correct rounding. The second issue is one way to achieve the first: if both environments round correctly all given digits, the results will match no matter how this is implemented. A more common approach is to reuire the use of the same library routine in both environments -- but that is difficult to guarantee as libraries evolve, possibly in the direction of supporting more correctly-rounded digits. Another method is to legislate some imperfect method, e.g. first round (in decimal) to some standard number of digits, then convert that with correct rounding to the target binary format. It looks like gcc took the first approach (using gmp as its high-precision engine), but has a simple bug: an invalid shortcut that fails for numbers that are very close to a rounding threshold, as is John's example. My program PHL converts to various precisions; K=Binary128, J=Binary64 (aka double), and I=Binary32 (aka float); the L modifier overrides the natural length of the format, and I use it show bits way beyond the rounding point. The number is indeed slightly larger than a half-way float, but it takes extended precision to see it; 10-byte Intel long double (a format that I don't support directly) looks like it would just show one non-zero tail bit. The input string is: "1.00000005960464477550" PHL K 1.00000005960464477550= * 3FFF0000 01000000 00020482 42F2FF67 PHL J 1.00000005960464477550= * 3FF00000 10000000 PHL JL16 1.00000005960464477550= * 3FF00000 10000000 00204824 2F2FF66C PHL IL16 1.00000005960464477550= * 3F800000 80000000 01024121 797FB363 PHL IL8 1.00000005960464477550= * 3F800000 80000000 PHL I 1.00000005960464477550= * 3F800001 Btw, such numbers can be generated (for any precision) by using Continued Fraction expansions of ratios of powers of two and five. Michel

I was not sufficiently clear in Comment #2. The fact that the code prints FAIL in certain cases does not demonstrate whether gcc's literal parser or glibc's strtof is faulty. It only proves that (at least) one of them is failing to adhere to "Recommended Practice" which says they should be the same. HOWEVER, careful inspection of the output, as well as knowledge of how the particular test case was chosen makes it pretty clear that *for this case* gcc is doing the right thing and strtof is at fault. gcc's rounding may not be perfect, but that's not at issue here. The input, "1.00000005960464477550" is the result of printf("%.21g", x) where x = 1 + FLT_EPSILON/2 + LDBL_EPSILON. I.e., it is specifically constructed to be *just a little bit larger* than the mid-point between two representable floats. Correct rounding should be to the larger of the two nearby floats, i.e., 1+FLT_EPSILON. gcc correctly constructs the literal 0x1.00002p0. strof incorrectly returns the value 1.0p0.

Created attachment 1550 [details] bz3479.c It seems that both GCC and glibc are buggy, even on floats.

Besides the "Recommended practice" mentioned, Annex F on IEC 60559 floating-point arithmetic, also says : "Functions such as strtod that convert character sequences to floating types must honor the rounding direction." It seems like glibc does not honor this, as the following program illustrates by crashing (on x86_64-linux at least). #include <stdlib.h> #include <assert.h> #include <fenv.h> int main() { fesetround(FE_UPWARD); double d = strtod("0.3", (char**) NULL); fesetround(FE_DOWNWARD); double e = strtod("0.3", (char**) NULL); assert(d != e); }

The fact that the rounding direction is not taken into account could be seen as a different bug. Concerning printf, see bug 5044.

I agree, and I also wondered about this before posting my comment, but I decided to leave this up to the bug database maintainers to eventually split it, as they are so close and I am not familiar with the local policy. Nevertheless, there is also a closely related point in my comment, which is that in default rounding mode (to nearest), the Annex F says that the rounding should be done to the nearest. And I understand that this applies to *all input*, so it seems to me that this supersedes the "recommended practice" paragraph which restricts the application to only some input (those with less significant digits than DECIMAL_DIG).

Regarding the example 3.518437208883201171875e+013, I think you can see what's going on better by looking at its value in "pure" binary: 1000000000000000000000000000000000000000000000.00000011 It has 54 significant bits, with bit 54 equal to 1 -- a halfway case. Since bit 53 is also 1, it should round up to 1000000000000000000000000000000000000000000000.000001 I've written an article describing this and other examples of incorrectly rounded conversions: http://www.exploringbinary.com/incorrectly-rounded-conversions-in-gcc-and-glibc/ .

*** Bug 11877 has been marked as a duplicate of this bug. ***

Is there any reason not to fix this by removing the limit on digits considered? Or by upping it to (say) the maximum binary exponent?

*** Bug 14046 has been marked as a duplicate of this bug. ***

Doing this correctly in the naive way requires some expensive extended-precision arithmetic. I think one (possibly) less expensive solution can be found in John Salmon's citation of the recommended practice. If you compute the result based on DECIMAL_DIG digits in the existing manner, but do it for BOTH the neighboring values L and U. If the results for L and U are equal, you're done. If not, you have to determine which of the two D is closer to (for default rounding mode; other rounding modes are much easier) and this probably involves converting at least one of them back to decimal, which might cost you more than just doing the full-precision original calculation...

The double-conversion routines extracted from v8 (Chrome's JS engine) are pretty fast and handle corner cases correctly: http://code.google.com/p/double-conversion It would probably require some engineering effort to translate the code from C++ to C and to adapt it to all of libc's requirements (for example the library always rounds to even), but it might serve as a decent starting-point. Another hurdle might be the requirement for a cache of precomputed 64bit values. However, some size could probably be traded against speed.

I expect that simply using the existing GMP-based code but with a correct calculation of how many digits need considering would suffice to fix the present problem. Rounding according to the current rounding mode is a separate issue that really ought to have its own bug.

Working on a patch (for the bad results for round-to-nearest, that is; not the lack of rounding mode support in strtod etc. which should have a separate bug filed).

I've filed bug 14518 for the issue (comment 6) with non-default rounding modes being ignored.

I'm on vacation, returning August 29.

Fixed for 2.17 by: commit af92131a8eb7c2661a5bb0e31dc4cb028c85e0c6 Author: Joseph Myers <joseph@codesourcery.com> Date: Mon Aug 27 16:01:27 2012 +0000 Fix strtod rounding (bug 3479).

The fix tests well (http://www.exploringbinary.com/correctly-rounded-conversions-in-gcc-and-glibc/ ), but has one very minor bug (http://sourceware.org/bugzilla/show_bug.cgi?id=16151 ).

*** Bug 260998 has been marked as a duplicate of this bug. *** Seen from the domain http://volichat.com Page where seen: http://volichat.com/adult-chat-rooms Marked for reference. Resolved as fixed @bugzilla.