This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Fix hypotf overflow/underflow (bug 13840)


From: "Joseph S. Myers" <joseph@codesourcery.com>
Date: Tue, 13 Mar 2012 20:46:17 +0000 (UTC)

> This patch fixes overflow and underflow problems in
> sysdeps/ieee754/flt-32/e_hypotf.c.  This scales by 2**60 in an attempt
> to avoid the internal x**x+y**y overflowing.  But as FLT_MAX is just
> under 2**128, scaling down by 2**60 isn't enough for large inputs.
> And while there is a separate case for subnormal inputs, for small
> normal inputs the scaling up by 2**60 still leaves x**x+y**y subnormal
> (so with loss of precision).
> 
> Tested x86_64 and x86 (no ulps updates required).  The TEST_INLINE
> conditionals are based on failures observed with the x86 inline
> implementation (it's possible that the test without such a conditional
> will need one as well depending on the exact compiler version /
> configuration in use, I just didn't find it to be needed in my
> testing).

This is some seriously inefficient code.  All of this scaling business
could be eliminated by simply casting the arguments to double and then
doing the straightforward:

	__ieee754_sqrt(double_x * double_x + double_y * double_y);

calculation and casting the result to float.  This code we currently
have is simply a way too literal conversion from the double version.

By using a cast to double all the scaling code can be removed, and all
that's left are the checks against 0x7f800000 and zero.

So, assuming 'ha' and 'hb' are calculated, the cases we have are:

1) ha == 0x7f800000

	if (x == y)
		return fabs(y);
	else
		return fabs(x);

2) hb == 0x7f800000

	if (x == y)
		return fabs(x);
	else
		return fabs(y);

3) if ha > 0x7f800000 or hb > 0x7fb80000

	return fabs(x) * fabs(y)

4) if x == 0, return y

5) if y == 0, return x

6) if none of the above apply:

	double d_x = (double) x;
	double d_y = (double) y;
	return (float) __ieee754_sqrt(d_x * d_x + d_y * d_y);


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]