This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Optimized generic expf and exp2f
On Wed, 6 Sep 2017, Arjan van de Ven wrote:
> interesting; it takes 2 independent FP adds and a compare (in C) to
> detect nearest rounding being in effect (which in time can overlap with
> the float->double conversion) so if there's an option to reduce the
> algorithm by more than that for a fast path...
My understanding from the patch submission was that round-to-nearest was
only relevant for the conversion to nearest integer. (The rest of the
calculation is done in double, so intermediate rounding errors are of no
significance there.)
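
For illustration, the "2 independent FP adds and a compare" check from the quoted text could look roughly like the sketch below. The name rounds_to_nearest and the constants are assumptions made for this example, not anything in the patch. The volatile qualifiers only keep the compiler from folding the additions at compile time.

```c
/* Hypothetical helper, not from the patch: returns nonzero when the
   dynamic rounding mode is round-to-nearest.  tiny is below a quarter
   ulp of 1.0f, so both sums collapse to 1.0f only when rounding to
   nearest; in every directed mode exactly one of them moves away. */
static inline int rounds_to_nearest(void)
{
    volatile float one = 1.0f;
    volatile float tiny = 0x1p-30f;
    volatile float up = one + tiny;    /* > 1 only when rounding upward */
    volatile float down = one - tiny;  /* < 1 when rounding downward or toward zero */
    return up == down;
}
```

The two additions do not depend on each other, so they can issue in parallel with the float->double conversion of the argument.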
The TOINT_RINT and TOINT_SHIFT cases depend on round-to-nearest to get an
actual nearest integer result (otherwise we get the larger errors
discussed from applying a polynomial to a larger range than it was
optimized for). TOINT_INTRINSICS does whatever those intrinsics do, which
might be to round to nearest rather than in the current rounding mode. Use of
round or roundeven instead of rint would also be possible. (GCC doesn't
have a built-in roundeven at present, but it would make sense to add one.
SSE4.1 supports encoding all four binary IEEE rounding modes in the
instruction; glibc only has floor/ceil IFUNCs using that facility, not
trunc (bug 20142) or roundeven, and likewise for math_private.h __floor
etc. inlines.) Adding a constant so that the value is always positive and
then casting to int (defined to truncate towards 0) is also a possibility.
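
As a rough sketch of two of the options above (the function names, TOINT_BIAS, and the input ranges are assumptions for this example, not the patch's code): the first helper shows the shift idea, where the rounding happens in the addition and so follows the current rounding mode; the second adds a positive constant and truncates, so the cast itself does not depend on the rounding mode.

```c
#include <stdint.h>

/* Hypothetical bias for the truncation variant; assumes |x| < 2^20. */
#define TOINT_BIAS 1048576

/* Shift trick: adding 0x1.8p52 pushes the fraction bits out of the
   double significand, so the addition rounds x to an integer in the
   current rounding mode.  Assumes |x| < 2^51.  volatile keeps the
   compiler from simplifying (x + C) - C back to x. */
static inline double toint_shift(double x)
{
    volatile double shifted = x + 0x1.8p52;
    return shifted - 0x1.8p52;
}

/* Bias-and-truncate: make the value positive, add 0.5, and let the
   cast truncate towards 0.  Ties round upward rather than to even,
   and the addition itself may still round in the last bit. */
static inline int32_t toint_trunc(double x)
{
    return (int32_t)(x + (TOINT_BIAS + 0.5)) - TOINT_BIAS;
}
```

In round-to-nearest, toint_shift(2.5) gives 2.0 (ties to even) while toint_trunc(2.5) gives 3; away from ties the two agree.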
> (also, some CPUs (like newer Intel) support an instruction prefix
> encoding to force rounding modes on a FP instruction independent of the
> global rounding mode, which at some point maybe should be a gcc pragma
> or attribute or something, and then used in such C code)
That's #pragma STDC FENV_ROUND <direction> from TS 18661-1. That makes
sense to implement (it has both compiler and library aspects), but
probably can't work reliably without first sorting out (conditional on
appropriate compiler options) the general issues with optimizations not
respecting exceptions / rounding modes. And using it would be slow on
architectures without the hardware support for constant rounding modes in
instructions (at least in the case where changing rounding modes is slow),
as it needs to insert dynamic rounding mode saves and restores in that
case.
--
Joseph S. Myers
joseph@codesourcery.com