On Wed, 20 Dec 2017, Patrick McGehearty wrote:
> The tgamma failures disappeared, giving strong support to Joseph's hypothesis.
> For x86, there is roughly a 5% performance cost which would be tolerable.
> Unfortunately, for Sparc, there is a 76% performance cost which is
> less tolerable. When Sparc changes the rounding mode, the instruction pipeline
But that is presumably still better than the existing code (as you mentioned
a 5x improvement), so it is a reasonable incremental step.
> 3) Define a macro either within e_exp.c or in an include file that selects
> get_rounding_mode and libc_fesetround for all platforms except x86.
> It selects SET_RESTORE_ROUND for x86.
> Putting platform specific macros inside ieee754 branch seems
> undesirable, but I thought I should mention it as a possibility.
The correct thing to do is as I said: add libc_fegetround,
libc_fegetroundf and libc_fegetroundl to the large set of math_private.h /
fenv_private.h libc_fe* macros. All of these would default to using
get_rounding_mode, but sysdeps/i386/fpu/fenv_private.h would, in the
__SSE_MATH__ case, use the SSE rounding mode for libc_fegetroundf, and in
the __SSE2_MATH__ case use it also for libc_fegetround. Then you could
use libc_fegetround where you previously used get_rounding_mode.
However, using SET_RESTORE_ROUND as an incremental step still makes sense
before adding libc_fegetround* as an improvement on top of that.