This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug math/21912] Optimize float math functions with FMA
- From: "cvs-commit at gcc dot gnu.org" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Wed, 16 Aug 2017 15:56:21 +0000
- Subject: [Bug math/21912] Optimize float math functions with FMA
- Auto-submitted: auto-generated
- References: <bug-21912-131@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=21912
--- Comment #21 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".
The branch, hjl/fma/2.26 has been updated
via 6d5f5b16bc4bd3945e138509d7986a5231ab5ee6 (commit)
via ce3e7f4136a9f5943328c74511542834ca05811b (commit)
from 7e7b5de8ffc9ac8fda45b988cde5650004bdbca7 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6d5f5b16bc4bd3945e138509d7986a5231ab5ee6
commit 6d5f5b16bc4bd3945e138509d7986a5231ab5ee6
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Wed Aug 16 08:43:35 2017 -0700
x86-64: Optimize e_expf with FMA [BZ #21912]
The FMA-optimized e_expf improves performance by more than 50% on Skylake.
[BZ #21912]
* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
Add e_expf-fma.
* sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: New file.
* sysdeps/x86_64/fpu/multiarch/e_expf.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/ifunc-fma.h: Likewise.
(cherry picked from commit 24a2e6588d2e0c91b4003878b0625d4a9360e8f3)
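
The commit above adds an FMA variant next to the existing e_expf and a header that chooses between them at run time. The contents of ifunc-fma.h are not shown in this mail, so the following is only a minimal sketch of that kind of selection in plain C. It assumes GCC or Clang on x86 for `__builtin_cpu_supports`; the names `expf_impl`, `expf_stub`, `select_expf_impl`, and `call_expf` are hypothetical, and checking only the "fma" feature bit is an assumption about what a selector might test.

```c
#include <assert.h>
#include <stddef.h>

/* One entry per implementation variant (hypothetical layout).  */
struct expf_impl {
    const char *name;
    float (*fn)(float);
};

/* Hypothetical stand-in used for both variants: a second-order Taylor
   polynomial, NOT an accurate expf.  The real variants are assembly.  */
static float expf_stub(float x)
{
    return 1.0f + x * (1.0f + 0.5f * x);
}

static const struct expf_impl impl_sse2 = { "sse2", expf_stub };
static const struct expf_impl impl_fma  = { "fma",  expf_stub };

/* Pick the FMA variant when the CPU reports FMA support, otherwise
   fall back to the baseline.  On non-x86 targets the baseline is
   always returned.  */
static const struct expf_impl *select_expf_impl(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("fma"))
        return &impl_fma;
#endif
    return &impl_sse2;
}

/* Call through whichever variant was selected.  */
static float call_expf(float x)
{
    const struct expf_impl *impl = select_expf_impl();
    assert(impl != NULL && impl->fn != NULL);
    return impl->fn(x);
}
```

An IFUNC-based build would typically resolve the choice once at symbol binding time rather than on every call as `call_expf` does here; the sketch only shows the decision itself.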
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ce3e7f4136a9f5943328c74511542834ca05811b
commit ce3e7f4136a9f5943328c74511542834ca05811b
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Aug 15 14:04:59 2017 -0700
x86-64: Align L(SP_RANGE)/L(SP_INF_0) to 8 bytes [BZ #21955]
sysdeps/x86_64/fpu/e_expf.S has
lea L(SP_RANGE)(%rip), %rdx /* load over/underflow bound */
cmpl (%rdx,%rax,4), %ecx /* |x|<under/overflow bound ? */
...
/* Here if |x| is Inf */
lea L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */
movss (%rdx,%rax,4), %xmm0 /* return zero or Inf */
ret
...
.section .rodata.cst8,"aM",@progbits,8
...
.p2align 2
L(SP_RANGE): /* single precision overflow/underflow bounds */
.long 0x42b17217 /* if x>this bound, then result overflows */
.long 0x42cff1b4 /* if x<this bound, then result underflows */
.type L(SP_RANGE), @object
ASM_SIZE_DIRECTIVE(L(SP_RANGE))
.p2align 2
L(SP_INF_0):
.long 0x7f800000 /* single precision Inf */
.long 0 /* single precision zero */
.type L(SP_INF_0), @object
ASM_SIZE_DIRECTIVE(L(SP_INF_0))
Since L(SP_RANGE) and L(SP_INF_0) are in the .rodata.cst8 section, whose
entity size is 8 bytes, they must be aligned to 8 bytes; .p2align 2 only
guarantees 4-byte alignment.
[BZ #21955]
* sysdeps/x86_64/fpu/e_expf.S (L(SP_RANGE)): Aligned to 8 bytes.
(L(SP_INF_0)): Likewise.
(cherry picked from commit f59f7adb4a00b7784cab1becdf257366104587b7)
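
The two tables in the commit above are each a pair of 32-bit words selected by a scaled index. As a rough C illustration (not the glibc code), the sketch below mirrors both lookups and uses `alignas(8)` in the role that an 8-byte `.p2align` would play in the assembly. It assumes the index register holds the sign bit of x (0 for positive, 1 for negative), which the mail implies but does not state; the names `sp_range`, `sp_inf_0`, `magnitude_in_range`, and `expf_of_inf` are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Mirrors L(SP_RANGE): [0] overflow bound, [1] underflow bound,
   both stored as the bit pattern of a positive float.  */
static alignas(8) const uint32_t sp_range[2] = { 0x42b17217u, 0x42cff1b4u };

/* Mirrors L(SP_INF_0): [0] +Inf, [1] zero.  */
static alignas(8) const uint32_t sp_inf_0[2] = { 0x7f800000u, 0u };

/* Reinterpret a float as its 32-bit pattern.  */
static uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Nonzero if |x| is below the bound selected by the sign of x.
   Positive float bit patterns order the same way as the floats.  */
static int magnitude_in_range(float x)
{
    uint32_t bits = float_bits(x);
    unsigned sign = bits >> 31;              /* assumed table index */
    uint32_t abs_bits = bits & 0x7fffffffu;  /* |x| as a bit pattern */
    return abs_bits < sp_range[sign];
}

/* Result for an infinite argument: +Inf for +Inf, zero for -Inf.
   Only the sign of x is inspected.  */
static float expf_of_inf(float x)
{
    unsigned sign = float_bits(x) >> 31;
    float result;
    memcpy(&result, &sp_inf_0[sign], sizeof result);
    return result;
}
```

With these bounds, `magnitude_in_range(100.0f)` is 0 because 100 exceeds the overflow bound of roughly 88.72, while `magnitude_in_range(-100.0f)` is 1 because 100 is still below the underflow bound of roughly 103.97.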
-----------------------------------------------------------------------
Summary of changes:
sysdeps/x86_64/fpu/e_expf.S | 4 +-
sysdeps/x86_64/fpu/multiarch/Makefile | 3 +
sysdeps/x86_64/fpu/multiarch/e_expf-fma.S | 182 +++++++++++++++++++++++++++++
sysdeps/x86_64/fpu/multiarch/e_expf.c | 26 ++++
sysdeps/x86_64/fpu/multiarch/ifunc-fma.h | 34 ++++++
5 files changed, 247 insertions(+), 2 deletions(-)
create mode 100644 sysdeps/x86_64/fpu/multiarch/e_expf-fma.S
create mode 100644 sysdeps/x86_64/fpu/multiarch/e_expf.c
create mode 100644 sysdeps/x86_64/fpu/multiarch/ifunc-fma.h