This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: V2 [PATCH] x86: Add <sys/platform/x86.h>


On 08/29/2018 05:57 PM, H.J. Lu wrote:

CPUID and XGETBV do not need relocations, so they will work where this
approach will not.

Do you have a testcase to show that this claim is true?

I expect the effect to be similar to this bug:

  <https://bugzilla.redhat.com/show_bug.cgi?id=1377895>

And this one, which has more analysis:

  <https://sourceware.org/bugzilla/show_bug.cgi?id=20019>

This shows that the IFUNC resolvers inside libc itself (which use the proposed mechanism) are unsafe.

You also need to check for XSAVE support, I think.  This is easy to miss,
and the proposed interface does not prevent this.

That is why I added:

@defmac HAS_ARCH_FEATURE(name)

There should only be a single feature to query, to avoid confusion.

I think it's not at the right level of abstraction.  Features should only be
marked as available if you can actually execute the CPU instructions, and
they will not fault due to missing CPU, kernel, or hypervisor support.

Do you have a testcase to show that this claim is true?

glibc on Red Hat Enterprise Linux 6 before glibc-2.12-1.207.el6 did not have correct FMA3 detection. It did not consider OSXSAVE support. As a result, the fma function would crash with SIGILL when running on an FMA3-capable CPU on a Red Hat Enterprise Linux 5 kernel.

The key notes from the private bug (#1384281) are:

“
The crash happens at an FMA3 instruction.

Program received signal SIGILL, Illegal instruction.
__fma_fma (x=1, y=2, z=3) at ../sysdeps/x86_64/multiarch/s_fma.c:33
33        asm ("vfmadd213sd %3, %2, %0" : "=x" (x) : "0" (x), "x" (y), "xm" (z));

#0  __fma_fma (x=1, y=2, z=3) at ../sysdeps/x86_64/multiarch/s_fma.c:33
#1  0x000000000041993c in fma_test ()
    at /builddir/build/BUILD/glibc-2.12-2-gc4ccff1/build-x86_64-linuxnptl/math/libm-test.c:3125
#2  0x000000000043ba4a in main (argc=1, argv=<value optimized out>)
    at /builddir/build/BUILD/glibc-2.12-2-gc4ccff1/build-x86_64-linuxnptl/math/libm-test.c:7198

The IFUNC selector is gated on:

# define HAS_FMA        HAS_CPU_FEATURE (COMMON_CPUID_INDEX_1, ecx, 12)

This is a glibc bug because it does not probe operating system support for AVX (using XGETBV), only FMA CPU support (using CPUID). The Intel guidelines are extremely clear that you need to check both.

It probably was fixed upstream in this commit:

commit afc5ed09cbce5d6fd48b3a8c5ec427b31f996880
Author: Ulrich Drepper <drepper@gmail.com>
Date:   Thu Jan 26 07:45:14 2012 -0500

    Reset bit_AVX in __cpu_features is OS support is missing

More fixes may be needed (possibly 08cf777f9e7f6d826658a99c7d77a359f73a45bf). I have not yet checked if commit afc5ed09cbce5d6fd48b3a8c5ec427b31f996880 is sufficient to make the HAS_FMA check fail (so that the SSE2 implementation is selected, as required by the RHEL 5 kernel due to its lack of AVX support).

Current upstream code is very different in this area.
”
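
For illustration, the combined check that the notes above call for (CPUID for the CPU feature bits, then XGETBV for operating system support) could look roughly like the sketch below. It is not the glibc implementation. The names fma3_usable and fma3_usable_from are hypothetical, and the code assumes GCC or Clang, whose <cpuid.h> provides __get_cpuid. On non-x86 targets the sketch simply reports the feature as unavailable.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
# include <cpuid.h>   /* GCC/Clang helper header providing __get_cpuid.  */
#endif

/* CPUID leaf 1, ECX feature bits.  */
#define CPUID_1_ECX_FMA      (1u << 12)
#define CPUID_1_ECX_OSXSAVE  (1u << 27)
#define CPUID_1_ECX_AVX      (1u << 28)

/* XCR0 bits: the OS saves and restores XMM (bit 1) and YMM (bit 2) state.  */
#define XCR0_SSE_STATE  (1u << 1)
#define XCR0_AVX_STATE  (1u << 2)

/* Hypothetical helper: pure decision logic on raw register values, so it
   can be exercised without the matching hardware.  */
int
fma3_usable_from (uint32_t cpuid_1_ecx, uint64_t xcr0)
{
  const uint32_t need_cpu
    = CPUID_1_ECX_FMA | CPUID_1_ECX_OSXSAVE | CPUID_1_ECX_AVX;
  const uint64_t need_os = XCR0_SSE_STATE | XCR0_AVX_STATE;

  /* The CPU must implement FMA and AVX, and the OS must have enabled
     XSAVE (OSXSAVE), otherwise XCR0 is meaningless.  */
  if ((cpuid_1_ecx & need_cpu) != need_cpu)
    return 0;
  /* The OS must manage both XMM and YMM state.  */
  return (xcr0 & need_os) == need_os;
}

/* Hypothetical wrapper: query the running CPU and kernel.  */
int
fma3_usable (void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;

  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;
  /* Without OSXSAVE, executing XGETBV would itself fault.  */
  if ((ecx & CPUID_1_ECX_OSXSAVE) == 0)
    return 0;

  unsigned int lo, hi;
  __asm__ volatile ("xgetbv" : "=a" (lo), "=d" (hi) : "c" (0));
  return fma3_usable_from (ecx, ((uint64_t) hi << 32) | lo);
#else
  return 0;   /* Not an x86 target.  */
#endif
}
```

With a check along these lines, the case from the bug (FMA bit set in CPUID, but a kernel that never enables OSXSAVE) yields 0, so a selector would fall back to the SSE2 implementation instead of reaching the vfmadd213sd instruction.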

Carlos and I spent quite some time chasing this down, and I think that with a proper interface design, we can spare other programmers that effort.

Thanks,
Florian

