This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Bug 201689: Belt-and-suspenders detection of FMA.
- From: Carlos O'Donell <carlos at redhat dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 17 Oct 2016 11:44:03 -0400
- Subject: Re: [PATCH] Bug 201689: Belt-and-suspenders detection of FMA.
- Authentication-results: sourceware.org; auth=none
- References: <90af3efa-aaca-2dce-e433-1df7e5dbcfbd@redhat.com> <CAMe9rOr5dmNN4EChF-5_a=5UhkfWgGUnG77P2Oy2+VE=UakGpg@mail.gmail.com>
On 10/14/2016 03:12 PM, H.J. Lu wrote:
> On Fri, Oct 14, 2016 at 11:09 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>> In the Intel Architecture Instruction Set Extensions Programming reference
>> the recommended way to test for FMA in section '2.2.1 Detection of FMA'
>> is:
>>
>> "Application Software must identify that hardware supports AVX as explained
>> in ... after that it must also detect support for FMA..."
>>
>> We don't do that in glibc. We use osxsave to detect the use of xgetbv, and
>> after that we check for AVX and FMA orthogonally. It is conceivable that
>> you could have the AVX bit clear and the FMA bit in an undefined state.
>>
>> I have never seen a machine with the AVX bit clear and the FMA bit set, but
>> we should follow the intel specifications and adjust our check as the following
>> patch works.
>
> One can't write a program with FMA nor AVX2 without using AVX
> instructions. AVX2/FMA aren't usable if AVX isn't. We should do
>
> if (CPU_FEATURES_CPU_P (cpu_features, AVX))
> {
> Set AVX available
> if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
> Set AVX2 available
> if (CPU_FEATURES_CPU_P (cpu_features, FMA))
> Set FMA available
> }
>
Agreed. I double checked the manual, here is a new patch.
Testing on x86_64 and i686.
OK to commit if the results are clean?
v2
- Move FMA and AVX2 check up into AVX check since they depend upon it.
2016-10-14 Carlos O'Donell <carlos@redhat.com>
[BZ #20689]
* sysdeps/x86/cpu-features.c: Only enable FMA and AVX2 if
AVX is usable.
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 11b9af2..e228a76 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -60,12 +60,20 @@ get_common_indeces (struct cpu_features *cpu_features,
{
/* Determine if AVX is usable. */
if (CPU_FEATURES_CPU_P (cpu_features, AVX))
- cpu_features->feature[index_arch_AVX_Usable]
- |= bit_arch_AVX_Usable;
- /* Determine if AVX2 is usable. */
- if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
- cpu_features->feature[index_arch_AVX2_Usable]
- |= bit_arch_AVX2_Usable;
+ {
+ cpu_features->feature[index_arch_AVX_Usable]
+ |= bit_arch_AVX_Usable;
+ /* The following features depend on AVX being usable. */
+ /* Determine if AVX2 is usable. */
+ if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
+ cpu_features->feature[index_arch_AVX2_Usable]
+ |= bit_arch_AVX2_Usable;
+ /* Determine if FMA is usable. */
+ if (CPU_FEATURES_CPU_P (cpu_features, FMA))
+ cpu_features->feature[index_arch_FMA_Usable]
+ |= bit_arch_FMA_Usable;
+ }
+
/* Check if OPMASK state, upper 256-bit of ZMM0-ZMM15 and
ZMM16-ZMM31 state are enabled. */
if ((xcrlow & (bit_Opmask_state | bit_ZMM0_15_state
@@ -83,10 +91,6 @@ get_common_indeces (struct cpu_features *cpu_features,
|= bit_arch_AVX512DQ_Usable;
}
}
- /* Determine if FMA is usable. */
- if (CPU_FEATURES_CPU_P (cpu_features, FMA))
- cpu_features->feature[index_arch_FMA_Usable]
- |= bit_arch_FMA_Usable;
}
}
}
---
--
Cheers,
Carlos.