This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Compile AVX libm functions with -mavx


On Tue, Oct 2, 2012 at 4:45 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Oct 2, 2012 at 4:07 PM, Matt Turner <mattst88@gmail.com> wrote:
>> On Tue, Oct 2, 2012 at 1:19 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Tue, Oct 2, 2012 at 12:47 PM, Ondřej Bílka <neleai@seznam.cz> wrote:
>>>> On Tue, Oct 02, 2012 at 03:31:50PM -0400, Mike Frysinger wrote:
>>>>> On Tuesday 02 October 2012 15:20:54 H.J. Lu wrote:
>>>>> > On Tue, Oct 2, 2012 at 12:02 PM, Mike Frysinger <vapier@gentoo.org> wrote:
>>>>> > > On Tuesday 02 October 2012 09:53:25 H.J. Lu wrote:
>>>>> > >> This patch compiles AVX libm functions with -mavx.  It reduces text size
>>>>> > >> of libm.so by about 1%:
>>>>> > >
>>>>> > > looks like you're reverting 56f6f6a2403cfa7267cad722597113be35ecf70d.
>>>>> > > shouldn't you revert all of it and not just change the CFLAGS back ?
>>>>> >
>>>>> > Doesn't this patch:
>>>>> >
>>>>> > http://sourceware.org/ml/libc-alpha/2012-10/msg00055.html
>>>>> >
>>>>> > do that?
>>>>>
>>>>> yes, i missed the follow up
>>>>>
>>>>> > > it'd be useful to know *why* Ulrich moved away from -mavx, but
>>>>> > > unfortunately his commit message is useless.
>>>>> >
>>>>> > I can only guess:
>>>>>
>>>>> might be useful to put some notes (like referring to the older commit) into
>>>>> the commit message when you do commit things
>>>>> -mike
>>>>
>>>> could it be a 60 cycle penalty when switching between legacy sse and avx
>>>> state?
>>>
>>> This is true. We can use -mprefer-avx128 to make sure that only 128bit AVX
>>> instructions are used.
>>>
>>> --
>>> H.J.
>>
>> The latency for switching between old SSE and new (AVX-style
>
> Latency comes from switching between the 128-bit SSE context and
> the 256-bit AVX context.  If we only use the lower 128-bit AVX context,
> there is no latency.

I'm having a hard time confirming that.

From pages 53/54 of the pdf -- http://software.intel.com/file/36945 :

> However, there is a performance impact with intermixing VEX-encoded SIMD
> instructions (AVX, FMA) and legacy SSE instructions that only operate on
> the XMM register state.

And more to the point:

> Intermixed 256-bit, 128-bit or scalar SIMD instructions that are encoded
> with VEX prefixes have no transition delay due to internal state management.

>> 3-operand) form is what causes the penalty. What is the purpose of
>> -mprefer-avx128? I can't find a description of it online.
>
> I just fixed it:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54785
>
> -mprefer-avx128 will avoid 256-bit AVX instructions.  Only 128-bit
> AVX instructions are generated.  It has the same effect on context
> switch as -msse2avx.

I think that your claim is that legacy 128-bit SSE + 256-bit AVX
produces stalls, but I believe the documentation to say that it's
VEX-prefixed instructions in general (256-bit or otherwise) plus
legacy SSE instructions that lead to stalls.
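
To make the concern concrete, here is a minimal sketch (not glibc code) of
the usual way to guard a transition from 256-bit VEX-encoded code back to
code that may contain legacy SSE instructions. The function names sum_avx,
sum_scalar and sum_floats are hypothetical, chosen only for illustration.
It assumes GCC or Clang on x86, and relies on the VZEROUPPER instruction
(the _mm256_zeroupper intrinsic), which clears the upper halves of the YMM
registers:

```c
#include <immintrin.h>
#include <stddef.h>

/* Hypothetical helper: sums n floats with 256-bit VEX-encoded AVX
   instructions.  The target attribute enables AVX for this function
   only, so the rest of the file needs no -mavx. */
__attribute__((target("avx")))
static float sum_avx(const float *x, size_t n)
{
    __m256 acc = _mm256_setzero_ps();
    size_t i = 0;

    /* Main loop: eight floats per iteration in one YMM register. */
    for (; i + 8 <= n; i += 8)
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(x + i));

    float lanes[8];
    _mm256_storeu_ps(lanes, acc);

    /* Clear the upper 128 bits of all YMM registers before control
       returns to code that may use legacy (non-VEX) SSE encodings.
       Compilers may also insert this automatically. */
    _mm256_zeroupper();

    float s = 0.0f;
    for (int k = 0; k < 8; k++)
        s += lanes[k];
    for (; i < n; i++)      /* leftover elements */
        s += x[i];
    return s;
}

/* Hypothetical fallback for CPUs without AVX. */
static float sum_scalar(const float *x, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Hypothetical entry point: picks the AVX path only if the CPU has it. */
float sum_floats(const float *x, size_t n)
{
    if (__builtin_cpu_supports("avx"))
        return sum_avx(x, n);
    return sum_scalar(x, n);
}
```

Whether a stall occurs without the VZEROUPPER depends on which reading of
the Intel document is right, which is the open question above.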

