This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] x86_64: memset optimized with AVX512
- From: Andrew Senkevich <andrew dot n dot senkevich at gmail dot com>
- To: libc-alpha <libc-alpha at sourceware dot org>
- Date: Fri, 18 Dec 2015 17:06:16 +0300
- Subject: Re: [PATCH] x86_64: memset optimized with AVX512
2015-12-16 20:19 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
> On Fri, Dec 11, 2015 at 10:13 AM, Andrew Senkevich
> <andrew.n.senkevich@gmail.com> wrote:
>> 2015-12-11 17:43 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
>>>> Please make the following changes:
>>>>
>>>> 1. Change _avx512 to _avx512_no_vzeroupper.
>>>> 2. Add a feature, Prefer_No_VZEROUPPER, to cpu-features.h, and set
>>>> it for KNL.
>>>> 3. Check Prefer_No_VZEROUPPER instead of AVX512DQ_Usable.
>>>> 4. Don't check AVX512DQ_Usable or Prefer_No_VZEROUPPER in
>>>> ifunc-impl-list.c.
>>>>
>>>
>>> I submitted a patch to enable SLM optimization for KNL:
>>>
>>> https://sourceware.org/ml/libc-alpha/2015-12/msg00221.html
>>>
>>> It is on hjl/32bit/master branch. Please rebase your patch against
>>> mine since it adds KNL optimization.
>>
>> Here is the rebased and updated version:
>>
>> From f488d8572bc43c731b0ce054ce1f84db7d90eb61 Mon Sep 17 00:00:00 2001
>> From: Andrew Senkevich <andrew.senkevich@intel.com>
>> Date: Fri, 11 Dec 2015 20:58:57 +0300
>> Subject: [PATCH] Added memset optimized with AVX512 for KNL hardware.
>>
>> It shows an improvement of up to 28% over the AVX2 memset (performance results
>> attached at <https://sourceware.org/ml/libc-alpha/2015-12/msg00052.html>).
>>
>> * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: New file.
>> * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Added new file.
>> * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Added new tests.
>> * sysdeps/x86_64/multiarch/memset.S: Added new IFUNC branch.
>> * sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
>> * sysdeps/x86/cpu-features.h (bit_Prefer_No_VZEROUPPER,
>> index_Prefer_No_VZEROUPPER): New.
>> * sysdeps/x86/cpu-features.c (init_cpu_features): Set
>> Prefer_No_VZEROUPPER for Knights Landing.
>
> Looks good to me.
>
> Thanks.
Is it OK for trunk?
--
WBR,
Andrew
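
For readers following the thread, the selection logic requested in review
items 2 and 3 (set a Prefer_No_VZEROUPPER preference for KNL, then check
that preference when choosing the memset variant) might look roughly like
the C sketch below. This is not the patch itself: the real selector is
written in assembly in sysdeps/x86_64/multiarch/memset.S, and every name
here (the SKETCH_* bits, cpu_features_sketch, select_memset_sketch) is
hypothetical. The AVX512F usability check and the AVX2/SSE2 fallback order
are assumptions made for illustration; the thread only states that
Prefer_No_VZEROUPPER replaces the AVX512DQ_Usable check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits, for illustration only.  glibc's real
   definitions live in sysdeps/x86/cpu-features.h and are laid out
   differently.  */
enum
{
  SKETCH_AVX2_USABLE = 1u << 0,
  SKETCH_AVX512F_USABLE = 1u << 1,       /* Assumed prerequisite.  */
  SKETCH_PREFER_NO_VZEROUPPER = 1u << 2  /* Set for KNL per the review.  */
};

/* Hypothetical stand-in for the CPU feature state.  */
struct cpu_features_sketch
{
  unsigned int bits;
};

typedef void *(*memset_fn) (void *, int, size_t);

/* One entry per candidate implementation.  */
struct memset_impl_sketch
{
  const char *name;
  memset_fn fn;
};

/* Stand-ins: each entry forwards to the C library memset, because the
   real variants are assembly routines that cannot be reproduced here.  */
static const struct memset_impl_sketch impl_sse2
  = { "sse2", memset };
static const struct memset_impl_sketch impl_avx2
  = { "avx2", memset };
static const struct memset_impl_sketch impl_avx512_no_vzeroupper
  = { "avx512_no_vzeroupper", memset };

/* Pick an implementation from the feature bits.  The AVX-512 variant is
   chosen only when the CPU both supports it and prefers code without
   VZEROUPPER; otherwise fall back to AVX2, then SSE2.  */
static const struct memset_impl_sketch *
select_memset_sketch (const struct cpu_features_sketch *cpu)
{
  assert (cpu != NULL);

  if ((cpu->bits & SKETCH_AVX512F_USABLE) != 0
      && (cpu->bits & SKETCH_PREFER_NO_VZEROUPPER) != 0)
    return &impl_avx512_no_vzeroupper;

  if ((cpu->bits & SKETCH_AVX2_USABLE) != 0)
    return &impl_avx2;

  return &impl_sse2;
}
```

Keying the choice on a preference bit rather than on a specific ISA
extension keeps the selector independent of which CPU models happen to
implement AVX512DQ, which appears to be the motivation for review item 3.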