[PATCH RFC] Improve 64bit memset performance for Haswell CPU with AVX2 instruction

H.J. Lu hjl.tools@gmail.com
Thu Jun 19 22:17:00 GMT 2014


On Thu, Jun 19, 2014 at 12:12 PM, Ondřej Bílka <neleai@seznam.cz> wrote:
> On Wed, Jun 18, 2014 at 09:47:11AM -0700, H.J. Lu wrote:
>> On Tue, Jun 10, 2014 at 6:52 AM, Ling Ma <ling.ma.program@gmail.com> wrote:
>> > In this patch, attached gzipped, we take advantage of HSW memory
>> > bandwidth, reduce branch mispredictions by avoiding branch
>> > instructions, and force the destination to be aligned for AVX and
>> > AVX2 instructions.
>> >
>> > The CPU2006 403.gcc benchmark indicates this patch improves performance
>> > by 26% to 59%.
>> >
>> > This version incorporates Ondra's comments and avoids letting branch
>> > instructions cross 16-byte-aligned code boundaries.
>>
>> Any feedback?  I'd like to check it in before 2.20 code freeze.
>>
> As I said before, it's OK with the formatting fixed; you can commit it
> if you wish.

This is the patch I checked in with sysdeps/x86_64/multiarch/rtld-memset.S
added.
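
For readers new to the approach summarized in the quoted description, here
is a rough portable C sketch of the general idea. It is not the checked-in
assembly: it uses 8-byte words instead of 32-byte AVX2 vectors, and the
names sketch_memset and store8 are made up for illustration. It shows two
things: overlapping head and tail stores, so that the exact length does not
need a chain of size-specific branches, and a main loop whose stores go to
aligned destination addresses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write one 8-byte pattern at p, any alignment. */
static void store8(unsigned char *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Illustrative only; the real patch is hand-written AVX2 assembly. */
void *sketch_memset(void *dst, int c, size_t n)
{
    unsigned char *d = dst;
    /* Replicate the fill byte into all 8 bytes of a word. */
    uint64_t v = 0x0101010101010101ULL * (unsigned char)c;

    if (n < 8) {
        /* Too short for a full word: plain byte loop. */
        for (size_t i = 0; i < n; i++)
            d[i] = (unsigned char)c;
        return dst;
    }

    /* Head and tail stores may overlap each other and the loop below.
       Together they cover the unaligned edges for every n >= 8. */
    store8(d, v);
    store8(d + n - 8, v);

    /* First 8-byte-aligned address strictly after d (at most d + 8). */
    unsigned char *p = (unsigned char *)(((uintptr_t)d + 8) & ~(uintptr_t)7);
    unsigned char *end = d + n - 8;

    /* Aligned stores; anything at or past end is already written. */
    for (; p < end; p += 8)
        store8(p, v);

    return dst;
}
```

The overlap means some bytes are written twice, which costs little compared
with a mispredicted branch on the length. With AVX2 the same pattern would
use 32-byte stores and 32-byte alignment, but that part is an assumption
about the technique in general, not a description of the attached patch.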

Thanks.


-- 
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-x86_64-memset-optimized-for-AVX2.patch
Type: text/x-patch
Size: 10756 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/libc-alpha/attachments/20140619/edfae7ee/attachment.bin>

