This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH RFC] Imporve 64bit memcpy performance for Haswell CPU with AVX instruction

From: Ling Ma <ling dot ma dot program at gmail dot com>
To: Ondřej Bílka <neleai at seznam dot cz>
Cc: "H.J. Lu" <hjl dot tools at gmail dot com>, GNU C Library <libc-alpha at sourceware dot org>, Liubov Dmitrieva <liubov dot dmitrieva at gmail dot com>, yumkam at gmail dot com, Ling Ma <ling dot ml at alibaba-inc dot com>
Date: Mon, 14 Jul 2014 12:28:28 +0800
Subject: Re: [PATCH RFC] Imporve 64bit memcpy performance for Haswell CPU with AVX instruction
Authentication-results: sourceware.org; auth=none
References: <CAOGi=dOJX3saKoa5YiDdveOqAb_=Sev4cBKyh7_gkXBU8_4=+g at mail dot gmail dot com> <CAMe9rOpEhNffr5iZUZLFp4QyBAE-Xrxna8-BQFv=tZXEXdSLSg at mail dot gmail dot com> <CAOGi=dNk7H2+aWh=+3_qwVH9LvWN-eNKcLciW=0J7x1dVL9v+g at mail dot gmail dot com> <CAOGi=dMsSdQi8SuXi2pzCbMm6bCrwJru0rAjtg=cn24CLgOgRg at mail dot gmail dot com> <CAMe9rOqZpj4BE7kXABOAueaD-o1PgRjL_R48KeDcJBDSmHXPdg at mail dot gmail dot com> <20140625163416 dot GA14763 at domone dot podge> <CAOGi=dMn+zr3u_1YJvmxOO0NF9BTGKeCJNV0nkDTBd7x2dx4eg at mail dot gmail dot com> <CAOGi=dPjVosbXjX9k2kB_o_dsDpHk6DAZXMjyqEVXS3g-dpejA at mail dot gmail dot com> <20140710133648 dot GA18783 at domone dot podge> <CAOGi=dM6iALQNc40=2rpojj49is9e5ms44phtVcrpcwBPAhNbQ at mail dot gmail dot com> <20140711095404 dot GA4897 at domone dot podge>

This patch takes advantage of Haswell (HSW) memory bandwidth, reduces
branch mispredictions by avoiding branch instructions, and forces the
destination to be aligned for the AVX instructions.
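
For readers without the attachment at hand, here is a rough sketch of the
two ideas in portable C. It is not the code from the patch (which is x86-64
assembly); the names vec32, load32, store32, copy_32_to_64, copy_large and
sketch_memcpy are made up for this illustration, and a fixed-size memcpy
stands in for a 32-byte AVX load/store. It assumes the buffers do not
overlap. Sizes from 32 to 64 bytes are handled with two possibly
overlapping block copies instead of a chain of size branches, and larger
copies store the head unaligned and then run the main loop on a 32-byte
aligned destination.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 32-byte vector register. */
typedef struct { unsigned char b[32]; } vec32;

static vec32 load32(const unsigned char *p)
{
    vec32 v;
    memcpy(v.b, p, 32);          /* models an unaligned 32-byte load */
    return v;
}

static void store32(unsigned char *p, vec32 v)
{
    memcpy(p, v.b, 32);          /* models a 32-byte store */
}

/* Copy n bytes, 32 <= n <= 64, without branching on the exact size:
   the first and last 32 bytes are copied and may overlap in the middle. */
static void copy_32_to_64(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(n >= 32 && n <= 64);
    vec32 head = load32(src);
    vec32 tail = load32(src + n - 32);
    store32(dst, head);
    store32(dst + n - 32, tail);
}

/* Copy n > 64 bytes. The head and tail blocks are loaded up front; the
   loop then stores only to 32-byte aligned destination addresses. */
static void copy_large(unsigned char *dst, const unsigned char *src, size_t n)
{
    vec32 head = load32(src);
    vec32 tail = load32(src + n - 32);
    size_t skip = 32 - ((uintptr_t)dst & 31);   /* 1..32 bytes to next boundary */
    unsigned char *d = dst + skip;
    const unsigned char *s = src + skip;
    unsigned char *end = dst + n - 32;          /* start of the tail block */

    while (d < end) {
        store32(d, load32(s));                  /* d is 32-byte aligned here */
        d += 32;
        s += 32;
    }
    store32(dst, head);                         /* covers the skipped prefix */
    store32(end, tail);                         /* covers whatever the loop left */
}

void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (n < 32) {
        /* Small sizes are out of scope for this sketch: plain byte loop. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else if (n <= 64) {
        copy_32_to_64(d, s, n);
    } else {
        copy_large(d, s, n);
    }
    return dst;
}
```

The overlapping head/tail stores trade a few redundant bytes for fewer
size-dependent branches, which is the kind of trade-off described above.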

The CPU2006 403.gcc benchmark indicates that this patch improves overall
performance by 2% to 10%, and by 12% to 60% when the copy size is over
256 bytes.

This version is based on the latest ChangeLog and simplifies memmove
according to Ondra's comment.

Thanks
Ling

Attachment: memcpy-avx-unaligned.patch.tar.gz
Description: GNU Zip compressed data

Follow-Ups:
- Re: [PATCH RFC] Imporve 64bit memcpy performance for Haswell CPU with AVX instruction
  - From: Ling Ma

References:
- Re: [PATCH RFC] Imporve 64bit memcpy performance for Haswell CPU with AVX instruction
  - From: Ling Ma
- Re: [PATCH RFC] Imporve 64bit memcpy performance for Haswell CPU with AVX instruction
  - From: Ondřej Bílka
- Re: [PATCH RFC] Imporve 64bit memcpy performance for Haswell CPU with AVX instruction
  - From: Ling Ma
- Re: [PATCH RFC] Imporve 64bit memcpy performance for Haswell CPU with AVX instruction
  - From: Ondřej Bílka