This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH RFC] Imporve 64bit memcpy performance for Haswell CPU with AVX instruction
- From: Yuriy Kaminskiy <yumkam at gmail dot com>
- To: libc-alpha at sourceware dot org
- Cc: ling dot ma dot program at gmail dot com
- Date: Sat, 19 Apr 2014 17:22:22 +0400
- Subject: Re: [PATCH RFC] Imporve 64bit memcpy performance for Haswell CPU with AVX instruction
- Authentication-results: sourceware.org; auth=none
- References: <1396595862-21707-1-git-send-email-ling dot ma dot program at gmail dot com> <20140410225018 dot GD9478 at domone dot podge> <CAOGi=dPuTq0QuEe73DkkvQsLzhejtyCyDULOydELx-C0Mq0ZTw at mail dot gmail dot com>
Ling Ma wrote:
[...]
>>> +#ifdef USE_AS_MEMMOVE
>>> + cmp %rsi, %rdi
>>> + jae L(copy_backward)
>>> +#endif
>> this could be unpredictable branch, backward copy only when overlap is
>> better.
>
> If we compare whether it is overlap, have to introduce another branch
> instruction, so keep it.
No, combined overlap/backward-copy detection needs only a *single* branch
instruction (plus some extra arithmetic). See the generic C code or the i686
assembler version:
=== cut string/memmove.c ===
  /* This test makes the forward copying code be used whenever possible.
     Reduces the working set.  */
  if (dstp - srcp >= len)	/* *Unsigned* compare!  */
    {
      /* Copy from the beginning to the end.  */
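To illustrate why the single unsigned compare above is enough, here is a minimal standalone sketch. It is not the glibc implementation: the names forward_copy_is_safe and sketch_memmove are hypothetical, and the byte-wise loops only stand in for the real copy routines. It assumes both buffers are valid objects that do not wrap around the end of the address space.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a glibc function): returns 1 when copying
   len bytes from src to dst in ascending address order is safe.  */
static int forward_copy_is_safe(const void *dst, const void *src, size_t len)
{
    uintptr_t dstp = (uintptr_t) dst;
    uintptr_t srcp = (uintptr_t) src;

    /* Unsigned subtraction: when dst is below src the difference wraps
       to a very large value, which for valid buffers is >= len.  When
       dst is at or beyond src + len the difference is >= len directly.
       Only "src < dst < src + len" fails the test, and that is the one
       case that needs a backward copy.  */
    return dstp - srcp >= len;
}

/* Byte-wise memmove sketch with exactly one direction branch.  */
static void *sketch_memmove(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (forward_copy_is_safe(dst, src, len)) {
        /* Copy from the beginning to the end.  */
        for (size_t i = 0; i < len; i++)
            d[i] = s[i];
    } else {
        /* Destination overlaps the tail of the source: copy from the
           end to the beginning.  */
        while (len--)
            d[len] = s[len];
    }
    return dst;
}
```

With this test the backward path is taken only for a genuinely harmful overlap, so non-overlapping calls always go forward regardless of which pointer is lower.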
=== cut sysdeps/i386/i686/memmove.S ===
[...]