This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] x86-64: Add memcmp/wmemcmp optimized with AVX2


On 06/01/2017 11:29 PM, H.J. Lu wrote:
> L(between_4_7):
>         movl    (%rdi), %r8d
>         movl    (%rsi), %ecx
>         shlq    $32, %r8
>         shlq    $32, %rcx
>         movl    -4(%rdi, %rdx), %edi
>         movl    -4(%rsi, %rdx), %esi
>         orq     %rdi, %r8
>         orq     %rsi, %rcx
>         bswap   %r8
>         bswap   %rcx
>         cmpq    %rcx, %r8
>         je      L(zero)
>         sbbl    %eax, %eax
>         orl     $1, %eax
>         ret
> 
> and got
> 
> Iteration 70485 - wrong result in function __memcmp_avx2 (18, 26, 5,
> 0) -1 != 1, p1 0x7ffff7ff0e00 p2 0x7ffff7fece00
> 
> What did I do wrong?

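I think you created some PDP-endian thing there.  The 4 bytes at (%rdi)
need to remain in the lower part of %r8, up until the bswap, so that the
bswap moves them into the most significant half and they are compared
first.  In other words, you need to shift the 4 bytes loaded from
-4(%rdi, %rdx) instead (and likewise for the %rsi side).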
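
To illustrate the ordering problem, here is a rough C model of the
4-to-7-byte path. It is only a sketch, not the glibc code: the names
swap64, key_4_7 and cmp_4_7 and the shift_tail flag are made up for this
example, and it assumes a little-endian host such as x86-64.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable 64-bit byte swap, standing in for the bswap instruction. */
static uint64_t swap64(uint64_t x)
{
    x = ((x & 0x00ff00ff00ff00ffULL) << 8) | ((x >> 8) & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/* Build the 64-bit comparison key for a buffer of 4 <= n <= 7 bytes.
   shift_tail != 0: the last 4 bytes are shifted up (the suggested fix).
   shift_tail == 0: the first 4 bytes are shifted up (as in the quoted code).
   Assumes a little-endian host.  */
static uint64_t key_4_7(const unsigned char *p, size_t n, int shift_tail)
{
    uint32_t head, tail;
    memcpy(&head, p, 4);            /* like movl (%rdi), %r8d */
    memcpy(&tail, p + n - 4, 4);    /* like movl -4(%rdi, %rdx), %edi */
    uint64_t v = shift_tail
        ? ((uint64_t)tail << 32) | head   /* head stays in the low half */
        : ((uint64_t)head << 32) | tail;  /* head moved to the high half */
    return swap64(v);
}

/* Compare two buffers of 4..7 bytes; returns -1, 0 or 1. */
static int cmp_4_7(const void *a, const void *b, size_t n, int shift_tail)
{
    uint64_t x = key_4_7(a, n, shift_tail);
    uint64_t y = key_4_7(b, n, shift_tail);
    if (x == y)
        return 0;
    return x < y ? -1 : 1;
}
```

With the 5-byte buffers {1, 9, 0, 0, 0} and {2, 0, 0, 0, 0}, the
shift_tail variant returns -1, which agrees with memcmp, while the other
variant returns 1 because the trailing bytes end up being compared before
the first byte.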

Thanks,
Florian

