This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] stdlib-bsearch: middle element calculation may overflow


On Thu, Mar 16, 2017 at 11:32 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> so I modified the test I posted earlier. moved array to .bss, made array
> size reasonable (so there is no overflow). each ./a.out executes bsearch()
> 10000000 times.

Thanks for this.  It might also be useful to show the codegen
difference, which (on x86-64) is

-bsearch_stock:
+bsearch_patched:
 .LFB10:
     .cfi_startproc
     pushq    %r15
@@ -43,9 +43,11 @@
     cmpq    %r13, %r14
     jnb    .L6
 .L9:
-    leaq    (%r14,%r13), %rbx
+    movq    %r13, %rbx
     movq    (%rsp), %rdi
+    subq    %r14, %rbx
     shrq    %rbx
+    addq    %r14, %rbx
     movq    %rbx, %r15
     imulq    %r12, %r15
     addq    8(%rsp), %r15

This probably hurts by extending the critical dependency chain, as
well as by issuing more instructions.  I would like to know what the
codegen difference looks like on more RISCy architectures (maybe MIPS
and AArch64?) -- I have to wonder whether a CPU that didn't have
complex LEA would wind up generating essentially the same code either
way.

zw


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]