This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: Review decision to inline mempcpy to memcpy.
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: "Carlos O'Donell" <carlos at redhat dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>, Ondrej Bilka <neleai at seznam dot cz>, "Joseph S. Myers" <joseph at codesourcery dot com>, Wilco Dijkstra <wdijkstr at arm dot com>, Jakub Jelinek <jakub at redhat dot com>, Jeff Law <law at redhat dot com>
- Date: Thu, 3 Mar 2016 12:55:38 -0800
- Subject: Re: Review decision to inline mempcpy to memcpy.
- References: <56D856F2 dot 4020000 at redhat dot com>
On Thu, Mar 3, 2016 at 7:23 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> The upstream gcc PR/70055 requests that glibc revert commit
> 05a910f7b420c2b831f35ba90e61c80f001c0606 and instead work
> with gcc to make the builtin mempcpy better (for various
> aspects of performance).
>
> The crux of the argument is that the compiler may be able
> to do a better job of optimizing if it knows the call was
> a mempcpy as opposed to memcpy + addition.
I opened:
https://sourceware.org/bugzilla/show_bug.cgi?id=19759
> I understand the need of glibc machine maintainers to produce
> a library whose performance is as optimal as they can given
> the compilers we have today. This leads to decisions like those
> we made to transform mempcpy to memcpy + addition.
>
> I also understand the worries that the compiler developers see
> in such a transformation. Information has been lost to the
> compiler that might have generated better code.
>
> Is there a middle ground here? Should machines with mempcpy.S,
> particularly x86_64, define _HAVE_STRING_ARCH_mempcpy in their
> string.h? I think this question wasn't clearly answered before
> the patch went in. However, the microbenchmarks show this is
> a clear gain given modern compilers for x86_64. Is this because
> x86_64's mempcpy.S is flawed? Does it need to be fixed as
> opposed to transforming mempcpy to memcpy+addition?
>
> Ondrej,
>
> Did you file gcc bugs to review the optimization issues
> with current mempcpy as suggested in [2] and [3]?
>
> Wilco,
>
> Were the changes in glibc to optimize mempcpy as memcpy
> originally motivated by performance for ARM?
Can we make it opt-in?
> Particularly because ARM does not have an optimized
> mempcpy implementation in glibc?
It is 3 lines of code for x86-64:
#ifdef USE_AS_MEMPCPY
        add %rdx, %rax
#endif
How hard will ARM be?
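For readers less familiar with the assembly: in the x86-64 calling convention %rdx carries the length argument and %rax the return value, so the add above turns memcpy's return value (dest) into dest + n. A rough C sketch of the same "memcpy + addition" idea follows. The name my_mempcpy is hypothetical, chosen only for illustration; this is not glibc's actual inline definition.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only, not glibc's definition.
   Copies n bytes like memcpy, but returns a pointer to the byte just
   past the last one written (dest + n) instead of dest. */
static void *my_mempcpy(void *dest, const void *src, size_t n)
{
    return (char *)memcpy(dest, src, n) + n;
}
```

A caller can chain copies with it, e.g. p = my_mempcpy(p, "abc", 3); p = my_mempcpy(p, "def", 3);, without tracking the offset separately.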
--
H.J.