This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
RE: [PATCH] Inline mempcpy
- From: "Wilco Dijkstra" <wdijkstr at arm dot com>
- To: 'Ondřej Bílka' <neleai at seznam dot cz>
- Cc: "'Joseph Myers'" <joseph at codesourcery dot com>, <libc-alpha at sourceware dot org>, "'Carlos O'Donell'" <carlos at redhat dot com>, <munroesj at linux dot vnet dot ibm dot com>
- Date: Wed, 5 Aug 2015 16:41:07 +0100
- Subject: RE: [PATCH] Inline mempcpy
> Ondřej Bílka wrote:
> On Wed, May 20, 2015 at 12:53:24PM +0100, Wilco Dijkstra wrote:
> > > Joseph Myers wrote:
> > > On Mon, 18 May 2015, Wilco Dijkstra wrote:
> > >
> > > This seems plausible, subject to getting per-architecture agreement (for
> > > each architecture with mempcpy.S) on whether to define
> > > _HAVE_STRING_ARCH_mempcpy. Although there may be the question of whether
> > > __extern_always_inline should be defined at all for !__GNUC_PREREQ (3,2)
> > > (i.e. when the always_inline attribute isn't supported).
> >
> > It would be good to fix the *always_inline defines, but for now I've added
> > an extra check for __GNUC_PREREQ (3,2) to be sure we don't fail to inline on
> > really old GCCs. So here's the actual patch - I've disabled inlining for SPARC,
> > and from previous comments it seems people prefer the inline mempcpy on x64/x86
> > and PPC (I've included the maintainers for those arches to agree/veto).
> >
> > OK for commit?
> >
> > Wilco
> >
> Adhemerval already acked this for powerpc in this thread.
> For x64 I obviously agree. I added an optimized memcpy. As for mempcpy, I
> submitted a patch, then forgot about it after about ten pings.
>
> So on x64 on Sandy Bridge this will improve performance by around 50% on
> larger strings, as you can see even in benchtest.
OK, I have now committed this in 2.23 (05a9..0606).
Wilco
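
For readers following the thread, here is a minimal sketch of the idea being discussed: expressing mempcpy as an inline wrapper around memcpy, so that an optimized memcpy is reused. It is not the committed patch. The names sketch_mempcpy and SKETCH_HAVE_ARCH_MEMPCPY are hypothetical; the macro only stands in for the per-architecture opt-out (_HAVE_STRING_ARCH_mempcpy) mentioned above, and the __GNUC_PREREQ (3,2) check from the thread is left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opt-out macro, standing in for the per-architecture
   _HAVE_STRING_ARCH_mempcpy discussed in the thread.  An architecture
   that prefers its own assembly mempcpy would define it.  */
#ifndef SKETCH_HAVE_ARCH_MEMPCPY

/* Hypothetical name, chosen to avoid clashing with the real mempcpy.
   memcpy returns DEST; mempcpy returns the byte just past the last
   byte written, so the wrapper only has to add N.  */
static inline void *
sketch_mempcpy (void *dest, const void *src, size_t n)
{
  return (char *) memcpy (dest, src, n) + n;
}

#endif
```

The returned pointer makes chained copies convenient: the result of one call can be passed as the destination of the next, without recomputing the offset.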