This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Inline mempcpy


On Tue, May 26, 2015 at 11:57:10AM +0100, Wilco Dijkstra wrote:
> > Ondřej Bílka wrote:
> > On Wed, May 20, 2015 at 12:53:24PM +0100, Wilco Dijkstra wrote:
> > > > Joseph Myers wrote:
> > > > On Mon, 18 May 2015, Wilco Dijkstra wrote:
> > > >
> > > > This seems plausible, subject to getting per-architecture agreement (for
> > > > each architecture with mempcpy.S) on whether to define
> > > > _HAVE_STRING_ARCH_mempcpy.  Although there may be the question of whether
> > > > __extern_always_inline should be defined at all for !__GNUC_PREREQ (3,2)
> > > > (i.e. when the always_inline attribute isn't supported).
> > >
> > > It would be good to fix the *always_inline defines, but for now I've added
> > > an extra check for __GNUC_PREREQ (3,2) to be sure we don't fail to inline on
> > > really old GCCs. So here's the actual patch - I've disabled inlining for SPARC,
> > > and from previous comments it seems people prefer the inline mempcpy on x64/x86
> > > and PPC (I've included the maintainers for those arches to agree/veto).
> > >
> > > OK for commit?
> > >
> > > Wilco
> > >
> > Adhemerval already acked this for powerpc in this thread.
> > For x64 I obviously agree. I added optimized memcpy. As for mempcpy, I
> > submitted a patch, then forgot about it after about ten pings.
> 
> Yes, people generally agree this is the right thing to do but I still
> need to get agreement from the arch maintainers for this specific patch.
> 
> > So on x64 on Sandy Bridge this will improve performance by around 50% on
> > larger strings, as you can see even in benchtest.
> 
> Are you talking about your new memcpy here or the mempcpy patch? I don't think
> my patch will affect performance in simplistic tests - it reduces I-cache/BP
> pressure, which is not what benchtest tries to measure.
> 
It will affect performance that much, as I explained. The situation now is
that memcpy is optimized but mempcpy is not. By using memcpy, mempcpy also
becomes optimized, which would lead to the 50% gain I mentioned.
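
For illustration, the idea of expressing mempcpy through memcpy can be
sketched roughly as below. This is only a minimal sketch, not the text of the
patch: the names sketch_mempcpy and join_two are hypothetical, and the real
header change would additionally be guarded by checks such as
_HAVE_STRING_ARCH_mempcpy and __GNUC_PREREQ (3,2) as discussed above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an inlined mempcpy: call memcpy, which
   returns DEST, and advance the result by N bytes so it points just
   past the last byte written.  */
static inline void *
sketch_mempcpy (void *dest, const void *src, size_t n)
{
  return (char *) memcpy (dest, src, n) + n;
}

/* Hypothetical usage helper: concatenate A and B into DST, which the
   caller must size to hold both strings plus the terminator.  Returns
   a pointer to the terminating null byte.  */
static char *
join_two (char *dst, const char *a, const char *b)
{
  char *p = sketch_mempcpy (dst, a, strlen (a));
  p = sketch_mempcpy (p, b, strlen (b));
  *p = '\0';
  return p;
}
```

With a definition of this shape, every mempcpy call ends up in whatever
memcpy the architecture provides, so any optimization of memcpy carries over
and no separate mempcpy code has to sit in the I-cache.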

