This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

RE: [PATCH] Improve performance of strncpy

From: "Wilco Dijkstra" <wdijkstr at arm dot com>
To: "'Rich Felker'" <dalias at libc dot org>, "Florian Weimer" <fweimer at redhat dot com>
Cc: <azanella at linux dot vnet dot ibm dot com>, <libc-alpha at sourceware dot org>
Date: Wed, 10 Sep 2014 19:25:21 +0100
Subject: RE: [PATCH] Improve performance of strncpy
Authentication-results: sourceware.org; auth=none
References: <001301cfcd0a$f0b62670$d2227350$ at com> <54108BB0 dot 90902 at redhat dot com> <20140910180144 dot GK23797 at brightrain dot aerifal dot cx>

> Rich Felker wrote:
> On Wed, Sep 10, 2014 at 07:34:40PM +0200, Florian Weimer wrote:
> > On 09/10/2014 05:21 PM, Wilco Dijkstra wrote:
> > >Yes, you're right, I timed it and there is actually little difference, while
> > >the code is now even simpler. New version below (not attaching results in bad
> > >characters due to various mail servers changing line endings).
> > >
> > >OK for commit?
> >
> > I think you could simplify it down to strnlen, memcpy, and memset.
> 
> I don't think that's an improvement, at least not in the general case.
> It involves iterating twice over the source string, which for long
> strings could mean blowing the whole cache twice and fetching from
> main memory twice. There's a good reason that string operations are
> usually implemented to perform the copy and length computation
> together in a single pass.
> 
> Rich

Few strings will be larger than the typical L1 cache size of 32KB. You're right
that a single pass is best in a highly optimized implementation.
However, the issue is that the C versions are so slow that even doing 2
passes will be significantly faster, because each pass processes 8 bytes at
a time. This is likely true even for strings much larger than L1 (I'll check
that).
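
For illustration, the strnlen/memcpy/memset formulation suggested above could
look roughly like the following. This is a minimal sketch under stated
assumptions, not the patch under discussion: the name strncpy_sketch is
hypothetical, and any speedup depends on the target's strnlen, memcpy and
memset working on whole words rather than single bytes.

```c
#define _POSIX_C_SOURCE 200809L  /* strnlen is POSIX.1-2008, not ISO C */
#include <stddef.h>
#include <string.h>

/* Hypothetical name; intended to follow the same contract as strncpy. */
char *strncpy_sketch(char *dst, const char *src, size_t n)
{
    /* Pass 1: length of src, reading at most n bytes. */
    size_t len = strnlen(src, n);

    /* Zero-fill the tail when src is shorter than n. */
    if (len != n)
        memset(dst + len, '\0', n - len);

    /* Pass 2: copy the string bytes; memcpy returns dst. */
    return memcpy(dst, src, len);
}
```

As with strncpy, dst is not null-terminated when src has n or more bytes
before its terminator.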

The goal of these patches is to ensure the C string routines are quite
competitive out of the box, and benefit further when you add a few highly
optimized routines (e.g. strlen/strcpy). That means new targets are not
forced to add optimized versions of all of the string routines in order to
get decent performance (as unfortunately is the case today).

Wilco