This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



RE: Gcc builtin review: strcpy, stpcpy, strcat, stpcat?


> Ondřej Bílka wrote:
> On Thu, Jun 04, 2015 at 02:50:07PM +0100, Wilco Dijkstra wrote:

> > The usual problem of knowing whether all targets define assembler versions of
> > stpcpy applies - so I don't think it is a good idea to change all strcpy into
> > stpcpy in general. The only useful case is strcpy(x,y)+strlen(x) which could
> > potentially give a major speedup.
> >
> Then it's a situation where the decision depends on implementation details,
> as on some architectures you could save some cycles with stpcpy itself.

Yes, I think the optimization to convert strcpy into stpcpy would need
to be done in a target specific way in GLIBC headers for targets where it
makes sense. It's not something you could easily do in GCC as stpcpy is
not a standard C function. In general it is best to optimize to use simpler,
standard C90 functions (e.g. mempcpy->memcpy, even though mempcpy might
be a better ABI to standardize on).
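
As a minimal sketch of the strcpy(x,y)+strlen(x) case mentioned above: the
two helpers below (hypothetical names, not from GLIBC or GCC) return the same
end-of-string pointer, but the first walks the copied string twice while the
second does it in one pass. It assumes a libc that declares stpcpy under
POSIX.1-2008.

```c
#define _POSIX_C_SOURCE 200809L  /* stpcpy is POSIX.1-2008, not ISO C */
#include <assert.h>
#include <string.h>

/* Hypothetical helper: two passes over the string (copy, then strlen). */
char *copy_end_two_pass(char *dst, const char *src)
{
    return strcpy(dst, src) + strlen(dst);
}

/* Hypothetical helper: one pass; stpcpy returns a pointer to the
   terminating NUL it wrote into dst. */
char *copy_end_one_pass(char *dst, const char *src)
{
    return stpcpy(dst, src);
}

/* Both forms must yield the same end offset and the same contents. */
void check_equivalence(void)
{
    char a[32], b[32];
    char *ea = copy_end_two_pass(a, "libc-alpha");
    char *eb = copy_end_one_pass(b, "libc-alpha");
    assert(ea - a == eb - b);
    assert(ea - a == 10);
    assert(strcmp(a, b) == 0);
}
```

Whether the one-pass form is actually faster depends on how well the target's
stpcpy is implemented, which is the implementation-detail concern raised above.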

> As for useful cases, on the gcc thread I said that gcc could use the available
> length to convert strchr to memchr and make similar optimizations, so strcpy
> will be called more.
>
> As for the cache issues I mentioned, so far I have measured mostly noise. I know
> that overall stpcpy is often called five times less than strcpy, so the
> potential is there, but it depends on the actual savings when strcpy costs
> a cycle less.
> Data about strcpy and stpcpy when running make of zlib with Debian gcc-5
> follow:
> 
> ./summary_strcpy calls 52218 average n:   71.0
> ./summary_stpcpy calls 4950 average n:    7.5

This says that stpcpy processes only about 1% of the data that strcpy does
(calls times average n), so optimization of strcpy is roughly 100 times more
important. I.e. slowing down strcpy just to share code with stpcpy does not
make any sense.
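
The 1% figure can be rechecked from the quoted numbers. This is a small
sketch that assumes "average n" is the average string length in bytes per
call, so calls times n approximates the total data handled; the function
names are made up for illustration.

```c
/* Approximate bytes handled: call count times average length n. */
static double total_bytes(double calls, double avg_n)
{
    return calls * avg_n;
}

/* Share of the data processed by stpcpy relative to strcpy,
   using the zlib build numbers quoted above. */
double stpcpy_share(void)
{
    double strcpy_bytes = total_bytes(52218, 71.0); /* 3707478 */
    double stpcpy_bytes = total_bytes(4950, 7.5);   /* 37125 */
    return stpcpy_bytes / strcpy_bytes;             /* about 0.010 */
}
```

So stpcpy handles roughly one hundredth of the bytes that strcpy does in this
workload, even though it accounts for closer to a tenth of the calls.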

Also given the relatively small strings the generic version of stpcpy would 
be quite competitive already (the generic version using strlen+memcpy was
beating optimized strcpy/stpcpy implementations on several targets at the
time I made the change). So I'm just not convinced stpcpy needs a lot more
optimization.
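
For reference, a generic stpcpy built on strlen+memcpy can be sketched as
below. This illustrates the approach only and is not the actual GLIBC source;
the name generic_stpcpy is hypothetical and chosen to avoid clashing with the
libc symbol.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of a generic stpcpy: find the length once, then let the
   (usually well optimized) memcpy do the copy. */
char *generic_stpcpy(char *dst, const char *src)
{
    size_t len = strlen(src);
    memcpy(dst, src, len + 1);  /* len + 1 also copies the terminating NUL */
    return dst + len;           /* pointer to the NUL in dst */
}
```

For short strings such as the 7.5-byte average above, the cost is dominated by
call overhead, so this version leans entirely on how fast the target's strlen
and memcpy are.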

Wilco


