This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: PowerPC LE strlen


On Tue, Aug 13, 2013 at 03:53:48PM -0500, Will Schmidt wrote:
> > -	nor	rTMP1, rTMP2, rTMP1
> > -	and.	rWORD1, rTMP1, rMASK
> 
> > +	nor	rTMP3, rTMP2, rTMP1
> > +	and.	rTMP3, rTMP3, rMASK
> 
> ^ For this and related changes, is this clean-up such that it's easier
> to read, or is there an underlying improvement in how we were using the
> involved registers? 

The LE tail uses the result of the "and"s in the main loop.  Since
they were both originally in rTMP1, I changed the "and" result for the
second word, thinking that was necessary.  It isn't in the case of
non-power7 strlen (a fact I only just realised) because we have two
distinct tails, in contrast with many other string/memory functions
that handle two or more words in the main loop yet have a single exit
from the loop.  However, it is necessary to renumber rTMP1 to
something other than r0 since I want to subtract one from rTMP1 in the
LE tail and "addi" is preferable to "addic".  It's also necessary to
change regs used in the entry path so that the "and" results are in
the same reg as that in the loop.
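
As a rough illustration of the two points above, here is a minimal C
sketch; it is not the code from the patch.  It assumes a 32-bit word and
the common "mark zero bytes with 0x80" test, and the names
zero_byte_mask, first_zero_index_le and addi_model are hypothetical.
The first two functions show how a little-endian tail can take the
"and" result, subtract one, and count the bits below the lowest flag to
get the index of the first zero byte.  The third models the Power ISA
rule that an RA field of 0 in "addi" means the constant zero rather than
the contents of r0, which is why a value that is to be decremented with
"addi" cannot live in r0.  ("addic" does read r0, but it also writes the
carry bit.)

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: set 0x80 in each byte position of w that holds
   a zero byte.  Bytes above the first zero byte can be flagged falsely
   because of borrow propagation; the lowest flag is always exact.  */
static uint32_t zero_byte_mask(uint32_t w)
{
    return (w - 0x01010101u) & ~w & 0x80808080u;
}

/* Little-endian tail: the first byte in memory is the least
   significant one, so the lowest set flag marks the first zero byte.
   (mask - 1) & ~mask leaves ones only below that flag; counting them
   and dividing by 8 gives the byte index.  mask must be nonzero.  */
static unsigned first_zero_index_le(uint32_t mask)
{
    assert(mask != 0);
    uint32_t below = (mask - 1u) & ~mask;
    unsigned bits = 0;
    while (below != 0) {
        bits++;
        below >>= 1;
    }
    return bits >> 3;
}

/* Model of the source operand of "addi RT,RA,SI": an RA field of 0
   selects the literal value 0, not the contents of gpr[0].  */
static uint32_t addi_model(const uint32_t gpr[32], unsigned ra, int16_t si)
{
    uint32_t base = (ra == 0) ? 0u : gpr[ra];
    return base + (uint32_t)(int32_t)si;
}
```

With the same value in r0 and in another register, the model gives a
decremented result only for the other register; with r0 as the source
the result is just the sign-extended immediate.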

So one of the changes here is a consequence of poking at a number of
other functions before I looked at non-power7 strlen.  I'm not aware
of any case where using a different gpr produces different timing.

-- 
Alan Modra
Australia Development Lab, IBM

