This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.



Re: [PATCH]: Performance improve on ARM memset


On Tue, 2008-10-28 at 22:08 -0400, Daniel Jacobowitz wrote:
> On Tue, Oct 28, 2008 at 05:00:10PM -0700, Min Zhang wrote:
> > This patch improves the execution time of the memset.  Tested by "time"  
> > shell utility on the following test program. The patch reduced execution  
> > time by 50%. Also sanity tested the memset with length from 0 byte to  
> > 1000 bytes, just to make sure it doesn't memset any extra or less bytes.
> 
> Thanks for the patch.
> 
> On what processor did you time it?  You're probably optimizing for a
> different processor than Philip was; the difficulty with changing
> memory functions is not slowing them down for a different CPU or block
> size.
> 
> Unfortunately, Phil didn't say where he tested; we'll probably have to
> benchmark this change on a couple of targets.

I tried a quick test on arm920t and StrongARM (being the two platforms I
happened to have on hand) and couldn't detect any measurable difference
with and without the patch.  So, on those two, it seems to be basically
a wash.

I think the original removal of STM was motivated by PXA25x
specifically, though the same issue probably exists for all XScale-based
processors.  See for example section A.5.1.2 of the "Intel XScale Core
Developer's Manual" which describes the timings for STM and LDM relative
to other kinds of loads and stores.  Unfortunately I don't have any
working PXA255 hardware on my desk right now so I can't easily re-run
the benchmark on that platform, but judging from the documentation it
does look like STM will be significantly slower for writes that hit in
the dcache.  For writes that miss in the cache I think the difference
will probably be lost in the noise since the memory bus bandwidth will
be the limiting factor.

p.
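
For anyone wanting to repeat the kind of measurement discussed in this
thread, here is a minimal sketch in C. It is not the test program from the
original patch submission, which is not reproduced here. The function names
(check_memset_bounds, time_memset), the guard size, and the fill patterns
are all made up for illustration. The first function mirrors the sanity
test described in the patch mail (lengths 0 to 1000, no bytes written
outside the requested range) and additionally varies the start alignment,
which is an assumption on my part. The second times repeated memset calls
on a buffer of a given size, so a cache-resident buffer can be compared
with one that is much larger than the dcache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define GUARD      16    /* untouched bytes expected on each side (arbitrary) */
#define MAX_OFFSET 8     /* start offsets 0..7, to exercise unaligned heads */
#define MAX_LEN    1000  /* largest length checked, as in the patch mail */

/* Returns the number of (offset, length) pairs for which memset wrote
   too few or too many bytes; 0 means every case passed. */
static int check_memset_bounds(void)
{
    unsigned char buf[GUARD + MAX_OFFSET + MAX_LEN + GUARD];
    int failures = 0;

    for (size_t off = 0; off < MAX_OFFSET; off++) {
        for (size_t len = 0; len <= MAX_LEN; len++) {
            size_t first = GUARD + off;   /* first byte memset may touch */
            int bad = 0;

            /* Pre-fill with a pattern that differs from the fill value. */
            for (size_t i = 0; i < sizeof buf; i++)
                buf[i] = 0xAA;

            memset(buf + first, 0x55, len);

            /* Every byte inside the range must be 0x55, every other 0xAA. */
            for (size_t i = 0; i < sizeof buf; i++) {
                int inside = (i >= first && i < first + len);
                unsigned char want = inside ? 0x55 : 0xAA;
                if (buf[i] != want)
                    bad = 1;
            }
            failures += bad;
        }
    }
    return failures;
}

/* CPU seconds spent calling memset on a `size`-byte buffer `reps` times.
   Returns -1.0 if the buffer cannot be allocated. */
static double time_memset(size_t size, long reps)
{
    assert(size > 0);

    unsigned char *buf = malloc(size);
    if (buf == NULL)
        return -1.0;

    volatile unsigned char sink = 0;
    clock_t start = clock();
    for (long r = 0; r < reps; r++) {
        memset(buf, (int)(r & 0xff), size);
        sink ^= buf[size - 1];  /* read back so the stores are not elided */
    }
    clock_t end = clock();

    free(buf);
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To use it, call check_memset_bounds() once and expect 0, then compare
something like time_memset(4096, 1000000) against time_memset(4 << 20, 1000).
Those sizes are only guesses at "fits in the dcache" and "does not"; pick
values to suit the actual cache size of the target. The timing uses clock(),
which is coarse, so the repetition counts need to be large enough to run
for at least a few seconds.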


