This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.



Re: [PATCH]: Performance improve on ARM memset


Phil Blundell wrote:
I tried a quick test on arm920t and StrongARM (being the two platforms I
happened to have on hand) and couldn't detect any measurable difference
with and without the patch.  So, on those two, it seems to be basically
a wash.

I think the original removal of STM was motivated by PXA25x
specifically, though the same issue probably exists for all XScale-based
processors.  See for example section A.5.1.2 of the "Intel XScale Core
Developer's Manual" which describes the timings for STM and LDM relative
to other kinds of loads and stores.  Unfortunately I don't have any
working PXA255 hardware on my desk right now so I can't easily re-run
the benchmark on that platform, but judging from the documentation it
does look like STM will be significantly slower for writes that hit in
the dcache.  For writes that miss in the cache I think the difference
will probably be lost in the noise since the memory bus bandwidth will
be the limiting factor.

p.


stmia versus str+str is also a wash on the following PXA250 (Intel DBPXA26x development platform), so for now the patch doesn't seem to make things worse there, while it does improve OMAP2430. The latest kernel arch/arm/lib/memset.S also uses stmia.

Processor       : XScale-PXA250 rev 4 (v5l)
BogoMIPS        : 198.65
Features        : swp half thumb fastmult edsp
CPU implementer : 0x69
CPU architecture: 5TE
CPU variant     : 0x0
CPU part        : 0x290
CPU revision    : 4
Cache type      : undefined 5
Cache clean     : undefined 5
Cache lockdown  : undefined 5
Cache unified   : Harvard
I size          : 32768
I assoc         : 32
I line length   : 32
I sets          : 32
D size          : 32768
D assoc         : 32
D line length   : 32
D sets          : 32
Hardware        : Intel DBPXA250 Development Platform
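
For anyone who wants to repeat this kind of comparison, here is a minimal timing sketch in C. It is not the benchmark used for the numbers discussed in this thread; the helper name time_memset, the buffer sizes and the use of clock() are all assumptions made for illustration. The idea is to time memset over a buffer that fits in the 32768-byte D-cache reported above (writes that mostly hit) and over a much larger one (writes that mostly miss), once with each memset build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* D-cache size taken from the "D size" line of the cpuinfo dump above. */
#define DCACHE_BYTES 32768u

/*
 * Hypothetical helper (not part of glibc or the patch): returns the CPU
 * time in seconds spent on `reps` calls to memset over `len` bytes, or a
 * negative value if `len` is zero or the allocation fails.
 */
static double time_memset(size_t len, unsigned reps)
{
    /* Calling through a volatile function pointer keeps the compiler
       from inlining or eliding the memset calls being measured. */
    void *(*volatile fill)(void *, int, size_t) = memset;
    unsigned char *buf;
    unsigned char expected = 0;
    clock_t start, end;
    unsigned i;

    if (len == 0)
        return -1.0;
    buf = malloc(len);
    if (buf == NULL)
        return -1.0;

    /* Touch every page once so page faults are not part of the timing. */
    memset(buf, 0, len);

    start = clock();
    for (i = 0; i < reps; i++) {
        expected = (unsigned char)(i & 0xff);
        fill(buf, expected, len);
    }
    end = clock();

    /* Sanity check: the last fill value must be present at both ends. */
    assert(buf[0] == expected);
    assert(buf[len - 1] == expected);

    free(buf);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A run might call time_memset(DCACHE_BYTES / 2, 100000) for the cache-resident case and time_memset(32 * DCACHE_BYTES, 2000) for the cache-missing case, then compare bytes per second between the stmia and str+str builds. The repetition counts are guesses and would need tuning so that each measurement runs well above the resolution of clock().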

