This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.



Re: [PATCH] Optimize MIPS memcpy


On Mon, 15 Oct 2012, Steve Ellcey wrote:

> > On:
> > system type             : EBB6300 (CN6335p2.1-1500-AAP)
> > processor               : 0
> > cpu model               : Cavium Octeon II V0.9
> > 
> > I get:
> > ...
> > 0x200757cb, (no zeros)
> > 0x200757cc, (no zeros)
> > 0x200757cd, (no zeros)
> > 0x200757ce, (no zeros)
> > 0x200757cf, (0x20075780 to 0x20075800, 128 byte prefetch)
> > 
> > Thanks,
> > Andrew Pinski
> 
> Andrew,
> 
> Is there a macro I can/should use when building glibc/memcpy to know
> that it should assume a Cavium Octeon with 128 byte prefetch instead of
> the 32 byte prefetch?

 FWIW I don't think hardcoding the cache line size for individual
processor types is going to scale, not to mention that it may not serve
its purpose at all, given that the cache line size may be boot-mode or
even run-time configurable in a vendor-specific way (some MTI cores, for
example, use CP0.Config.WC for cache topology reconfiguration, although
the currently available implementations do not seem to include the line
sizes among the reconfigurable parameters).

 This looks to me like a case for multiple copies of memcpy binary code,
each tuned for an individual cache line size and then selected via the
IFUNC feature -- there should be no run-time penalty for doing that in
dynamic executables/libraries (except perhaps within libc itself) as the
call is going to be made through the GOT anyway.  Of course the line size
needs to be determined somehow at the first invocation -- perhaps the
appropriate bits from the CP0 Config1/2 registers could be exported by
the kernel.
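
 As a rough illustration of the dispatch idea above, here is a minimal C
sketch.  It is not glibc code: it uses a lazily resolved function pointer
as a portable stand-in for a real IFUNC resolver, the memcpy_line*
variants and read_config1() are hypothetical placeholders, and how user
space would actually obtain the Config1 value is left open.  The DL field
decoding (bits 12:10, line size 2 << DL bytes, 0 meaning no D-cache)
follows the MIPS32 Config1 layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Hypothetical variants.  In a real port each would be a copy of memcpy
   tuned for one line size; here they all just call memcpy.  */
static void *memcpy_generic(void *d, const void *s, size_t n) { return memcpy(d, s, n); }
static void *memcpy_line16(void *d, const void *s, size_t n)  { return memcpy(d, s, n); }
static void *memcpy_line32(void *d, const void *s, size_t n)  { return memcpy(d, s, n); }
static void *memcpy_line128(void *d, const void *s, size_t n) { return memcpy(d, s, n); }

/* Decode the D-cache line size in bytes from a CP0 Config1 value.
   DL is bits 12:10; 0 means no D-cache, otherwise the size is 2 << DL.  */
static unsigned dcache_line_size(uint32_t config1)
{
    unsigned dl = (config1 >> 10) & 7u;
    return dl ? 2u << dl : 0u;
}

/* Pick a variant for an exact line size; anything else gets the generic
   copy, which is assumed not to depend on the line size.  */
static memcpy_fn select_memcpy(unsigned line_size)
{
    switch (line_size) {
    case 16:  return memcpy_line16;
    case 32:  return memcpy_line32;
    case 128: return memcpy_line128;
    default:  return memcpy_generic;
    }
}

/* Hypothetical probe: stands in for whatever interface would export
   Config1 to user space.  Returns 0 (unknown) in this sketch.  */
static uint32_t read_config1(void)
{
    return 0;
}

static void *memcpy_resolve(void *d, const void *s, size_t n);

/* Starts out pointing at the resolver, much like an unresolved GOT slot.  */
static memcpy_fn memcpy_impl = memcpy_resolve;

/* Runs on the first call only: selects a variant, then forwards to it.  */
static void *memcpy_resolve(void *d, const void *s, size_t n)
{
    memcpy_impl = select_memcpy(dcache_line_size(read_config1()));
    assert(memcpy_impl != memcpy_resolve);
    return memcpy_impl(d, s, n);
}

void *my_memcpy(void *d, const void *s, size_t n)
{
    return memcpy_impl(d, s, n);
}
```

 With a real IFUNC the dynamic linker would call the resolver at
relocation time, so the extra indirection in my_memcpy() would go away.
The lazy update of memcpy_impl here is also not synchronised between
threads, which a real implementation would have to address.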

 If storage/memory footprint is a concern, then perhaps for -Os builds
(is that supported for glibc these days anyway?) only a single copy of
memcpy could be built.

 BTW, the M14Kc only has a 16-byte cache line size, so it will need 
another arrangement.

 Thoughts?

  Maciej

