This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Optimize MIPS memcpy


n 31/10/2012, at 6:45 AM, Steve Ellcey wrote:

> On Tue, 2012-10-30 at 20:16 +1300, Maxim Kuvyrkov wrote:
>> 
...
>> I have tested your latest version.  Good news: there are no correctness issues.  Bad news: it underperforms compared to my patch by 2-3 times on both N32 and N64 (didn't test O32) on the benchmark that I used.  I've run the benchmark several times and results are consistent.  I use oprofile on libc.so to determine how much time is spent in memcpy.
>> 
>> Would you please confirm that your current implementation is faster on YOUR benchmark than my patch in http://sourceware.org/ml/libc-ports/2012-09/msg00000.html ?  Please make sure that PREFETCH macro in ports/sysdeps/mips/sys/asm.h gets defined to "pref", not "nop", in your build.
>> 
>> Thanks,
>> 
>> --
>> Maxim Kuvyrkov
>> CodeSourcery / Mentor Graphics
> 
> Maxim, With O32 ABI I am seeing my version as slightly faster for large
> memcpy's and slightly slower for small memcpy's compared to yours.
> 
> With N32 and 64 ABI's I see my version as slightly faster across the
> board (a couple of percentage points).  I am definitely not seeing
> anything like a 2X difference.  Are you sure prefetch is defined when
> you tested my version?  How about using double loads and stores?  They
> should both get set by default.

It turns out I was benchmarking my patch against original glibc implementation, not yours (patched files in ports/ instead of libc/ports).  With the patch applied correctly, the performance is virtually the same on my benchmark.  I've also checked the assembly dump of libc.so and confirmed that prefetch instructions and 8-byte loads/store are used where appropriate.

Given that your patch provides on par or better performance than mine, and it also unifies MIPS memcpy for all ABIs (as well as between glibc and Bionic!) -- I am all for your patch.

I've reviewed you patch -- code is clean and well-documented.  Please apply the patch if sufficient testing has been done: big- and little-endian for o32/n32/n64 ABIs.  I've tested your patch for all big-endian ABIs, so you just need to cover little-endian (which, I think, you may have done already).

Thanks for bearing with me through all the debugging process!

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]