This is the mail archive of the newlib@sourceware.cygnus.com mailing list for the newlib project.



Re: memcpy performance (fwd)



I thought I would pass this on.  Does the new version of memcpy do much
better than this?

---------- Forwarded message ----------
Date: Tue,  9 Dec 97 12:03:28 -0600
From: Eric Norum <eric@skatter.USask.Ca>
To: rtems-list@oarcorp.com
Subject: Re: memcpy performance

It's even worse than just a byte-by-byte copy!

On the 971024 snapshot (gen68360 BSP) a call to memcpy produces:
	1) A call to bcopy
	2) The bcopy routine links a stack frame and calls memmove
	3) The memmove routine:
		a) links a stack frame
		b) checks for overlap
		c) does a byte-by-byte copy
		   5 instructions/byte on a CPU32 processor!
		
There's a heck of a lot of unnecessary code here:
	Two extra function calls
	Two extra stack frames
	Extra code to check for overlap
	A very inefficient loop
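
For illustration, the call chain described above looks roughly like the following C. This is a sketch only: the sketch_ names are hypothetical stand-ins, not the library's actual source, and the real routines may differ in detail.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three routines in the chain.
   The sketch_ names are illustrative, not the library's source. */

/* Step 3: overlap check, then a byte-by-byte loop. */
static void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    uintptr_t da = (uintptr_t)d, sa = (uintptr_t)s;

    if (da > sa && da - sa < n) {
        /* dst starts inside the source region: copy backwards. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    } else {
        while (n--)
            *d++ = *s++;
    }
    return dst;
}

/* Step 2: bcopy takes (src, dst) and just forwards to memmove. */
static void sketch_bcopy(const void *src, void *dst, size_t n)
{
    sketch_memmove(dst, src, n);
}

/* Step 1: memcpy just forwards to bcopy. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    sketch_bcopy(src, dst, n);
    return dst;
}
```

Unless the compiler inlines the inner calls, every memcpy pays for two extra calls and the overlap test before a single byte moves.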

Processor-independent improvements required:
	1) There should be an explicit memcpy routine.
	2) The library should be compiled with aggressive optimization.
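
A minimal sketch of what an explicit memcpy might look like: no nested calls, no overlap check, and word-sized moves when both pointers happen to be aligned. The name explicit_memcpy and the copy_word typedef are my own placeholders, and the word-copy strategy is an assumption on my part rather than anything taken from the library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical word type for the copy. With GCC-compatible compilers
   the may_alias attribute keeps the word accesses legal under strict
   aliasing; other compilers fall back to a plain long. */
#if defined(__GNUC__)
typedef long __attribute__((__may_alias__)) copy_word;
#else
typedef long copy_word;
#endif

/* Illustrative stand-alone memcpy: the caller guarantees the regions
   do not overlap, so there is no overlap test and no call to memmove. */
void *explicit_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Use word moves only if both pointers are word aligned. */
    if ((((uintptr_t)d | (uintptr_t)s) & (sizeof(copy_word) - 1)) == 0) {
        copy_word *dw = (copy_word *)d;
        const copy_word *sw = (const copy_word *)s;

        while (n >= sizeof(copy_word)) {
            *dw++ = *sw++;
            n -= sizeof(copy_word);
        }
        d = (unsigned char *)dw;
        s = (const unsigned char *)sw;
    }

    /* Remaining tail, or the whole buffer if unaligned. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

The second item is a build setting rather than code, for example compiling the library with -O2 under GCC instead of without optimization.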
	
Processor-dependent improvements that would be nice:	
	M68k - The loop in memmove should be written so that processors
	like the CPU32 can go into loop mode.
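
As I understand it, loop mode needs the loop body to be a single one-word instruction followed by a decrement-and-branch (DBcc) back to it. A C loop shaped like the sketch below at least gives the compiler the chance to produce that. The function name is hypothetical, and whether a given compiler and set of flags really emits the tight two-instruction loop is an assumption that has to be checked in the generated assembly.

```c
#include <stddef.h>

/* Hypothetical helper: a count-down byte copy with a one-statement body. */
void loop_copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
    if (n == 0)
        return;

    do {
        /* Hoped-for code: one post-increment byte move,
           e.g. move.b (a0)+,(a1)+ */
        *dst++ = *src++;
    } while (--n != 0);   /* Hoped-for code: a decrement-and-branch */
}
```

If the compiler will not cooperate, the same loop can be written by hand in assembly for the m68k build.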

Now all we need is a willing volunteer......

---
Eric Norum                                 eric@skatter.usask.ca
Saskatchewan Accelerator Laboratory        Phone: (306) 966-6308
University of Saskatchewan                 FAX:   (306) 966-6058
Saskatoon, Canada.