This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: PATCH: Support inline SSE/SSE2


On Mon, Nov 24, 2003 at 04:00:33PM -0800, Ulrich Drepper wrote:
> H. J. Lu wrote:
> > SSE/SSE2 has float/double sqrt instructions. This patch uses them if
> > SSE/SSE2 is enabled.
> 
> What are the advantages?  Show some code we can time.
> 

It turns out that gcc 3.3 and above can already inline sqrt quite well
when SSE2 is enabled. These are the TSC cycle counts I got with
gcc 3.3-redhat on a 3 GHz P4:

./sse2.sse2
Float: stop TSC (1777098976323730) - start TSC (1777098507396722): 468927008
Double: stop TSC (1777099649963698) - start TSC (1777098976323730): 673639968
Long double: stop TSC (1777101100976106) - start TSC (1777099649963698): 1451012408
./sse2.x87
Float: stop TSC (1777101880069514) - start TSC (1777101102742358): 777327156
Double: stop TSC (1777102598562830) - start TSC (1777101880069514): 718493316
Long double: stop TSC (1777104226940382) - start TSC (1777102598562830): 1628377552
./x87.sse2
Float: stop TSC (1777104871565354) - start TSC (1777104228718874): 642846480
Double: stop TSC (1777105263228530) - start TSC (1777104871565354): 391663176
Long double: stop TSC (1777106161090282) - start TSC (1777105263228530): 897861752
./x87.x87
Float: stop TSC (1777106835277334) - start TSC (1777106163172098): 672105236
Double: stop TSC (1777107221980882) - start TSC (1777106835277334): 386703548
Long double: stop TSC (1777108143253162) - start TSC (1777107221980882): 921272280
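
The attached test.tar.gz is not reproduced in this archive. As a rough
illustration of this kind of measurement only, here is a minimal sketch
that times the SSE2 sqrtsd instruction against the x87 fsqrt instruction
using the TSC. It is not the attached test: the function names, the
DEFINE_TIMER macro, and the loop shape are all assumptions made for the
example, and it requires GCC or a compatible compiler on x86.

```c
#include <emmintrin.h>   /* SSE2 intrinsics: _mm_set_sd, _mm_sqrt_sd */
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc (GCC/Clang on x86) */

/* Square root via the SSE2 sqrtsd instruction.  */
static inline double
sse2_sqrt (double x)
{
  __m128d v = _mm_set_sd (x);
  return _mm_cvtsd_f64 (_mm_sqrt_sd (v, v));
}

/* Square root via the x87 fsqrt instruction.  The operand lives on
   top of the x87 register stack ("t" constraint).  */
static inline double
x87_sqrt (double x)
{
  long double r = x;
  __asm__ ("fsqrt" : "+t" (r));
  return (double) r;
}

/* Hypothetical helper: define a function NAME that calls FN on
   1.0 .. n and returns the elapsed TSC ticks.  The running sum is
   stored through SINK so the loop cannot be optimized away.  */
#define DEFINE_TIMER(name, fn)                  \
  static uint64_t                               \
  name (long n, double *sink)                   \
  {                                             \
    double sum = 0.0;                           \
    uint64_t start = __rdtsc ();                \
    for (long i = 1; i <= n; ++i)               \
      sum += fn ((double) i);                   \
    uint64_t stop = __rdtsc ();                 \
    *sink = sum;                                \
    return stop - start;                        \
  }

DEFINE_TIMER (time_sse2, sse2_sqrt)
DEFINE_TIMER (time_x87, x87_sqrt)
```

Calling time_sse2 (n, &sink) and time_x87 (n, &sink) from main and
printing the two return values gives tick counts comparable in spirit
to the ones above. Raw TSC deltas are noisy, so several runs are needed
before drawing conclusions.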


H.J.
----
2003-11-25  H.J. Lu  <hongjiu.lu@intel.com>

	* sysdeps/i386/fpu/bits/mathinline.h (sqrt): Don't inline
	sqrt for gcc 3.3 and above if SSE2 is enabled.

--- bits/mathinline.h.orig	2003-11-24 16:13:44.000000000 -0800
+++ bits/mathinline.h	2003-11-25 12:05:14.000000000 -0800
@@ -439,8 +439,10 @@ __inline_mathcodeNP2 (fmod, __x, __y, \
 
 
 #ifdef __FAST_MATH__
+# if !__GNUC_PREREQ (3,3) || !defined __SSE2__
 __inline_mathopNP (sqrt, "fsqrt")
 __inline_mathopNP_ (long double, __sqrtl, "fsqrt")
+# endif
 #endif
 
 #if __GNUC_PREREQ (2, 8)
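
To see which way the new guard evaluates for a given compiler and set
of flags, a small standalone check can mirror the condition. This is a
sketch only: GNUC_AT_LEAST is a local stand-in for glibc's
__GNUC_PREREQ, and fsqrt_inline_kept is a hypothetical name that is not
part of the patch.

```c
/* Local stand-in for __GNUC_PREREQ from <features.h>.  */
#if defined __GNUC__ && defined __GNUC_MINOR__
# define GNUC_AT_LEAST(maj, min) \
  ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
#else
# define GNUC_AT_LEAST(maj, min) 0
#endif

/* Return 1 if the guard added by the patch would keep the "fsqrt"
   inline (when __FAST_MATH__ is also defined), or 0 if it would be
   skipped and sqrt left to gcc.  */
static int
fsqrt_inline_kept (void)
{
#if !GNUC_AT_LEAST (3, 3) || !defined __SSE2__
  return 1;
#else
  return 0;
#endif
}
```

With gcc 3.3 or later the result should be 0 when compiling with
-msse2 and 1 without it on i386.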

Attachment: test.tar.gz
Description: GNU Zip compressed data

