This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: PATCH: optimized libm single precision routines: erfcf, erff, expf for x86_64.


On Thu, Feb 16, 2012 at 2:00 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 02/16/2012 12:11 PM, Dmitrieva Liubov wrote:
>> +	movss	%xmm0, -16(%rsp)	/* save SP x*K/log(2)+RS */
>> +	movss	-16(%rsp), %xmm1	/* load SP x*K/log(2)+RS */
>
> What's up with these sorts of obvious compiler-generated bits of silliness?
>
> You stated that you do not plan to provide the C source because you "believe
> that the assembly should be faster."  Given turds like the above, I do not
> accept this assertion without proof.
>
> Given this routine does all scalar code, I don't see why it might not be
> faster for all of the other targets as well.
>
>

This code does look bad:

+	cvtsd2ss	%xmm0, %xmm0	/* SP x*K/log(2)+RS */
+	movss	%xmm0, -16(%rsp)	/* save SP x*K/log(2)+RS */
+	movss	-16(%rsp), %xmm1	/* load SP x*K/log(2)+RS */

These three instructions can be replaced by a single one:

cvtsd2ss	%xmm0, %xmm1
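
To make the reasoning concrete, here is a minimal C sketch (not taken from the patch) of why the store/reload pair adds nothing: cvtsd2ss has already rounded the value to single precision, so pushing it through a stack slot cannot change it. The function names are hypothetical, and the volatile slot only imitates the movss round trip through -16(%rsp). One caveat that has to be checked against the rest of the routine: with the single-instruction form, %xmm0 still holds the double, and the upper bits of %xmm1 are left unchanged rather than zeroed as a movss load from memory would do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: narrow to float, then force a store and a reload,
   imitating cvtsd2ss followed by the movss save/load pair. */
float narrow_via_stack(double v)
{
    volatile float slot = (float)v;  /* convert, then store to memory */
    return slot;                     /* reload from memory */
}

/* Hypothetical helper: narrow to float directly, as a lone cvtsd2ss would. */
float narrow_direct(double v)
{
    return (float)v;
}

/* Return the raw bit pattern of a float so results can be compared exactly. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Count the inputs for which the two narrowing paths give different bits. */
size_t count_mismatches(const double *v, size_t n)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++)
        if (float_bits(narrow_via_stack(v[i])) != float_bits(narrow_direct(v[i])))
            bad++;
    return bad;
}
```

Calling count_mismatches() on any array of doubles should return 0 on a target that does float arithmetic in SSE registers, which is the case for x86_64.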

Also, do we need to save the argument like this:

+	movss	%xmm0, -8(%rsp)		/* Save argument in current frame */

I think you can simply remove it and do:

	/* Here if 2^(-28)<=|x|<125*log(2) */
	cvtss2sd	%xmm0, %xmm3	/* Load x converted to double precision */

-- 
H.J.

