This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: Use-new-strlen-implementation-in-rtld

From: Ondřej Bílka <neleai at seznam dot cz>
To: Richard Henderson <rth at twiddle dot net>
Cc: libc-alpha at sourceware dot org, gcc at gcc dot gnu dot org
Date: Fri, 1 Feb 2013 19:35:24 +0100
Subject: Re: Use-new-strlen-implementation-in-rtld
References: <20130131123714.GA29130@domone.kolej.mff.cuni.cz><510BF2E0.9070001@twiddle.net>

Crossposting to gcc.

On Fri, Feb 01, 2013 at 08:52:48AM -0800, Richard Henderson wrote:
> On 01/31/2013 04:37 AM, Ondřej Bílka wrote:
> >To also use my implementation of strlen in runtime linker
> >use following patch.
> >
> >It uses fact that xmm are call clobbered and only xmm0-xmm7 can be
> >used to pass arguments so xmm8-xmm15 are available.
> 
> FYI, on the gcc list, in the context of Cilk+, Intel have been talking
> about a new calling convention for "vector" functions that would in
> fact use all 16 sse registers for argument passing.
> 
> So, please no.
> 
And did they provide any example where it would lead to simpler code and
improve performance?

It would help only when a function passes nine or more float/vector
arguments, and functions that need this are not performance critical.

A calling convention that would help would keep passing arguments in
xmm0-7 but make xmm2-7 caller save. This could be specified by a fastcall
attribute.

This would help quite often: there is an optimization rule not to call
any function while doing vector float calculations, because pushing and
popping the registers on the stack easily increases the cost of a call
by 20 cycles.
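
To make the spilling concrete, here is a minimal sketch, not taken from
any real code. It assumes GCC or Clang (it uses their vector_size and
noinline extensions), and note_progress, calls_made, sum_with_calls and
sum_without_calls are hypothetical names invented for illustration. With
every xmm register call clobbered, the accumulator in the first loop
typically has to be stored before the call and reloaded after it, unless
the compiler can prove the callee leaves it alone. In the second loop it
can stay in a register.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GCC/Clang vector extension: four packed floats (one xmm register on x86-64). */
typedef float v4sf __attribute__((vector_size(16)));

/* Counts calls so the helper has a visible side effect. */
unsigned long calls_made;

/* Hypothetical stand-in for "any function"; noinline keeps the call a real call. */
__attribute__((noinline)) void note_progress(size_t i)
{
    (void)i;
    calls_made++;
}

/* Sums n floats (n must be a multiple of 4), calling a function each iteration. */
float sum_with_calls(const float *x, size_t n)
{
    v4sf acc = {0.0f, 0.0f, 0.0f, 0.0f};
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        v4sf v;
        memcpy(&v, x + i, sizeof v);   /* unaligned-safe load of four floats */
        acc += v;
        note_progress(i);              /* acc is live across this call */
    }
    return acc[0] + acc[1] + acc[2] + acc[3];
}

/* Same sum with no call in the loop, so acc never needs to leave its register. */
float sum_without_calls(const float *x, size_t n)
{
    v4sf acc = {0.0f, 0.0f, 0.0f, 0.0f};
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        v4sf v;
        memcpy(&v, x + i, sizeof v);
        acc += v;
    }
    return acc[0] + acc[1] + acc[2] + acc[3];
}
```

Compiling both functions with -O2 -S and comparing the loop bodies should
show whether your compiler emits the extra store and reload around the
call.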

> 
> r~
