This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Use-new-strlen-implementation-in-rtld
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Richard Henderson <rth at twiddle dot net>
- Cc: libc-alpha at sourceware dot org, gcc at gcc dot gnu dot org
- Date: Fri, 1 Feb 2013 19:35:24 +0100
- Subject: Re: Use-new-strlen-implementation-in-rtld
- References: <20130131123714.GA29130@domone.kolej.mff.cuni.cz><510BF2E0.9070001@twiddle.net>
Crossposting to gcc.
On Fri, Feb 01, 2013 at 08:52:48AM -0800, Richard Henderson wrote:
> On 01/31/2013 04:37 AM, Ondřej Bílka wrote:
> >To also use my implementation of strlen in runtime linker
> >use following patch.
> >
> >It uses fact that xmm are call clobbered and only xmm0-xmm7 can be
> >used to pass arguments so xmm8-xmm15 are available.
>
> FYI, on the gcc list, in the context of Cilk+, Intel have been talking
> about a new calling convention for "vector" functions that would in
> fact use all 16 sse registers for argument passing.
>
> So, please no.
>
And did they provide any example where it would lead to simpler code and
improve performance?
It would benefit only functions that pass nine or more floats/vectors,
and functions that need this are not performance critical.
A calling convention that would help would keep arguments passed in
xmm0-7 but make xmm2-7 caller-saved. This could be specified by a
fastcall attribute.
This would help quite often: there is an optimization rule not to call
any function while doing vector/float calculations, because
pushing/popping them on the stack easily increases the cost of a call
by 20 cycles.
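
To illustrate the cost described above, here is a minimal sketch in C. The
names scale and sum_scaled are hypothetical and not part of any patch; the
noinline attribute (a GCC/Clang extension) is only there to keep the call
visible. Under the x86-64 System V ABI all xmm registers are call-clobbered,
so values held in them across the call generally have to be spilled to the
stack and reloaded, unless the compiler can prove the callee leaves them alone.

```c
#include <stddef.h>

/* Hypothetical helper, kept out of line so the loop below really
   contains a function call. */
__attribute__((noinline))
float scale(float x)
{
    return x * 0.5f;
}

/* acc and bias are live across every call to scale(). Because the
   x86-64 System V ABI treats xmm0-xmm15 as call-clobbered, a compiler
   typically keeps them in stack slots around the call instead of in
   registers. */
float sum_scaled(const float *v, size_t n, float bias)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += scale(v[i]) + bias;
    return acc;
}
```

Compiling this with optimization and inspecting the assembly (for example
with gcc -O2 -S) should show whether loads and stores of acc and bias appear
around the call; the exact output depends on the compiler and its version.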
>
> r~