This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH] Split mantissa calculation loop and add branchprediction to mp multiplication

From: Siddhesh Poyarekar <siddhesh at redhat dot com>
To: munroesj at us dot ibm dot com
Cc: libc-alpha at sourceware dot org
Date: Thu, 3 Jan 2013 22:14:56 +0530
Subject: Re: [PATCH] Split mantissa calculation loop and add branchprediction to mp multiplication
References: <20121231092850.GA21621@spoyarek.pnq.redhat.com><1357158013.19573.64.camel@spokane1.rchland.ibm.com><20130103033814.GA5345@spoyarek.pnq.redhat.com><1357229888.19573.84.camel@spokane1.rchland.ibm.com>

On Thu, Jan 03, 2013 at 10:18:08AM -0600, Steven Munroe wrote:
> This is very bad for POWER. PowerPC has (multiple) independent fixed
> point and floating point pipelines. This allows super-scalar out-of-order
> execution, UNTIL you force a transfer (through memory) between the
> FPRs/GPRs. PowerPC has lots of registers (32+32+32), we expect the
> compiler to keep lots of data in the registers, and so we don't optimize
> the hardware for dependent load after store, we optimize for memory
> bandwidth.
> 
> Your proposed code forces an (unnecessary) double->long conversion and
> FPR to GPR transfer into the inner loop, disabling any super-scalar
> parallel execution. It also prevents loop unrolling and does not allow
> GCC to make good use of all those registers we provide in the
> architecture.
> 
> So your code is optimized for (register poor, in-order-execution) X86 at
> the expense of PowerPC.
> 

I'm confused: which patch are you talking about? The current loop
split patch, the conversion of the mantissa to int, or some other
patch?  I'll summarize the patches that are currently under review:

1) Conversion of mantissa of mp_no to int.  This provides scope to
   convert all mp operations to scalar.  There are no conversions from
   double to long or backwards except when constructing an mp_no or
   deconstructing it to double.  This patch is now stale and I need to
   work on a new revision, especially in light of the custom powerpc
   code.

2) Fix build failure on power4 or later.  This is just consolidation
   of the declaration of globals and constant values.  This should
   have no impact on pipelining performance.

3) Splitting the multiplication loop (the current patch which you've
   commented on).  It does not affect powerpc code at all since
   powerpc has a custom implementation of this loop.
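
To make the idea in (3) concrete, here is a rough sketch of what
splitting a multiplication loop can look like.  It is not the patch
itself: the function names, the integer digit type, the radix of 2^24
and the MAX_P bound are all assumptions made for illustration.  Digits
are stored most significant first in x[1..p] and y[1..p], and the
product lands in z[1..2p].

```c
#include <assert.h>
#include <stdint.h>

#define RADIX ((int64_t) 1 << 24)  /* assumed digit radix */
#define MAX_P 32                   /* assumed upper bound on digit count */

/* Single loop over all columns: the inner loop bounds are picked by a
   branch on every iteration of the outer loop.  */
static void
mul_branchy (const int64_t *x, const int64_t *y, int64_t *z, int p)
{
  assert (p >= 1 && p <= MAX_P);
  int64_t zk = 0;               /* running column sum plus carry */
  for (int k = 2 * p; k > 1; k--)
    {
      int i1, i2;
      if (k > p)
        { i1 = k - p; i2 = p + 1; }
      else
        { i1 = 1; i2 = k; }
      for (int i = i1, j = k - i1; i < i2; i++, j--)
        zk += x[i] * y[j];
      z[k] = zk % RADIX;
      zk /= RADIX;
    }
  z[1] = zk;
}

/* Same computation with the outer loop split in two, so each half has
   fixed inner loop bounds and no branch inside the loop body.  */
static void
mul_split (const int64_t *x, const int64_t *y, int64_t *z, int p)
{
  assert (p >= 1 && p <= MAX_P);
  int64_t zk = 0;
  int k = 2 * p;

  /* Columns where i runs from k - p up to p.  */
  for (; k > p; k--)
    {
      for (int i = k - p, j = p; i <= p; i++, j--)
        zk += x[i] * y[j];
      z[k] = zk % RADIX;
      zk /= RADIX;
    }

  /* Columns where i runs from 1 up to k - 1.  */
  for (; k > 1; k--)
    {
      for (int i = 1, j = k - 1; i < k; i++, j--)
        zk += x[i] * y[j];
      z[k] = zk % RADIX;
      zk /= RADIX;
    }
  z[1] = zk;
}
```

Both functions should produce identical digits for the same inputs, so
calling them on the same x and y and comparing z[1..2p] is a cheap
sanity check.  With 24-bit digits and p no larger than MAX_P, the
column sums stay well inside int64_t.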

Siddhesh
