This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Fixes tree-loop-distribute-patterns issues


On Thu, Jun 20, 2013 at 01:59:19PM -0700, Roland McGrath wrote:
> > Actually you should split simple_* to separate files and compile them with
> > O0.
> 
> __attribute__ ((optimize ("O0"))) is sufficient in compilers that support
> it (4.6, I think) and less hassle than breaking up files.  I don't think
> anyone does or should care about performance analysis using compilers that
> are so old as not to have that.
> 
> > Doing otherwise makes their performance dependent on gcc version and
> > this makes results even more unreliable.
> 
> Perhaps that matters for benchtests, if they are intended to use the
> simple_* implementations' performance as a baseline for comparison.  The
> correctness tests (i.e. all tests outside benchtests/) do not care about
> that, and that's all I'm personally concerned with.
> 
> If what you want as a performance baseline is "the obvious loop handling a
> byte at a time", then -O0 code can easily be substantially worse than this
> and give a misleading impression of what naive code would actually do.
> With -O0, the compiler is exceedingly stupid (by design), and usually every
> operation has excess spill and reload operations, which could easily
> dominate the performance of what would otherwise be a very tight loop.
> Short of hand-coding naive assembly for each machine, I'm not sure how you
> can robustly address that issue.  Perhaps -O1 is a good fit for what
> assembly a human would write when not trying to be especially clever;
> but that's just a shot in the dark.
>
I chose -O0 as the lesser evil: the alternative is a reference
implementation that can be twice as fast depending on which compiler you
use.
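
For illustration, the per-function attribute Roland mentions could be
applied to a byte-at-a-time reference roughly as follows. This is only a
sketch: SIMPLE_NO_OPT and simple_memset_matches are hypothetical names
made up for this example, not taken from the glibc tree, and the
attribute is enabled only for GCC itself, since Clang does not implement
it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper macro (not from glibc).  GCC accepts the
   optimize attribute per function; other compilers get no attribute.  */
#if defined (__GNUC__) && !defined (__clang__)
# define SIMPLE_NO_OPT __attribute__ ((optimize ("O0")))
#else
# define SIMPLE_NO_OPT
#endif

/* Byte-at-a-time reference.  Under GCC this one function is compiled
   as if with -O0, whatever flags the rest of the file uses.  */
SIMPLE_NO_OPT
void *
simple_memset (void *s, int c, size_t n)
{
  unsigned char *p = s;
  size_t i;

  for (i = 0; i < n; i++)
    p[i] = (unsigned char) c;
  return s;
}

/* Hypothetical check: return 1 if simple_memset and the libc memset
   leave identical buffers for the given fill byte and length.  */
int
simple_memset_matches (int c, size_t n)
{
  unsigned char a[64];
  unsigned char b[64];

  assert (n <= sizeof a);
  memset (a, 0, sizeof a);
  memset (b, 0, sizeof b);
  simple_memset (a, c, n);
  memset (b, c, n);
  return memcmp (a, b, sizeof a) == 0;
}
```

This keeps the reference in the same file as the test, but it still
inherits the -O0 costs (spills and reloads) described in the quoted
text.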

One solution is to mandate that benchmarks be run with a fixed version of
gcc and fixed flags.
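
As a rough sketch of how a benchmark could at least detect and record a
compiler mismatch, something like the following might do. The pinned
version numbers and the bench_* names are placeholders invented for this
example; nothing in this thread fixes a particular version. It covers
only the compiler version, not the flags.

```c
#include <assert.h>
#include <stdio.h>

/* Placeholder pinned version, for illustration only.  */
#define BENCH_GCC_MAJOR 4
#define BENCH_GCC_MINOR 8

/* Return 1 if (major, minor) is the pinned compiler version, else 0.  */
int
bench_compiler_is_pinned (int major, int minor)
{
  return major == BENCH_GCC_MAJOR && minor == BENCH_GCC_MINOR;
}

/* Write a short description of the compiler building this file into
   buf and return the snprintf result.  Note that compilers other than
   GCC may also define __GNUC__.  */
int
bench_describe_compiler (char *buf, size_t len)
{
  assert (buf != NULL && len > 0);
#ifdef __GNUC__
  return snprintf (buf, len, "gcc %d.%d", __GNUC__, __GNUC_MINOR__);
#else
  return snprintf (buf, len, "unknown");
#endif
}
```

A benchmark driver could call bench_compiler_is_pinned (__GNUC__,
__GNUC_MINOR__) at startup and print the description next to its
results, so that numbers from different compilers are not compared by
accident.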

A second option would be to keep the generated assembly files together
with a regeneration script that is run with a specific gcc.

