This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Fixes tree-loop-distribute-patterns issues


On Fri, Jun 21, 2013 at 01:06:07AM +0000, Joseph S. Myers wrote:
> On Fri, 21 Jun 2013, Ondrej Bilka wrote:
> 
> > > I expect -O0 performance to depend a lot more on GCC version than -O2.
> > > 
> > You expect but could you prove it? Please provide two versions of gcc
> > where you get different simple-* function when compiling with -O0 -S
> > 
> > Versions I checked are
> >  Debian 4.5.3-12
> >  Debian 4.7.1-2
> > gcc version 4.9.0 20130516 (experimental) (GCC)
> > 
> > Assemblies produced are same for following fragment:
> > 
> > void 
> > *memset (char *s, int c, int n)
> > {
> >   int i;
> >   for(i=0 ;i<n; i++) s[i] = c;
> >   return s;
> > }
> 
> I tried 4.3 and 4.4 based compilers building for i586 and got differences:
> 
> <       movl    %eax, %edx
> <       addl    8(%ebp), %edx
> <       movl    12(%ebp), %eax
> <       movb    %al, (%edx)
> ---
> >       addl    8(%ebp), %eax
> >       movl    12(%ebp), %edx
> >       movb    %dl, (%eax)
> 
> The general principle is simple enough: -O0 code is more likely to depend 
> on the fine details of the implementation, because differences in the 
> internal representation of no semantic significance can easily result in 
> changes to the generated code when a dumb conversion from IR to assembly 
> is in operation, whereas with -O2 such non-semantic differences are likely 
> to be optimized away.  And for such simple functions there's only a 
> limited amount an optimizer can do so different compiler versions are 
> likely to differ only in insubstantial matters of instruction ordering and 
> register allocation.
> 
Are you sure? Lower optimization levels keep the structure of the program
mostly intact, so a single change is unlikely to have a big impact on
performance. If so, the combined effect of such changes is likely to be
just noise.

At -O2 many more optimizations are enabled, so there is more room for change.
That it is a simple function is an argument against -O2, as a single
optimization can have a big impact.
Off the top of my head, unrolling will make a big difference. If someone
writes a sane heuristic that lets the unroller be enabled at -O2, the swing
will be big. Also, I heard that there are plans to enable the vectorizer at
-O2 for obvious cases, which would also have a big effect.
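
To make the unrolling point concrete, here is a rough sketch of the kind of
transformation I mean. It is written by hand for illustration and is not
output from any GCC version; the names simple_memset and unrolled_memset and
the unroll factor of 4 are assumptions chosen only for this example.

```c
#include <stddef.h>

/* Byte-at-a-time loop, the same shape as the fragment quoted above. */
void *simple_memset(void *s, int c, size_t n)
{
  unsigned char *p = s;
  size_t i;
  for (i = 0; i < n; i++)
    p[i] = (unsigned char) c;
  return s;
}

/* Hand-unrolled by 4 (hypothetical factor): four stores per loop test. */
void *unrolled_memset(void *s, int c, size_t n)
{
  unsigned char *p = s;
  unsigned char b = (unsigned char) c;
  size_t i = 0;
  /* Main loop handles full groups of 4 bytes. */
  for (; i + 4 <= n; i += 4)
    {
      p[i] = b;
      p[i + 1] = b;
      p[i + 2] = b;
      p[i + 3] = b;
    }
  /* Tail loop handles the remaining 0..3 bytes. */
  for (; i < n; i++)
    p[i] = b;
  return s;
}
```

Both functions store the same bytes; the second just does fewer loop tests
per byte. Whether a compiler makes this kind of change on its own is a single
decision in the optimizer, which is why I expect it to swing results for such
a small function.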

> Any sort of performance measurement involving -O0 is extremely suspect, 
> simply because performance is essentially not a consideration at all for 
> -O0 code generation; other matters such as speed of the compiler itself 
> and debuggability are the considerations involved, and are the things 
> people may try to avoid regressing across compiler upgrades.
> 
Here we need it mainly as a reference; in this case that matters more than
actual performance.

