This is the mail archive of the newlib@sources.redhat.com mailing list for the newlib project.
Re: compiling new-lib
Jens-Christian Lache wrote:
>
> On Tue, 28 Nov 2000 you wrote:
> > Jens-Christian Lache wrote:
> > >
> > > Does anybody know if the files from new-lib get compiled
> > > with optimization -O2 turned on by default?
> > >
> > > If I do a double x double multiplication (64x64) on an ARM7TDMI processor,
> > > I get times of 1160 clock ticks per multiplication.
> > >
> > > Jens-Christian
> > >
> > > --
> > >
> > > Jens-Christian Lache
> > > Technische Universitaet Hamburg-Harburg
> > > www.tu-harburg.de/~sejl1601
> > > Mail:
> > > lache@tu-harburg.de
> > > lache@ngi.de
> > > Tel.:
> > > +0491759610756
> >
> > The default target flags are "-g -O2". This does not affect multiplication, however.
> > Code for the multiplication of doubles is part of the compiler.
> >
> > What options did you specify when compiling? If your chip supports floating point, you
> > want to specify the -mhard-float option when you compile and link. Linking via gcc allows
> > you to specify this option so that the correct newlib library is linked in.
> >
> > You should also note that the default newlib math library uses integer math. If
> > you want newlib to use floating-point algorithms, then configure with --enable-newlib-hw-fp.
> >
> > -- Jeff Johnston
> --
>
> Hello! I don't have a floating-point unit. What I have is a 32x8 multiplier
> using the Booth algorithm.
>
> I have benchmarked 32x32 multiplication before. The results were:
> 1.) no loop
> (two Timer-Ticks correspond to one nop)
> nb of mult    Timer-Ticks
>          0              4   (Measure overhead)
>          1             24
>          3             47
>          6             87
>         16            242
>         20            304
>        256           3965
>        512           7929
>       1024          15868
> each mult consists of
>     ldr r1, [fp, #-448]
>     ldr r3, [fp, #-88]
>     mla r9, r3, r1, r9
> => 15.5 Timer-Ticks per multiply-accumulate (sum += c[i]*samples[i])
> 2.) loop
> If I put it in a loop, the assembly code is much longer:
> .L15:
> mov r3, r5, asl #2
> ldr r1, [ip, r3]
> add r5, r5, #1
> ldr r2, [r0, r3]
> cmp r5, #79
> mla r7, r2, r1, r7
> ble .L15
> nb of mult    Timer-Ticks
>          0              4
>          1             26
>          8            204
>         80           2040
>        300           7648
>        600          15297
>        800          20397
>       1024          26109
> => 25.5 Timer-Ticks per loop iteration
>
> These results look quite all right to me. If I use double instead of int
> variables inside my loop, the results look like this:
> nb of mult    Timer-Ticks
>          0             26
>          1           2319
>          2           5034
>          4          10413
>          8          21048
>         16          41986
>         32          83948   (with overhead bit set in TC_SR)
>
> These values look very strange to me. Shouldn't a 64x64 multiplication
> cost about as much as four 32x32 multiplications? Or is this normal?
>
No, the compiler uses IEEE floating point format for double. IEEE floating point
supports a wide range of numbers including Infinity and NaNs (Not a Number).
Performing multiplication of two IEEE floating point numbers is not a trivial
operation and on a platform that does not have floating-point hardware instructions
it becomes a relatively expensive operation to simulate.
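
To illustrate why this costs far more than four integer multiplies: a software
routine (libgcc names the double multiply __muldf3) has to unpack both operands,
handle zeros, subnormals, Infinity and NaN, add the exponents, multiply the two
53-bit significands into a 106-bit product, then normalize, round and repack.
The following is only a minimal sketch of the first step, the unpacking. It
assumes that double is IEEE 754 binary64 and has the same byte order as a 64-bit
integer on your target; unpack_double and double_fields are hypothetical names,
not part of gcc or newlib.

```c
#include <stdint.h>
#include <string.h>

/* Fields of an IEEE 754 binary64 value (assumption: double is binary64). */
typedef struct {
    unsigned sign;      /* 1 bit: 0 = positive, 1 = negative */
    int      exponent;  /* 11 bits, stored with a bias of 1023 */
    uint64_t fraction;  /* 52 stored bits; normal numbers have an implicit leading 1 */
} double_fields;

/* Hypothetical helper for illustration only. */
static double_fields unpack_double(double x)
{
    uint64_t bits;
    double_fields f;

    memcpy(&bits, &x, sizeof bits);                 /* read the raw bit pattern */
    f.sign     = (unsigned)(bits >> 63);            /* top bit */
    f.exponent = (int)((bits >> 52) & 0x7FF);       /* next 11 bits */
    f.fraction = bits & ((UINT64_C(1) << 52) - 1);  /* low 52 bits */
    return f;
}
```

Every one of the remaining steps is built from 32-bit integer instructions on a
chip without floating-point hardware, which is where the extra time goes.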
If performance is crucial to your application then you will have to switch to
fixed-point math. Fixed-point math represents fractional values using
integers and uses integer operations. A multiply of two fixed-point values
uses an integer multiply and a shift. You should be able to find ample
information on the net regarding fixed-point math. The following is just one
of many tutorials on the subject.
http://www.wwnet.net/~stevelim/fixed.html
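
As a rough sketch of the multiply-and-shift idea, here is one possible
fixed-point layout, Q16.16 (16 integer bits, 16 fraction bits). The format and
all names (q16_t, q16_mul, q16_dot, ...) are assumptions made for illustration;
pick the number of fraction bits to suit the range of your data. Overflow is not
checked, and the code relies on >> of a negative value being an arithmetic
shift, which gcc provides but the C standard leaves implementation-defined.

```c
#include <stdint.h>

typedef int32_t q16_t;                 /* Q16.16 fixed-point value */
#define Q16_FRAC_BITS 16
#define Q16_ONE ((q16_t)1 << Q16_FRAC_BITS)

/* Conversions use double, so keep them out of the inner loop
   (e.g. use them only to build coefficient tables). */
static q16_t q16_from_double(double x) { return (q16_t)(x * Q16_ONE); }
static double q16_to_double(q16_t x)   { return (double)x / Q16_ONE; }

/* Multiply: one 32x32 -> 64-bit integer multiply, then a shift. */
static q16_t q16_mul(q16_t a, q16_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;   /* product has 32 fraction bits */
    return (q16_t)(wide >> Q16_FRAC_BITS);    /* back to 16 fraction bits */
}

/* Fixed-point version of a loop like sum += c[i]*samples[i].
   Accumulate at full width and shift once at the end. */
static q16_t q16_dot(const q16_t *c, const q16_t *samples, int n)
{
    int64_t acc = 0;
    int i;

    for (i = 0; i < n; i++)
        acc += (int64_t)c[i] * (int64_t)samples[i];
    return (q16_t)(acc >> Q16_FRAC_BITS);
}
```

For example, q16_mul(q16_from_double(1.5), q16_from_double(2.0)) yields the
Q16.16 representation of 3.0. Shifting once after the accumulation, rather than
after every product, keeps the inner loop close to the integer
multiply-accumulate loop measured above.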
If you have any other performance questions regarding the compiled code,
then I suggest you post your questions to the gcc project mailing list.
-- Jeff J.