compiling new-lib
Jens-Christian Lache
lache@tu-harburg.de
Tue Nov 28 01:58:00 GMT 2000
On Tue, 28 Nov 2000 you wrote:
> Jens-Christian Lache wrote:
> >
> > Does anybody know if the files from new-lib get compiled
> > with optimization -O2 turned on by default?
> >
> > If I do a double x double multiplication (64x64) on an ARM7TDMI processor,
> > I get times of 1160 clock ticks per multiplication.
> >
> > Jens-Christian
> >
> > --
> >
> > Jens-Christian Lache
> > Technische Universitaet Hamburg-Harburg
> > www.tu-harburg.de/~sejl1601
> > Mail:
> > lache@tu-harburg.de
> > lache@ngi.de
> > Tel.:
> > +0491759610756
>
> The default target flags are "-g -O2". This does not affect multiplication, however.
> Code for the multiplication of doubles is part of the compiler.
>
> What options did you specify when compiling? If your chip supports floating point, you
> want to specify the -mhard-float option when you compile and link. Linking via gcc allows
> you to specify this option so that the correct newlib library is linked in.
>
> You should also note that the default newlib math library uses integer math. If
> you want newlib to use floating-point algorithms, then configure with --enable-newlib-hw-fp.
>
> -- Jeff Johnston
--
Hello! I don't have a floating-point unit. What I have is a 32x8 multiplier
using the Booth algorithm.
I have benchmarked 32x32 multiplication before. The results were:
1.) no loop
(two Timer-Ticks correspond to one nop)
nb of mult    Timer-Ticks
0             4        (measurement overhead)
1             24
3             47
6             87
16            242
20            304
256           3965
512           7929
1024          15868
each mult consists of
ldr r1, [fp, #-448]
ldr r3, [fp, #-88]
mla r9, r3, r1, r9
=> 15.5 Timer-Ticks/ (sum+=c[i]*samples[i])
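The per-multiplication figure above can be reproduced from the table by subtracting the fixed measurement overhead (the row with 0 multiplications) and dividing by the number of multiplications. A minimal sketch in C; the function name ticks_per_op is only illustrative and is not part of the original benchmark:

```c
#include <assert.h>

/* Average cost of one operation in timer ticks, after removing the
 * fixed measurement overhead (the "0 multiplications" row).
 * Hypothetical helper, not taken from the original benchmark code. */
static double ticks_per_op(unsigned long total_ticks,
                           unsigned long overhead_ticks,
                           unsigned long n_ops)
{
    assert(n_ops > 0 && total_ticks >= overhead_ticks);
    return (double)(total_ticks - overhead_ticks) / (double)n_ops;
}
```

With the last row of the table, ticks_per_op(15868, 4, 1024) gives roughly 15.5 Timer-Ticks per multiply-accumulate.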
2.) loop
If I put it in a loop, the assembly code is much larger:
.L15:
mov r3, r5, asl #2
ldr r1, [ip, r3]
add r5, r5, #1
ldr r2, [r0, r3]
cmp r5, #79
mla r7, r2, r1, r7
ble .L15
nb of mult    Timer-Ticks
0             4
1             26
8             204
80            2040
300           7648
600           15297
800           20397
1024          26109
=> 25.5 Timer-Ticks/loop
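For reference, a loop of roughly the following shape would compile to assembly like the .L15 block above (the cmp r5, #79 / ble pair corresponds to 80 iterations). This is only a sketch: the array names c and samples come from the expression quoted earlier, while the function name mac_loop and its signature are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply-accumulate over n coefficient/sample pairs.
 * On ARM each iteration maps to two loads, one mla, and the
 * index increment, compare, and branch of the loop itself.
 * mac_loop is a hypothetical name, not the original source. */
static int mac_loop(const int *c, const int *samples, size_t n)
{
    int sum = 0;
    assert(c != NULL && samples != NULL);
    for (size_t i = 0; i < n; i++) {
        sum += c[i] * samples[i];
    }
    return sum;
}
```

The extra index, compare, and branch instructions per iteration are consistent with the cost rising from about 15.5 to about 25.5 Timer-Ticks.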
These results look quite all right to me. If I use double instead of int
variables inside my loop, the results look like this:
nb of mult    Timer-Ticks
0             26
1             2319
2             5034
4             10413
8             21048
16            41986
32            83948    (with overflow bit set in TC_SR)
These values look very strange to me. Shouldn't a 64x64 multiplication
cost about as much as four 32x32 multiplications? Or is this normal?
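The "four 32x32 multiplications" estimate can be sketched in C as below for plain 64-bit integers. Note the assumptions: this is integer arithmetic only, and the name mul64x64 is made up for illustration. A double multiplication without an FPU is done by a software routine (per the reply quoted above, that code is part of the compiler), which also has to unpack sign, exponent, and the 53-bit mantissa, then normalize, round, and repack the result, so it can reasonably cost more than four integer multiplies.

```c
#include <assert.h>
#include <stdint.h>

/* 64x64 -> 128-bit unsigned multiply built from four
 * 32x32 -> 64-bit partial products (schoolbook method).
 * Hypothetical helper for illustration only. */
static void mul64x64(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    assert(hi != 0 && lo != 0);

    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes to bits 0..63   */
    uint64_t p1 = a_lo * b_hi;   /* contributes to bits 32..95  */
    uint64_t p2 = a_hi * b_lo;   /* contributes to bits 32..95  */
    uint64_t p3 = a_hi * b_hi;   /* contributes to bits 64..127 */

    /* Sum the middle column; the total stays below 2^34, so no overflow. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    *lo = (mid << 32) | (uint32_t)p0;
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}
```

Besides the four multiplies, the partial products still have to be shifted and added with carries, which adds further instructions on a 32-bit core.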
Jens-Christian