This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.



Re: Proposal for a Guile binary file format


Marius Vollmer <mvo@zagadka.ping.de> writes:

> Mikael Djurfeldt <mdj@thalamus.nada.kth.se> writes:
> 
> > It is possible to introduce 4-word cells so that doubles can be
> > stored on the heap instead.  I expect such a change to bring Guile
> > floating point arithmetic up to approximately the same speed as its
> > integer arithmetic.
> 
> Maybe it would be easier to just have a quicklist for doubles.  [ I
> don't know if `quicklist' is the correct term or if that term is
> established.  Errm.  It's just the usual linked stack of memory
> blocks that are all the same size and where you allocate and free by
> popping and pushing the stack. ]

Sure.  That would speed up things and would be easier to implement.
But, 4-word cells would give many new opportunities in the type
system, and would really be a big advantage in more ways than this.

Besides, I think 4-word cells have been implemented already, both by
Michael Livshin and Greg Harvey.
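
Just to make the quicklist idea concrete: in C it is essentially the
following (a toy sketch with invented names and chunk size, not
Guile's actual allocator).

#include <stdlib.h>

/* One fixed-size block; free blocks are chained through `next'. */
union dbl_block {
  double value;
  union dbl_block *next;
};

static union dbl_block *free_list = NULL;

/* Allocate by popping the free list, refilling in chunks when empty. */
static union dbl_block *
dbl_alloc (void)
{
  if (free_list == NULL)
    {
      enum { CHUNK = 1024 };
      union dbl_block *chunk = malloc (CHUNK * sizeof *chunk);
      int i;
      if (chunk == NULL)
        return NULL;
      /* Chain the fresh blocks together. */
      for (i = 0; i < CHUNK - 1; i++)
        chunk[i].next = &chunk[i + 1];
      chunk[CHUNK - 1].next = NULL;
      free_list = chunk;
    }
  {
    union dbl_block *b = free_list;
    free_list = b->next;
    return b;
  }
}

/* Free by pushing back onto the list: a single pointer operation. */
static void
dbl_free (union dbl_block *b)
{
  b->next = free_list;
  free_list = b;
}

Both allocation and deallocation are then a few instructions, instead
of a full malloc/free per boxed double.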

> > Here's a proposal for a Guile binary format which would speed up
> > that part.
> > 
> > It can (I hope) be used at two levels:
> > 
> > 1. Storing the internal representation for Guile source.
> > 2. Storing memoized source.
> 
> This reminds me of the freezer.

Yes, I should have made a reference to the freezer.  I thought that
maybe a binary format would be more portable (not in the sense of
moving files between machines, but in the sense that it could be
compiled and used on more architectures, even those which lack
dynamic linking support).

It would also be more independent of the evaluator, at least for level
1, and it is actually quite easy to implement.  The binary compiler is
not much more than a copy-tree using alternative versions of cons and
constructors for those malloc objects which have a read syntax.  The
loader is little more than a bunch of for loops.
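
To illustrate, the core of such a compiler could look like this toy
sketch, which assumes a simplified two-word cell and a low tag bit
marking immediates (Guile's real representation is more involved, and
sharing and cycles are ignored here):

#define MAX_CELLS 4096
#define IMMEDIATE_P(x) ((x) & 1)   /* assumed tag bit for immediates */

typedef struct cell { long car, cdr; } cell;

static cell image[MAX_CELLS];      /* the flat output image */
static long n_cells = 0;

/* Copy a tree of cells into the image, turning pointers into
   indices, so that the image can be written to disk and later
   loaded at any address. */
static long
dump (cell *c)
{
  long ix = n_cells++;
  image[ix].car = IMMEDIATE_P (c->car) ? c->car : dump ((cell *) c->car);
  image[ix].cdr = IMMEDIATE_P (c->cdr) ? c->cdr : dump ((cell *) c->cdr);
  return ix;
}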

[...]

> If I remember right, the freezer only showed a significant speed up
> when I went to the trouble of freezing only memoized source code.  For
> that, the freezer contained its own very special version of the
> evaluator that didn't really evaluate, it only memoized.  Reading and
> consing the source into the heap is probably not very time-consuming,
> but memoizing might be.
> 
> Given the legendary tightness of the evaluator, it might be too
> involved to keep the memoizer in sync.  Maybe the evaluator can be
> extended to be a memoizer, optionally.

I think the right way to go is to introduce a new abstraction between
the evaluator and the low-level macros.  There are so many cases where
you need to do low-level macro expansion and memoization in a
controlled way---for example when compiling methods in GOOPS.

Also, remember the current nasty dependence between any change in
low-level macros and scm_unmemocopy (the unmemoizer).

A low-level macro interface could contain entries both for
expansion/memoization and unmemoization, and some means of traversing
component expressions.

This interface would be used by the evaluator, the procedure-source
primitive, the debugging facilities, the GOOPS method compiler, and
the freezer.
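
Concretely, I imagine a record of operations per low-level macro,
something like the following sketch (the field names are invented,
not a worked-out interface):

typedef long SCM;  /* stand-in for Guile's real SCM object type */

/* Per-macro operations record; all names here are for illustration. */
typedef struct scm_macro_ops
{
  /* Expand/memoize FORM in environment ENV. */
  SCM (*memoize) (SCM form, SCM env);

  /* Reconstruct source from a memoized form, as scm_unmemocopy
     does today for the whole evaluator. */
  SCM (*unmemoize) (SCM mform, SCM env);

  /* Apply VISIT to each component expression of a memoized form,
     so that tools can traverse it without knowing its layout. */
  void (*for_each_subform) (SCM mform, void (*visit) (SCM));
} scm_macro_ops;

With that, all of the clients above would go through the same three
entry points instead of each knowing the memoized representation.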

> Another, longer road would be to have some kind of bytecode format and
> compile to that.  I know that Aubrey says his tree code evaluator is
> actually faster than a bytecode one, and I think he might just be
> right.  On the other hand, it might be that he is only saying that
> treecodes are faster when you profile both the compilation *and*
> execution phases.  I find it hard to believe that executing bytecodes
> must be slower than executing treecodes.  (Maybe we shouldn't really
> be thinking "byte"codes here, but rather linearly arranged wordcodes
> or something.)  This would also allow us to take the `constness
> requirement' into account.

MzScheme is probably a good existence proof that Aubrey is wrong:
it is faster than scm.  (But it is the only one!)
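
Regarding wordcodes: the inner loop of a wordcode interpreter is
essentially just a switch over a linear array of words, as in this
toy sketch (instruction set invented for illustration):

/* Toy wordcode interpreter: opcode, optionally followed by inline
   operands, in a flat array of words. */
enum { OP_PUSH, OP_ADD, OP_HALT };

static long
run (const long *code, long *stack)
{
  long *sp = stack;
  for (;;)
    switch (*code++)
      {
      case OP_PUSH:
        *sp++ = *code++;          /* operand follows the opcode */
        break;
      case OP_ADD:
        sp[-2] += sp[-1], sp--;   /* fold the two topmost slots */
        break;
      case OP_HALT:
        return sp[-1];            /* result is on top of the stack */
      }
}

With the program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT } this
returns 5; dispatch is one jump per instruction instead of the
tag-checking and cdr-chasing of a treecode evaluator.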

> I would pay the price of slower loading of uncompiled Scheme source
> when we would instead have a good compiler that produces binary code
> that can be loaded fast and executes fast.

Me too.  This is surely the best way.  It's just that I wouldn't like
to wait for another three years before scwm starts quickly on my
computer.

> I could imagine that the number of patches would be quite large, on
> the order of the number of cons cells in the heap blocks.  Would it be
> possible to have `implicit' patches in the cons cells themselves?

I guess at least the pointers in the heap blocks could be detected
implicitly (easily).  However, as one introduces more kinds of
implicit patches, the complexity of both the compilation process and
the loading process increases, and more dependencies on the evaluator
are introduced.  The load process which I described could be very
efficient since it's not much more than a linear sweep through the
memory.
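
Continuing the dump sketch from above (same toy cell type and
IMMEDIATE_P assumption), the whole load step is one pass that turns
indices back into pointers:

/* Counterpart of dump: after reading the image into memory, one
   linear pass relocates indices into real pointers. */
static void
relocate (cell *image, long n_cells)
{
  long i;
  for (i = 0; i < n_cells; i++)
    {
      if (!IMMEDIATE_P (image[i].car))
        image[i].car = (long) &image[image[i].car];
      if (!IMMEDIATE_P (image[i].cdr))
        image[i].cdr = (long) &image[image[i].cdr];
    }
}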

> Ok, to summarize, my gut feeling is that a binary format for
> unmemoized source code might not give us a significant speedup, but
> I'm not sure.

I'd say it looks like an interesting hack which could be tested.

But, probably, the first thing to do is to apply Jost's patches.

Maybe we could just apply them now and let people send in patches when
bugs are discovered.  It surely seems like a better alternative than
waiting yet another few months.

Then, if an abstraction for the memoization were introduced, maybe the
freezer solution would be an attractive option until we get a real
compiler?
