Kawa: Compiling Scheme to Java

Objects and Values

Java [JavaSpec] has primitive types (such as 32-bit int) as well as reference types. If a variable has a reference type, it can contain references (essentially pointers) to objects of the named class, or to objects of classes that “extend” (inherit from) the named class. The inheritance graph is “rooted” (like Smalltalk and unlike C++); this means that all classes inherit from a distinguished class java.lang.Object (or just Object for short).

Standard Scheme [R5RS] has a fixed set of types, with no way of creating new types. It has run-time typing, which means that types are not declared, and a variable can contain values of different types at different times. The most natural type of a Java variable that can contain any Scheme value is therefore Object, and all Scheme values must be implemented using some class that inherits from Object.
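As a small illustration of this point (a sketch, not Kawa code), a single Java variable of type Object can hold values of unrelated classes at different times, and the class of the current value can be recovered at run time. The names DynamicValueDemo and runtimeTypeName are made up for this example.

```java
// Illustration only; DynamicValueDemo and runtimeTypeName are hypothetical names.
final class DynamicValueDemo {
    // Report the run-time class of whatever value the variable currently holds.
    static String runtimeTypeName(Object value) {
        return value.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        Object v = Boolean.TRUE;                  // the variable holds a Boolean
        System.out.println(runtimeTypeName(v));   // prints "Boolean"
        v = "abc";                                // the same variable now holds a String
        System.out.println(runtimeTypeName(v));   // prints "String"
    }
}
```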

The task then is to map each Scheme type into a Java class. Whether to use a standard Java class or to write our own is a tradeoff. Using standard Java classes simplifies the passing of values between Scheme functions and existing Java methods. On the other hand, even when Java has suitable built-in classes, they usually lack functionality needed for Scheme, or are not organized in any kind of class hierarchy as in Smalltalk or Dylan. Since Java lacks standard classes corresponding to pairs, symbols, or procedures, we have to write some new classes, so we might as well write new classes whenever the existing classes lack functionality.

The Scheme boolean type is one where we use a standard Java type, in this case Boolean (strictly speaking java.lang.Boolean). The Scheme constants #f and #t are mapped into static fields (i.e. constants) Boolean.FALSE and Boolean.TRUE.
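The boolean mapping can be sketched as follows. This is not Kawa source: the class SchemeBooleans and the method isTrue are hypothetical, and the identity test assumes that #f is always the shared Boolean.FALSE object rather than a separately allocated Boolean.

```java
// Hypothetical helper, not part of Kawa: illustrates the Boolean mapping.
final class SchemeBooleans {
    // Scheme #f and #t as the shared java.lang.Boolean constants.
    static final Object FALSE = Boolean.FALSE;
    static final Object TRUE = Boolean.TRUE;

    // In Scheme every value except #f counts as true in a conditional.
    // Assumption: #f is always the canonical Boolean.FALSE object,
    // so an identity comparison is enough.
    static boolean isTrue(Object value) {
        return value != Boolean.FALSE;
    }

    public static void main(String[] args) {
        System.out.println(isTrue(TRUE));      // prints "true"
        System.out.println(isTrue(FALSE));     // prints "false"
        System.out.println(isTrue("hello"));   // prints "true": any non-#f value is true
    }
}
```

Because both constants are unique objects, a test for #f can compile to a single pointer comparison.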

On the other hand, numbers and collections are naturally organized into class hierarchies, something the standard Java classes do not do well. So Kawa has its own classes for those. The next sections will give skeletal definitions of the classes used to represent Scheme values.

Collections

Kawa has a rudimentary hierarchy of collection classes.

abstract class Sequence
{ ...;
  abstract public int length();
  abstract public Object elementAt(int i);
}

A Sequence is the abstract class that includes lists, vectors, and strings.

class FString extends Sequence
{ ...;
  char[] value;
}

Used to implement fixed-length mutable strings (arrays of Unicode characters). This is used to represent Scheme strings.

class FVector extends Sequence
{ ...;
  Object[] value;
}

Used to implement fixed-length mutable general one-dimensional arrays of Object. This is used to represent Scheme vectors.
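To make the skeletons above concrete, here is one way the elided parts of Sequence, FString, and FVector might be filled in. The constructors and method bodies are assumptions made for this sketch, as is the choice to box characters as java.lang.Character; the actual Kawa classes may differ.

```java
// Sketch only: the constructors and method bodies are assumptions, not Kawa source.
abstract class Sequence {
    abstract public int length();
    abstract public Object elementAt(int i);
}

class FString extends Sequence {
    char[] value;

    // Hypothetical constructor: copy the characters so the FString stays mutable.
    FString(String s) { value = s.toCharArray(); }

    public int length() { return value.length; }

    // Assumption: characters are boxed as java.lang.Character.
    public Object elementAt(int i) { return Character.valueOf(value[i]); }
}

class FVector extends Sequence {
    Object[] value;

    // Hypothetical constructor: wrap an existing array without copying.
    FVector(Object[] elements) { value = elements; }

    public int length() { return value.length; }

    public Object elementAt(int i) { return value[i]; }
}
```

Code written against Sequence can then treat strings and vectors uniformly: for example, a loop from 0 to length() calling elementAt(i) works on either.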

public class List extends Sequence
{ ...;
  protected List () { }
  static public List Empty = new List ();
}

Used to represent Scheme (linked) lists. The empty list '() is the special static value List.Empty. Non-empty lists are implemented using Pair objects.

public class Pair extends List
{ ...;
  public Object car;
  public Object cdr;
}

Used for Scheme pairs, i.e. all non-empty lists.
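The relationship between List.Empty and Pair can be sketched as below. The classes are nested in a hypothetical holder class ListSketch only so that the name List cannot clash with java.util.List; the Pair constructor, the method bodies, and the helper list are assumptions for this example, not Kawa source.

```java
// Sketch only: ListSketch, the Pair constructor, the method bodies, and list()
// are assumptions for illustration.
final class ListSketch {
    abstract static class Sequence {
        abstract public int length();
        abstract public Object elementAt(int i);
    }

    static class List extends Sequence {
        protected List() { }
        static public List Empty = new List();   // the unique empty list '()

        public int length() { return 0; }
        public Object elementAt(int i) { throw new IndexOutOfBoundsException(); }
    }

    static class Pair extends List {
        public Object car;
        public Object cdr;

        Pair(Object car, Object cdr) { this.car = car; this.cdr = cdr; }

        // Count pairs along the cdr chain; stops at '() or at an improper tail.
        public int length() {
            int n = 0;
            for (Object rest = this; rest instanceof Pair; rest = ((Pair) rest).cdr) n++;
            return n;
        }

        public Object elementAt(int i) {
            for (Object rest = this; rest instanceof Pair; rest = ((Pair) rest).cdr) {
                if (i == 0) return ((Pair) rest).car;
                i--;
            }
            throw new IndexOutOfBoundsException();
        }
    }

    // Hypothetical helper: build a proper list, like Scheme's (list a b c).
    static List list(Object... items) {
        List result = List.Empty;
        for (int i = items.length - 1; i >= 0; i--) result = new Pair(items[i], result);
        return result;
    }

    public static void main(String[] args) {
        List abc = list("a", "b", "c");
        System.out.println(abc.length());        // prints 3
        System.out.println(abc.elementAt(1));    // prints b
        System.out.println(list() == List.Empty); // prints true
    }
}
```

Since the empty list is a single shared object, a null? test reduces to an identity comparison against List.Empty, and a pair? test to an instanceof check.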

public class PairWithPosition extends Pair
{ ...;
}

Like Pair, but includes the file name and line number in the file from which the pair was read.

Future plans include more interesting collection classes, such as sequences implemented as a seekable disk file; lazily evaluated sequences; hash tables; APL-style multi-dimensional arrays; and stretchy buffers. (Many of these ideas were implemented in my earlier experimental language Q -- see [Bothner88] and ftp://ftp.cygnus.com/pub/bothner/Q/.) I will also integrate the Kawa collections into the new JDK 1.2 collections framework.

Top-level environments

class Environment
{ ...;
}

An Environment is a mapping from symbols to bindings. It is used for the bindings of the user top-level. There can be multiple top-level Environments, and an Environment can be defined as an extension of an existing Environment. The latter feature is used to implement the various standard environment arguments that can be passed to eval, as adopted for the latest Scheme standard revision, R5RS. Nested environments were also implemented to support threads, and fluid bindings (even in the presence of threads).
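The idea of an Environment that extends an existing one can be sketched with a parent link that is consulted when a symbol is not found locally. Everything here is an assumption made for illustration: the constructor, the methods define and lookup, and the simplification of mapping a symbol directly to a value rather than to a binding object. The real Kawa class may be organized quite differently.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: the constructor, define, and lookup are hypothetical, and symbols
// map straight to values here instead of to separate binding objects.
class Environment {
    private final Environment parent;   // the environment being extended, or null
    private final Map<String, Object> bindings = new HashMap<>();

    Environment(Environment parent) { this.parent = parent; }

    // Bind a symbol in this environment only; the parent is never modified.
    void define(String symbol, Object value) { bindings.put(symbol, value); }

    // Search this environment first, then each environment it extends.
    Object lookup(String symbol) {
        for (Environment e = this; e != null; e = e.parent) {
            if (e.bindings.containsKey(symbol)) return e.bindings.get(symbol);
        }
        throw new IllegalStateException("unbound symbol: " + symbol);
    }

    public static void main(String[] args) {
        Environment top = new Environment(null);
        top.define("x", "outer");
        Environment extended = new Environment(top);
        System.out.println(extended.lookup("x"));   // prints outer (inherited)
        extended.define("x", "inner");
        System.out.println(extended.lookup("x"));   // prints inner (shadows the parent)
        System.out.println(top.lookup("x"));        // prints outer (parent unchanged)
    }
}
```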

Environments will be integrated into a more general name-table interface, which will also include records and ECMAScript objects.

Symbols

Symbols represent identifiers, and do not need much functionality. Scheme needs to be able to convert them to and from Scheme strings, and they need to be “interned” (which means that there is a global table to ensure that there is a unique symbol for a given identifier). Symbols are immutable and have no accessible internal structure.

Scheme symbols are represented using interned Java Strings. Note that the Java String class implements immutable strings, and therefore cannot be used to implement Scheme strings. However, it makes sense to use it to implement symbols, since the way Scheme symbols are used is very similar to how Java Strings are used. The method intern in String provides an interned version of a String, which provides the characters-to-String mapping needed for Scheme symbols.
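The effect of interning can be seen in a few lines of Java. The class SymbolDemo and the method makeSymbol are hypothetical names for this sketch; only String.intern itself is the standard library method referred to above.

```java
// Illustration only; SymbolDemo and makeSymbol are hypothetical names.
final class SymbolDemo {
    // Produce the unique interned String for a sequence of characters,
    // for example the characters of a Scheme string passed to string->symbol.
    static String makeSymbol(char[] chars) {
        return new String(chars).intern();
    }

    public static void main(String[] args) {
        String a = makeSymbol(new char[] { 'f', 'o', 'o' });
        String b = makeSymbol("foo".toCharArray());
        System.out.println(a == b);                 // prints true: same interned object
        String fresh = new String("foo");           // a distinct, non-interned String
        System.out.println(fresh == a);             // prints false
        System.out.println(fresh.intern() == a);    // prints true
    }
}
```

Since equal symbols are the same object, comparing two symbols (as eq? must) needs only a reference comparison.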
