This is the mail archive of the
guile@sourceware.cygnus.com
mailing list for the Guile project.
Re: binary-io, opposable-thumb, pack/unpack (was Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module))
Gary Houston <ghouston@arglist.com> writes:
> I'm not sure I understand this proposal completely, since I don't see
> what you gain by using two ports.
No, two (rather four) port *types*.
> Wouldn't it be confusing to work
> with, e.g., if you were reading a stream of arbitrary data, would you
> read from one port some of the time to unpack bytes into Scheme and
> then from the other whenever you expected a character?
I don't think you can meaningfully or reliably do that. You either
process a sequence of bytes or you process a sequence of characters.
> It seems to me easier to consider an input port to be a source of
> bytes, with read-char a procedure for unpacking bytes into characters.
How about peek-char, read, read-line, etc.? What about display, write,
format?
Basically, all standard Scheme procedures work with characters,
not bytes, so an input port *is* a sequence of characters.
You can add extra procedures that read the underlying bytes
but you will find that buffering and character conversion
make that problematical.
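
To illustrate the buffering problem, here is a minimal sketch in Python
rather than Scheme (an assumption made only for the sake of a runnable
example; the names raw, chars, first and leftover are hypothetical).
A character-level reader is layered over a byte source, one character
is read, and then the code tries to continue at the byte level:

```python
import io

# "é12" is 3 characters but 4 bytes in UTF-8 (é encodes as 2 bytes).
raw = io.BytesIO("é12".encode("utf-8"))

# Character view layered on top of the byte source.
chars = io.TextIOWrapper(raw, encoding="utf-8")

# Read a single character through the character view.
first = chars.read(1)

# Now try to pick up "the next bytes" directly from the byte source.
leftover = raw.read()

print(repr(first))
print(repr(leftover))
```

On CPython the character layer reads ahead in chunks, so the byte-level
read finds the source already drained even though only one character
was consumed. Where the byte position ends up depends on the buffer
size, not on how many characters the program has seen.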
> To support multiple encodings, the port could have a "current
> encoding" which could be changed at will (actually this is just to
> avoid adding an extra incompatible argument to read-char.
Can't do that in general. Some encodings are "stateful". I guess
you can reset the decoding state when you switch encodings. If you
do that for output, you'll produce a meaningless document.
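
A rough sketch of what "stateful" means for output, again in Python as
a stand-in for Scheme (the variable names are hypothetical, and
ISO-2022-JP is just one example of a stateful encoding). The encoder
emits an escape sequence to shift into a two-byte character set; if the
writer then switches to another encoding without closing that state,
a reader decoding the stream sees something other than what was
written:

```python
import codecs

# An incremental encoder remembers its shift state between calls.
enc = codecs.getincrementalencoder("iso2022_jp")()

# Writing a kana emits an escape sequence and leaves the stream
# shifted into the two-byte JIS X 0208 character set.
part1 = enc.encode("あ")

# "Switch encodings" mid-stream: write plain ASCII bytes without
# telling the stateful encoder to shift back first.
part2 = "abcd".encode("ascii")

stream = part1 + part2

# A reader that decodes the whole stream as ISO-2022-JP is still in
# two-byte mode when it reaches the ASCII bytes.
decoded = stream.decode("iso2022_jp", errors="replace")

print(part1)
print(repr(decoded))
```

The reader never sees "abcd" as four ASCII letters, because the bytes
are consumed in pairs under the shift state that was left open.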
> An alternative would be to let read-char default to a global locale
> setting and add read-char/charset or something to specify variations.)
Yes, that is the C approach. It is of course the wrong way to do it.
(It doesn't work with threads - or clean programming practices.)
> Individual characters are only part of the problem anyway: there's
> also the custom of treating strings as byte arrays that would break.
Assuming the size of a character remains at least 8 bits (i.e.
integer->char and char->integer are well defined for at least
the range 0 .. 255), I don't see where the breakage would come in.
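
As a small sketch of that assumption, written in Python with chr and
ord standing in for integer->char and char->integer (the variable names
are hypothetical): every byte value maps to a character and back
without loss, so a string can still carry arbitrary byte data.

```python
# All 256 possible byte values.
data = bytes(range(256))

# Treat each byte as a character code (integer->char in Scheme terms).
as_string = "".join(chr(b) for b in data)

# Map each character back to its code (char->integer in Scheme terms).
round_trip = bytes(ord(c) for c in as_string)

print(len(as_string))
print(round_trip == data)
```

This only shows that the mapping is lossless for the range 0 .. 255; it
says nothing about how such a string behaves once it is written through
a port that applies a character encoding.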
--
--Per Bothner
per@bothner.com http://www.bothner.com/~per/