This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.

Re: binary-io, opposable-thumb, pack/unpack (was Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module))

To: Gary Houston <ghouston at arglist dot com>
Subject: Re: binary-io, opposable-thumb, pack/unpack (was Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module))
From: Per Bothner <per at bothner dot com>
Date: 16 Feb 2000 14:04:54 -0800
Cc: guile at sourceware dot cygnus dot com
References: <20000212132515.712.qmail@d231-122.dial.mistral.co.uk> <20000212171420.A1179@206.31.63.15> <20000214215419.833.qmail@d231-209.dial.mistral.co.uk> <20000215101916M.1000@eccosys.com> <m2900m2t33.fsf@kelso.bothner.com> <20000216212746.679.qmail@d231-87.dial.mistral.co.uk>

Gary Houston <ghouston@arglist.com> writes:

> I'm not sure I understand this proposal completely, since I don't see
> what you gain by using two ports.

No, two (or rather four) port *types*.

> Wouldn't it be confusing to work
> with, e.g., if you were reading a stream of arbitrary data, would you
> read from one port some of the time to unpack bytes into Scheme and
> then from the other whenever you expected a character?

I don't think you can meaningfully or reliably do that.  You either
process a sequence of bytes or you process a sequence of characters.

> It seems to me easier to consider an input port to be a source of
> bytes, with read-char a procedure for unpacking bytes into characters.

How about peek-char, read, read-line, etc.?  What about display, write,
format?

Basically, all standard Scheme procedures work with characters,
not bytes, so an input port *is* a sequence of characters.
You can add extra procedures that read the underlying bytes
but you will find that buffering and character conversion
make that problematical.
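
To illustrate the buffering point, here is a minimal sketch in Python
rather than Scheme.  It assumes Python's `io.TextIOWrapper` as a stand-in
for a character port layered over a byte stream; it is not Guile's port
implementation.  Reading a single character makes the character layer
pull a whole chunk of bytes, so the underlying byte position no longer
matches what the program has actually consumed.

```python
import io

data = "héllo wörld".encode("utf-8")             # 13 bytes, 11 characters
raw = io.BytesIO(data)                           # the underlying byte stream
chars = io.TextIOWrapper(raw, encoding="utf-8")  # character view on top of it

first = chars.read(1)     # consume exactly one character
byte_pos = raw.tell()     # position of the byte stream after that read
leftover = raw.read()     # bytes still available from the byte layer

print(repr(first), byte_pos, leftover)
```

In this sketch the byte layer has already been drained by the character
layer's read-ahead, so a byte-level read issued after `chars.read(1)`
finds nothing left, even though only one character was consumed.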

> To support multiple encodings, the port could have a "current
> encoding" which could be changed at will (actually this is just to
> avoid adding an extra incompatible argument to read-char.

You can't do that in general.  Some encodings are "stateful".  I guess
you can reset the decoding state when you switch encodings on input.
If you switch encodings in the middle of output, you'll produce a
meaningless document.
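
As a concrete case of a stateful encoding, the following Python sketch
uses ISO-2022-JP, chosen here only as a familiar example; the message
does not name a particular encoding.  The same two bytes decode to
different characters depending on which escape sequence came before
them, which is why the decoder state cannot simply be ignored when
changing encodings mid-stream.

```python
# ISO-2022-JP switches character sets with escape sequences.
TO_JIS = b"\x1b$B"     # escape: following byte pairs are JIS X 0208
TO_ASCII = b"\x1b(B"   # escape: back to ASCII

payload = b'$"'        # two bytes whose meaning depends on the shift state

# In the initial (ASCII) state the bytes are just '$' and '"'.
as_ascii = payload.decode("iso2022_jp")

# After the shift to JIS X 0208 the same bytes are one hiragana character.
as_jis = (TO_JIS + payload + TO_ASCII).decode("iso2022_jp")

# The encoder emits both escapes itself, including the closing one.
encoded = "\u3042".encode("iso2022_jp")   # U+3042 HIRAGANA LETTER A

print(repr(as_ascii), repr(as_jis), encoded)
```

If output switched to another encoding after the opening escape but
before the closing one, a reader would still be in the shifted state
and would misinterpret whatever bytes followed.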

> An alternative would be to let read-char default to a global locale
> setting and add read-char/charset or something to specify variations.)

Yes, that is the C approach.  It is of course the wrong way to do it:
a global setting works with neither threads nor clean programming
practices.

> Individual characters are only part of the problem anyway: there's
> also the custom of treating strings as byte arrays that would break.

Assuming the size of a character remains at least 8 bits (i.e.
integer->char and char->integer are well defined for at least
the range 0 .. 255), I don't see where the breakage would come in.
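
The assumption can be sketched in Python, using the Latin-1 codec as an
analogue of integer->char and char->integer restricted to 0 .. 255.
That choice of codec is an assumption made for the illustration, not
something Guile specifies.  Every byte value maps to the character with
the same code, and the mapping round-trips, so a string can carry
arbitrary byte data without loss.

```python
all_bytes = bytes(range(256))                 # every possible byte value

# Byte -> character: each byte becomes the character with the same code.
as_string = all_bytes.decode("latin-1")
code_points = [ord(ch) for ch in as_string]

# Character -> byte: the reverse mapping restores the original bytes.
round_trip = as_string.encode("latin-1")

print(len(as_string), code_points[:4], round_trip == all_bytes)
```
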
-- 
	--Per Bothner
per@bothner.com   http://www.bothner.com/~per/

Follow-Ups:
- Re: binary-io, opposable-thumb, pack/unpack (was Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module))
  - From: Gary Houston

References:
- Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module)
  - From: Gary Houston
- Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module)
  - From: C. Ray C.
- Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module)
  - From: Gary Houston
- binary-io, opposable-thumb, pack/unpack (was Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module))
  - From: sen_ml
- Re: binary-io, opposable-thumb, pack/unpack (was Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module))
  - From: Per Bothner
- Re: binary-io, opposable-thumb, pack/unpack (was Re: binary-io (was Re: rfc 2045 base64 encoding/decoding module))
  - From: Gary Houston
