This is the mail archive of the guile@cygnus.com mailing list for the guile project.



Re: mbstrings


NIIBE Yutaka <gniibe@etl.go.jp> writes:
> but please note that there are many cases to unify/canonicalize
> characters especially for Chinese characters.

Yes, comparing characters (and sorting strings) is definitely
context-dependent.  The Unicode 2.0 book discusses a number of the
issues.  Using Unicode does not make the problems go away.

The question is:  What should the Scheme primitives eq? and char=?
do?  And what should char->integer and integer->char do?  I think
the natural expectation is that these should all be trivial functions.
char->integer and integer->char are naturally assumed to convert
between characters and their internal character code.  And (char=? x y)
is assumed to be the same as (= (char->integer x) (char->integer y)).
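
As a rough illustration of that expectation, here is a minimal sketch in
portable Scheme.  The names char-code=?, samples, and agrees-on-all-pairs?
are made up for this example and are not part of Guile or any standard; the
sketch only checks that comparing character codes gives the same answer as
char=? on a handful of ASCII characters, and that integer->char undoes
char->integer.

```scheme
;; Hypothetical helper: compare two characters by their integer codes.
(define (char-code=? x y)
  (= (char->integer x) (char->integer y)))

;; A few sample characters (an arbitrary choice for this sketch).
(define samples (list #\a #\A #\z #\0 #\space))

;; True if char-code=? and the built-in char=? agree on every pair.
(define (agrees-on-all-pairs? chars)
  (let outer ((xs chars))
    (if (null? xs)
        #t
        (let inner ((ys chars))
          (cond ((null? ys) (outer (cdr xs)))
                ((eq? (char=? (car xs) (car ys))
                      (char-code=? (car xs) (car ys)))
                 (inner (cdr ys)))
                (else #f))))))

;; True if converting to an integer and back yields the same character.
(define (round-trips? c)
  (char=? c (integer->char (char->integer c))))
```

Evaluating (agrees-on-all-pairs? samples) and (round-trips? #\a) in an
interpreter is one way to see the "trivial function" view in action.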

Unicode naturally fits into this picture;  Mule does not.

> We should support unification/canonicalization, but the way of
> unification should not be only one.  It's better to support multiple
> ways to unify characters.  It depends on what users want.

This is an application issue, not a Scheme issue per se.
Note that an application string is not necessarily a Scheme
string object, since in an application you may also be concerned
about fonts, color, language, and other attributes.  Comparing
application strings does not necessarily directly map into comparing
Scheme strings.
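
One way an application might layer its own notion of equality on top of
the primitive string comparison is sketched below.  Everything here is
hypothetical: make-string=? is an invented name, and ASCII case folding
via char-downcase merely stands in for whatever unification rule an
application actually wants (a real Han unification table is not shown).

```scheme
;; Hypothetical: build a string comparison from a per-character
;; canonicalization procedure chosen by the application.
(define (make-string=? canonicalize)
  (lambda (s1 s2)
    (string=? (list->string (map canonicalize (string->list s1)))
              (list->string (map canonicalize (string->list s2))))))

;; No unification at all: behaves like plain string=?.
(define exact-string=? (make-string=? (lambda (c) c)))

;; Case folding as a stand-in for one possible unification rule.
(define folded-string=? (make-string=? char-downcase))
```

With these definitions, (exact-string=? "Guile" "guile") is false while
(folded-string=? "Guile" "guile") is true, and the Scheme primitives
themselves stay trivial.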

	--Per Bothner
Cygnus Solutions     bothner@cygnus.com     http://www.cygnus.com/~bothner