This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: Problem with join and accentuated characters
- From: Eric Blake <ebb9 at byu dot net>
- To: Boris New <boris dot new at gmail dot com>
- Cc: cygwin at cygwin dot com
- Date: Thu, 31 Mar 2005 06:55:17 -0700
- Subject: Re: Problem with join and accentuated characters
- References: <cf47199205033104307d2a4a84@mail.gmail.com>
According to Boris New on 3/31/2005 5:30 AM:
> Hi,
>
> Join in coreutils 5.3.03 gives incomplete results when the two files
> include French accented characters (for instance
> é|è|â|ï|ü|ê|ç|î|ô|û|ü|ë|à|ù).
> Results are okay when only one text file has accented characters.
I'll need more details on what you think is broken (hint - two actual
short files that you tried to join, and the results you got vs what you
expected). Also, coreutils-5.3.0-3 join is unmodified from upstream
sources, so you may want to ask this question on the upstream list
(bug-coreutils@gnu.org). But it may have something to do with file
encodings; if your two inputs have different encodings, accented
characters don't necessarily have the same underlying bytes, and that
might mess up join. Also, join requires both files to be sorted on the
join fields, and if they are not, there is no telling what results to expect.
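
To illustrate both points, here is a minimal Python sketch. It is not the coreutils implementation: merge_join is a hypothetical helper that only mimics the general idea of walking two sorted inputs in step, and the sample keys and values are made up. It shows that the same accented character encodes to different bytes in Latin-1 and UTF-8, so a byte-wise key comparison finds no match, and that an unsorted input can silently drop rows.

```python
# Hypothetical illustration only; this is not how coreutils join is written.

def merge_join(left, right):
    """Join two lists of (key, value) pairs by walking them in step.

    Both lists are assumed to be sorted by key, with unique keys.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk == rk:
            out.append((lk, left[i][1], right[j][1]))
            i += 1
            j += 1
        elif lk < rk:
            i += 1   # left key is smaller: it can no longer match, skip it
        else:
            j += 1   # right key is smaller: skip it

    return out


key = "\u00e9"                        # the character "é"
latin1_key = key.encode("latin-1")    # one byte:  b'\xe9'
utf8_key = key.encode("utf-8")        # two bytes: b'\xc3\xa9'

# Same encoding on both sides: the keys compare equal and the rows pair up.
same_encoding = merge_join([(utf8_key, b"a")], [(utf8_key, b"b")])

# Different encodings: the bytes differ, so no row is produced.
mixed_encoding = merge_join([(latin1_key, b"a")], [(utf8_key, b"b")])

# Unsorted left input: the "a" row is passed over and never matched.
unsorted_left = [(b"b", b"1"), (b"a", b"2")]
sorted_right = [(b"a", b"x"), (b"b", b"y")]
unsorted_result = merge_join(unsorted_left, sorted_right)

print(same_encoding, mixed_encoding, unsorted_result)
```

In this sketch the mixed-encoding join comes back empty and the unsorted join returns only the "b" row, which is the kind of incomplete output that mismatched encodings or unsorted input could produce.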
- --
Life is short - so eat dessert first!
Eric Blake ebb9@byu.net