This is the mail archive of the libc-alpha@sourceware.cygnus.com mailing list for the glibc project.


problem with ISO-2022-KR encoder

To: libc-alpha at sourceware dot cygnus dot com
Subject: problem with ISO-2022-KR encoder
From: Bruno Haible <haible at ilog dot fr>
Date: Mon, 20 Dec 1999 16:54:56 +0100 (MET)

Hello,

The glibc-2.1.1 iconv ISO-2022-KR encoder emits the "Esc $ ) C" sequence
only once, at the beginning of its output, rather than in every line
that contains SO characters.

Ken Lunde's CJK.INF says the SO designator needs to appear only once, at the
beginning of a text (rationale: ISO-2022-KR uses only one two-byte
character set), but RFC 1557 says it must appear once in every line
containing SO characters (rationale: if some lines of the text get lost,
the remaining lines are still recognizable as Korean).
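
To make the per-line reading concrete, here is a minimal sketch of a
post-processing step that puts the designator at the start of every line
containing SO. It is not glibc code. The function name
designate_every_line is hypothetical, and the sketch assumes each line
has already shifted back to ASCII (SI) before its newline.

```python
DESIGNATOR = b"\x1b$)C"  # Esc $ ) C
SO = b"\x0e"             # shift out: switch to the two-byte set


def designate_every_line(data: bytes) -> bytes:
    """Hypothetical helper: emit Esc $ ) C once per line that uses SO.

    Assumes every line ends in ASCII state (SI before the newline).
    """
    out = []
    for line in data.split(b"\n"):
        # Drop any designator already present so it appears at most once.
        body = line.replace(DESIGNATOR, b"")
        if SO in body:
            body = DESIGNATOR + body
        out.append(body)
    return b"\n".join(out)
```

Lines without SO characters are left untouched, so pure ASCII text passes
through unchanged.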

glibc-2.1.1 iconv doesn't implement this RFC 1557 requirement:

$ iconv -f UTF-8 -t ISO-2022-KR < KSC5601-snippet.utf-8 > x

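Here is a hexdump of the output. The carets mark the newline (0A) and the
following SO (0E) on the second line; there is no "Esc $ ) C" between them: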
000000  1B 24 29 43 4B 6F 72 65 61 6E 20 28 0E 47 51 31  .$)CKorean (.GQ1
000010  5B 0F 29 09 09 09 0E 3E 48 33 67 47 4F 3C 3C 3F  [.)....>H3gGO<<?
000020  64 0F 2C 20 0E 3E 48 33 67 47 4F 3D 4A 34 4F 31  d., .>H3gGO=J4O1
000030  6E 0F 0A 09 4B 53 43 20 20 2D 2D 20 0E 6A 2A 51  n...KSC  -- .j*Q
              ^^                            ^^
000040  28 0F 20 20 0E 4B 52 5B 21 0F 0A                 (.  .KR[!..
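
The same check can be done mechanically. The following is a rough sketch,
not part of glibc or iconv: the helper name lines_missing_designator is
made up for illustration, and the sample bytes are copied from the hexdump
above. It reports every line that contains SO without an earlier
"Esc $ ) C" on that line.

```python
DESIGNATOR = b"\x1b$)C"  # Esc $ ) C
SO = b"\x0e"             # shift out: switch to the two-byte set


def lines_missing_designator(data: bytes) -> list:
    """Return 1-based numbers of lines that use SO with no designator before it."""
    missing = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        so_at = line.find(SO)
        if so_at == -1:
            continue  # line has no shifted characters
        # Look for the designator only in the part of the line before SO.
        if line.find(DESIGNATOR, 0, so_at) == -1:
            missing.append(number)
    return missing


# Bytes from the hexdump in this message.
sample = bytes.fromhex(
    "1B 24 29 43 4B 6F 72 65 61 6E 20 28 0E 47 51 31 "
    "5B 0F 29 09 09 09 0E 3E 48 33 67 47 4F 3C 3C 3F "
    "64 0F 2C 20 0E 3E 48 33 67 47 4F 3D 4A 34 4F 31 "
    "6E 0F 0A 09 4B 53 43 20 20 2D 2D 20 0E 6A 2A 51 "
    "28 0F 20 20 0E 4B 52 5B 21 0F 0A"
)

print(lines_missing_designator(sample))  # prints [2]
```

Only the second line is flagged, which matches the position marked by the
carets.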


          Bruno

Follow-Ups:
- Re: problem with ISO-2022-KR encoder
  - From: Ulrich Drepper
