This is the mail archive of the cygwin-developers mailing list for the Cygwin project.


[RFC] win-iconv


Fedora's MinGW toolchain has used win-iconv[1] instead of GNU libiconv
since Fedora 16.  win-iconv uses the Win32 APIs to do the heavy lifting,
making it almost 50 times smaller than GNU libiconv.  It handles most of
the same codesets as libiconv, and even some that the latter does not
(including the glibc aliases UTF8/UTF16 for UTF-8/UTF-16 etc.).  I have
attached a normalized diff between the `iconv -l` output of libiconv and
that of current win-iconv.  OTOH, it doesn't provide all of the API
extensions of GNU libiconv, but neither does glibc.
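
For anyone who wants to check particular codeset names (such as the
UTF8/UTF16 aliases) against whichever iconv they link, here is a minimal
sketch using only iconv_open(3) and iconv_close(3).  The helper names
codeset_supported() and count_supported() are made up for illustration,
and probing against UTF-8 as the target is my assumption; it is not how
the attached diff was produced.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the linked iconv implementation
   accepts `name` as a source codeset (converting to UTF-8), else 0. */
int codeset_supported(const char *name)
{
    iconv_t cd = iconv_open("UTF-8", name);
    if (cd == (iconv_t)-1)
        return 0;               /* unknown name or unsupported pair */
    iconv_close(cd);
    return 1;
}

/* Hypothetical helper: counts how many of the given names are accepted. */
size_t count_supported(const char *const *names, size_t count)
{
    size_t ok = 0;
    for (size_t i = 0; i < count; i++)
        ok += (size_t)codeset_supported(names[i]);
    return ok;
}
```

Calling codeset_supported("UTF8") or codeset_supported("utf16le") under
each implementation is a quick way to spot-check individual entries of
the diff.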

We could even go one step further and incorporate win-iconv (which is
Public Domain) directly into Cygwin itself.  That would provide greater
compatibility with glibc and spare us patches to several packages which
don't search for iconv(3) in a portable way (e.g. AM_ICONV), or which
use $(LIBICONV) when they should use $(LTLIBICONV), etc.  This would add
about 18KB to cygwin1.dll but save almost 1MB of memory and ImageBase
space for hundreds of packages.
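
For reference, the iconv(3) calls those packages make are the same no
matter which library provides them; only the link step differs.  A
minimal sketch of such a caller, assuming the POSIX prototype with a
`char **` input buffer (utf8_to_utf16le() is a made-up name for
illustration):

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: converts the NUL-terminated UTF-8 string `in`
   to UTF-16LE in `out`.  Returns the number of bytes written, or
   (size_t)-1 on any failure (including an output buffer too small). */
size_t utf8_to_utf16le(const char *in, char *out, size_t out_size)
{
    iconv_t cd = iconv_open("UTF-16LE", "UTF-8");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv() advances the pointers and decrements the byte counts. */
    char *src = (char *)in;     /* cast: POSIX declares inbuf as char ** */
    size_t src_left = strlen(in);
    char *dst = out;
    size_t dst_left = out_size;

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;
    return out_size - dst_left;
}
```

With iconv in the C library, as with glibc, code like this links
without any extra library flag.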

Perhaps the tradeoffs aren't worth it in the end, but I thought it at
least warranted a discussion.  (And if they aren't, at least I can stop
pondering this.)


Yaakov

[1] http://code.google.com/p/win-iconv/

--- LIBICONV_L	2012-11-21 19:44:10.522991600 -0600
+++ WIN_ICONV_L	2012-11-21 19:44:35.654429100 -0600
@@ -15,2 +14,0 @@
-arabic
-armscii-8
@@ -19,4 +16,0 @@
-atari
-atarist
-big-5
-big-five
@@ -24 +17,0 @@
-big5-2003
@@ -26,4 +18,0 @@
-big5-hkscs:1999
-big5-hkscs:2001
-big5-hkscs:2004
-big5-hkscs:2008
@@ -31,7 +19,0 @@
-bigfive
-c99
-chinese
-cn
-cn-big5
-cn-gb
-cn-gb-isoir165
@@ -40,2 +22 @@
-cp1046
-cp1124
+cp1025
@@ -43,2 +23,0 @@
-cp1129
-cp1131
@@ -46,3 +25,4 @@
-cp1161
-cp1162
-cp1163
+cp1200
+cp12000
+cp12001
+cp1201
@@ -61,0 +42,3 @@
+cp50221
+cp51932
+cp65001
@@ -69 +51,0 @@
-cp856
@@ -81 +63 @@
-cp922
+cp875
@@ -84 +65,0 @@
-cp943
@@ -88,10 +68,0 @@
-csbig5
-cseuckr
-cseucpkdfmtjapanese
-cseuctw
-csgb2312
-cshalfwidthkatakana
-cshproman8
-csibm1161
-csibm1162
-csibm1163
@@ -107,3 +77,0 @@
-csiso14jisc6220ro
-csiso159jisx02121990
-csiso2022cn
@@ -111,5 +78,0 @@
-csiso2022jp2
-csiso2022kr
-csiso57gb1988
-csiso58gb231280
-csiso87jisx0208
@@ -117,13 +79,0 @@
-csisolatin2
-csisolatin3
-csisolatin4
-csisolatin5
-csisolatin6
-csisolatinarabic
-csisolatincyrillic
-csisolatingreek
-csisolatinhebrew
-cskoi8r
-csksc56011987
-cskz1048
-csmacintosh
@@ -136,7 +86 @@
-csshiftjis
-csucs4
-csunicode
-csunicode11
-csunicode11utf7
-csviscii
-cyrillic
+cswindows31j
@@ -144,5 +88,2 @@
-dec-hanyu
-dec-kanji
-ecma-114
-ecma-118
-elot_928
+dos-720
+dos-862
@@ -150,2 +90,0 @@
-euc-jis-2004
-euc-jisx0213
@@ -154,6 +92,0 @@
-euc-tw
-euccn
-eucjp
-euckr
-euctw
-extended_unix_code_packed_format_for_japanese
@@ -162,2 +94,0 @@
-gb_1988-80
-gb_2312-80
@@ -165,7 +95,0 @@
-georgian-academy
-georgian-ps
-greek
-greek8
-hebrew
-hp-roman8
-hz
@@ -173,3 +96,0 @@
-ibm-1161
-ibm-1162
-ibm-1163
@@ -177,3 +98,24 @@
-ibm1161
-ibm1162
-ibm1163
+ibm-thai
+ibm00858
+ibm00924
+ibm01047
+ibm01140
+ibm01141
+ibm01142
+ibm01143
+ibm01144
+ibm01145
+ibm01146
+ibm01147
+ibm01148
+ibm01149
+ibm037
+ibm1026
+ibm273
+ibm277
+ibm278
+ibm280
+ibm284
+ibm285
+ibm290
+ibm297
@@ -180,0 +123,3 @@
+ibm420
+ibm423
+ibm424
@@ -181,0 +127,2 @@
+ibm500
+ibm737
@@ -196,4 +143,4 @@
-iso-10646-ucs-2
-iso-10646-ucs-4
-iso-2022-cn
-iso-2022-cn-ext
+ibm870
+ibm871
+ibm880
+ibm905
@@ -201,4 +148 @@
-iso-2022-jp-1
-iso-2022-jp-2
-iso-2022-jp-2004
-iso-2022-jp-3
+iso-2022-jp-ms
@@ -207,2 +150,0 @@
-iso-8859-10
-iso-8859-11
@@ -210 +151,0 @@
-iso-8859-14
@@ -212 +152,0 @@
-iso-8859-16
@@ -219,0 +160 @@
+iso-8859-8-i
@@ -221 +161,0 @@
-iso-celtic
@@ -223,21 +162,0 @@
-iso-ir-101
-iso-ir-109
-iso-ir-110
-iso-ir-126
-iso-ir-127
-iso-ir-138
-iso-ir-14
-iso-ir-144
-iso-ir-148
-iso-ir-149
-iso-ir-157
-iso-ir-159
-iso-ir-165
-iso-ir-166
-iso-ir-179
-iso-ir-199
-iso-ir-203
-iso-ir-226
-iso-ir-230
-iso-ir-57
-iso-ir-58
@@ -245,3 +164,3 @@
-iso-ir-87
-iso646-cn
-iso646-jp
+iso2022-jp
+iso2022-jp-ms
+iso2022-kr
@@ -250,2 +168,0 @@
-iso8859-10
-iso8859-11
@@ -253 +169,0 @@
-iso8859-14
@@ -255 +170,0 @@
-iso8859-16
@@ -262,0 +178 @@
+iso8859-8-i
@@ -266,10 +181,0 @@
-iso_8859-10
-iso_8859-10:1992
-iso_8859-11
-iso_8859-13
-iso_8859-14
-iso_8859-14:1998
-iso_8859-15
-iso_8859-15:1998
-iso_8859-16
-iso_8859-16:2001
@@ -277,29 +182,0 @@
-iso_8859-2
-iso_8859-2:1987
-iso_8859-3
-iso_8859-3:1988
-iso_8859-4
-iso_8859-4:1988
-iso_8859-5
-iso_8859-5:1988
-iso_8859-6
-iso_8859-6:1987
-iso_8859-7
-iso_8859-7:1987
-iso_8859-7:2003
-iso_8859-8
-iso_8859-8:1988
-iso_8859-9
-iso_8859-9:1989
-java
-jis0208
-jis_c6220-1969-ro
-jis_c6226-1983
-jis_x0201
-jis_x0208
-jis_x0208-1983
-jis_x0208-1990
-jis_x0212
-jis_x0212-1990
-jis_x0212.1990-0
-jisx0201-1976
@@ -307 +183,0 @@
-jp
@@ -309,2 +184,0 @@
-koi8-ru
-koi8-t
@@ -312 +185,0 @@
-korean
@@ -314,3 +186,0 @@
-ks_c_5601-1989
-ksc_5601
-kz-1048
@@ -318,9 +187,0 @@
-l10
-l2
-l3
-l4
-l5
-l6
-l7
-l8
-latin-9
@@ -328,16 +188,0 @@
-latin10
-latin2
-latin3
-latin4
-latin5
-latin6
-latin7
-latin8
-mac
-macarabic
-maccentraleurope
-maccroatian
-maccyrillic
-macgreek
-machebrew
-maciceland
@@ -345,5 +189,0 @@
-macroman
-macromania
-macthai
-macturkish
-macukraine
@@ -356,0 +197,3 @@
+ms50221
+ms51932
+ms932
@@ -358,3 +200,0 @@
-ms_kanji
-mulelao-1
-nextstep
@@ -363,4 +203,2 @@
-r8
-riscos-latin1
-rk1048
-roman8
+shifft_jis
+shifft_jis-ms
@@ -369,2 +206,0 @@
-shift_jis-2004
-shift_jisx0213
@@ -372,12 +208,3 @@
-strk1048-2002
-tcvn
-tcvn-5712
-tcvn5712-1
-tcvn5712-1:1993
-tds565
-tis-620
-tis620
-tis620-0
-tis620.2529-1
-tis620.2533-0
-tis620.2533-1
+sjis-ms
+sjis-open
+sjis-win
@@ -385,2 +211,0 @@
-ucs-2-internal
-ucs-2-swapped
@@ -390,2 +214,0 @@
-ucs-4-internal
-ucs-4-swapped
@@ -393,0 +217,6 @@
+ucs2
+ucs2be
+ucs2le
+ucs4
+ucs4be
+ucs4le
@@ -395,4 +224 @@
-unicode-1-1
-unicode-1-1-utf-7
-unicodebig
-unicodelittle
+unicodefffe
@@ -407 +232,0 @@
-utf-7
@@ -409,2 +234,7 @@
-viscii
-viscii1.1-1
+utf16
+utf16be
+utf16le
+utf32
+utf32be
+utf32le
+utf8
@@ -420,0 +251,3 @@
+windows-31j
+windows-50221
+windows-51932
@@ -421,0 +255 @@
+windows-932
@@ -423,3 +257,42 @@
-x0201
-x0208
-x0212
+x-chinese_cns
+x-cp20001
+x-cp20003
+x-cp20004
+x-cp20005
+x-cp20261
+x-cp20269
+x-cp20936
+x-cp20949
+x-cp50227
+x-ebcdic-koreanextended
+x-europa
+x-ia5
+x-ia5-german
+x-ia5-norwegian
+x-ia5-swedish
+x-iscii-as
+x-iscii-be
+x-iscii-de
+x-iscii-gu
+x-iscii-ka
+x-iscii-ma
+x-iscii-or
+x-iscii-pa
+x-iscii-ta
+x-iscii-te
+x-mac-arabic
+x-mac-ce
+x-mac-chinesesimp
+x-mac-chinesetrad
+x-mac-croatian
+x-mac-cyrillic
+x-mac-greek
+x-mac-hebrew
+x-mac-icelandic
+x-mac-japanese
+x-mac-korean
+x-mac-romanian
+x-mac-thai
+x-mac-turkish
+x-mac-ukrainian
+x_chinese-eten
