This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: UCS data encoding in localedata
On Fri, Apr 13, 2012 at 06:58:35PM -0400, Mike Frysinger wrote:
> On Friday 06 April 2012 04:18:27 Petr Baudis wrote:
> > Does anyone know the technical reason for using the explicit <U0000>
> > UCS encoding in localedata instead of some sane approach like UTF8
> > encoded data? I can think of only historical reasons due to the lack
> > of support in tools (OS, editors, VCS, ...) in the past, however I
> > believe that by now, using UTF8 should be fairly safe.
>
> http://sourceware.org/ml/libc-alpha/2003-04/msg00043.html
>
> > For me, even with the show-ucs-data tool, dealing with localedata
> > files is quite onerous. Can anyone share any other tricks they use
> > when dealing with localedata? Would there be any resistance to moving
> > to UTF8?
>
> seems like we could treat it the same way we already do with "configure" and
> "configure.in". the .in files could be UTF8, and we'd have simple tools to
> "generate" the non-.in files which we'd also commit to git.
I don't think that thread gives any real explanation or reasons besides
"it has to be that way", and I don't buy it. I think in 2003, the world
was still quite a bit more complicated than it is now. Even back then,
EBCDIC was not that common anymore, so I still do not follow what is wrong
with using a plain representation at least for ASCII characters.
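To make the "simple tools" idea concrete, here is a rough sketch of what
such a converter could look like. It is only an illustration, not an
existing glibc tool: the function names are made up, it assumes the
<UXXXX> form for BMP code points and the eight-digit <UXXXXXXXX> form for
anything above, it leaves ASCII as plain text, and it knows nothing about
localedata syntax (comments, keywords), which a real tool would have to
respect.

```python
import re

# Hypothetical helpers, not part of glibc.
# Assumption: <UXXXX> for BMP code points, <UXXXXXXXX> above the BMP.

def to_ucs_notation(text):
    """Replace every non-ASCII character with its <U....> symbol."""
    def encode(ch):
        cp = ord(ch)
        if cp < 0x80:
            return ch                  # keep plain ASCII as-is
        if cp <= 0xFFFF:
            return "<U%04X>" % cp      # four hex digits for the BMP
        return "<U%08X>" % cp          # eight hex digits otherwise
    return "".join(encode(ch) for ch in text)

# Try the eight-digit form first so it is not cut short at four digits.
_UCS_SYMBOL = re.compile(r"<U([0-9A-Fa-f]{8}|[0-9A-Fa-f]{4})>")

def from_ucs_notation(text):
    """Replace every <U....> symbol with the character it names."""
    return _UCS_SYMBOL.sub(lambda m: chr(int(m.group(1), 16)), text)

if __name__ == "__main__":
    print(to_ucs_notation("K\u00e4se"))
    print(from_ucs_notation("K<U00E4>se"))
```

With something like this, the UTF8 ".in" file would be what people edit,
and the <U....> file would be regenerated from it and committed alongside.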
> btw, what's with cc-ing libc-announce ?
That's my mistake, I meant to Cc libc-locales. In the next try, it was
libc-locale, also wrong... :-)
"Fortunately," no one seems to have looked at the libc-announce moderation
queue for a few years now.
--
Petr "Pasky" Baudis
Smart data structures and dumb code works a lot better
than the other way around. -- Eric S. Raymond