This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [PATCH/WIP] C/C++ wchar_t/Unicode printing support
- From: Tom Tromey <tromey at redhat dot com>
- To: Joel Brobecker <brobecker at adacore dot com>
- Cc: Julian Brown <julian at codesourcery dot com>, gdb-patches at sourceware dot org
- Date: Sun, 01 Feb 2009 15:40:29 -0700
- Subject: Re: [PATCH/WIP] C/C++ wchar_t/Unicode printing support
- References: <20090115202411.5f154657@rex.config> <m37i4dx64p.fsf@fleche.redhat.com> <20090130194343.GA3964@adacore.com> <m3ocxos6og.fsf@fleche.redhat.com> <20090201182344.GD4597@caradoc.them.org>
- Reply-to: Tom Tromey <tromey at redhat dot com>
>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:
>> I suppose one option would be to have a degraded mode where we require
>> that the host charset and the target charset be the same. Then maybe
>> we could make it work by redefining iswprint and wchar_t.
Daniel> I don't see the connection between the iconv dependency and
Daniel> iswprint / wchar_t. Are there portability issues for those
Daniel> too? They don't come from libiconv.
Yeah, there isn't a direct connection.
Basically there are two problems to solve.
One is having a way to convert from a target charset of some kind to a
host charset of some kind.
The other issue is deciding how to print things on the host. We want
to use a host wide character of some sort, so that we can print a
larger subset of characters on a capable terminal. This also lets us
handle "set print repeats", on some platforms anyway, without needing
details about a possible host-side variable-length encoding.
I chose to solve the first problem by using iconv for all the
conversions, and the second by using wchar_t and iswprint for host
printability decisions.
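To make the second half concrete, here is a rough sketch of a host-side printability check built on wchar_t and iswprint. It is not the code from the patch: render_wide is a hypothetical helper, and the octal escape format is an assumption chosen only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper, not GDB code: render a host wide string into a
   narrow buffer.  Characters the host considers printable are converted
   with wcrtomb; everything else becomes a backslash-octal escape.
   Returns the number of bytes written, not counting the trailing NUL.  */
static size_t
render_wide (const wchar_t *ws, char *out, size_t outsz)
{
  size_t used = 0;
  mbstate_t state;

  assert (outsz > 0);
  memset (&state, 0, sizeof state);

  for (; *ws != L'\0'; ++ws)
    {
      char buf[MB_LEN_MAX + 16];
      size_t n = (size_t) -1;

      /* Ask the host locale whether this character is printable.  */
      if (iswprint ((wint_t) *ws))
        n = wcrtomb (buf, *ws, &state);

      if (n == (size_t) -1)
        {
          /* Either unprintable or not representable in the host narrow
             charset: reset the shift state and emit an escape.  */
          memset (&state, 0, sizeof state);
          n = (size_t) snprintf (buf, sizeof buf, "\\%03lo",
                                 (unsigned long) *ws);
        }

      /* Stop rather than overflow; always leave room for the NUL.  */
      if (used + n >= outsz)
        break;
      memcpy (out + used, buf, n);
      used += n;
    }

  out[used] = '\0';
  return used;
}
```

The result depends on the host locale. In the default "C" locale a tab is not printable, so it comes out as an escape while plain ASCII letters pass through unchanged.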
There may be portability issues for the use of wchar_t and iswprint.
I don't know. It would be helpful if someone with access to the more
exotic hosts out there could take a look.
Daniel> It seems like a dummy version of iconv_open which only succeeds if the
Daniel> two character sets are the same, plus a pass-through version of iconv,
Daniel> would be enough to remove the iconv dependency. That degraded mode
Daniel> covers all local debugging.
The wchar_t issue comes into play because we actually do two
conversions when printing: one from the target charset to the host
wchar_t, and then a second one from the host wchar_t to the host
"narrow" charset.
This just adds a wrinkle to the implementation, though -- the general
plan still applies. We could either pretend that wchar_t == char, or
we could make an iconv that uses the mb* functions.
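A minimal sketch of the degraded mode Daniel describes, taking the "pretend that wchar_t == char" route. The names phony_iconv_t, phony_iconv_open and phony_iconv are made up for this example and are not existing GDB or libiconv symbols. Comparing charset names with strcmp is a simplification (it ignores aliases and case), and the reset call that real iconv accepts with a null input buffer is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor type for the stand-in converter.  */
typedef int phony_iconv_t;

/* Succeed only when both charset names are identical, mirroring the
   failure convention of iconv_open: return -1 and set errno to EINVAL
   for an unsupported conversion.  */
static phony_iconv_t
phony_iconv_open (const char *to, const char *from)
{
  if (strcmp (to, from) != 0)
    {
      errno = EINVAL;
      return -1;
    }
  return 0;
}

/* Pass-through "conversion": copy as many bytes as fit and advance the
   caller's pointers and counters the way iconv does.  If the output
   buffer fills before the input is consumed, return (size_t) -1 with
   errno set to E2BIG.  */
static size_t
phony_iconv (phony_iconv_t cd, const char **inbuf, size_t *inleft,
             char **outbuf, size_t *outleft)
{
  size_t n = *inleft < *outleft ? *inleft : *outleft;

  (void) cd;
  memcpy (*outbuf, *inbuf, n);
  *inbuf += n;
  *inleft -= n;
  *outbuf += n;
  *outleft -= n;

  if (*inleft > 0)
    {
      errno = E2BIG;
      return (size_t) -1;
    }
  return 0;
}
```

With something like this, a caller written against the iconv calling convention keeps working when host and target charsets match, and gets a clean error from the open call otherwise.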
I can implement this, but I'd rather do it only if it is truly needed.
How are you planning to handle this for CodeSourcery? Really I would
like to hear the answer to this from anybody shipping a gdb
executable.
I suppose my recommendation would be to put GNU libiconv into your
local tree, with some configury tweaks to make it build a static
library. This does not seem very hard, though I suppose it is only
suitable if you are not too concerned about the resulting executable
size.
Another portability question is whether there is a platform that does
not have iconv at all. If every host we care about has some form of
iconv, even a bad one, perhaps we don't have to worry much -- users
could still have a functional-for-basic-native-debugging gdb.
Daniel> There'd need to be a little additional
Daniel> logic too, to allow you to set all the charset variables at
Daniel> once
I think "set charset" already does this. It doesn't handle the target
wide charset, but that seems ok in the degraded functionality mode.
Tom