This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [PATCH/WIP] C/C++ wchar_t/Unicode printing support


>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:

>> I suppose one option would be to have a degraded mode where we require
>> that the host charset and the target charset be the same.  Then maybe
>> we could make it work by redefining iswprint and wchar_t.

Daniel> I don't see the connection between the iconv dependency and
Daniel> iswprint / wchar_t.  Are there portability issues for those
Daniel> too?  They don't come from libiconv.

Yeah, there isn't a direct connection.

Basically there are two problems to solve.

One is having a way to convert from a target charset of some kind to a
host charset of some kind.

The other issue is deciding how to print things on the host.  We want
to use a host wide character of some sort, so that we can print a
larger subset of characters on a capable terminal.  This also lets us
handle "set print repeats", on some platforms anyway, without needing
details about a possible host-side variable-length encoding.

I chose to solve the first problem by using iconv for all the
conversions, and the second by using wchar_t and iswprint for host
printability decisions.
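
To illustrate the second half of that choice, here is a minimal sketch of a host-side printability decision made with iswprint. The helper name format_host_wchar and the "\x" escape format are assumptions for illustration, not what the patch does; the result also depends on the current locale.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper, not taken from the patch.  Write WC into OUT
   as-is if the host locale considers it printable, otherwise write a
   hex escape.  Returns the number of wide characters written, or a
   negative value if OUT is too small.  */
int
format_host_wchar (wchar_t wc, wchar_t *out, size_t out_len)
{
  assert (out != NULL);

  if (iswprint ((wint_t) wc))
    return swprintf (out, out_len, L"%lc", (wint_t) wc);

  /* Not printable on this host: fall back to an escape sequence.  */
  return swprintf (out, out_len, L"\\x%lx", (unsigned long) wc);
}
```

In the default "C" locale, a call with L'a' stores the character itself, while a control character such as (wchar_t) 1 is stored as an escape.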

There may be portability issues for the use of wchar_t and iswprint.
I don't know.  It would be helpful if someone with access to the more
exotic hosts out there could take a look.

Daniel> It seems like a dummy version of iconv_open which only succeeds if the
Daniel> two character sets are the same, plus a pass-through version of iconv,
Daniel> would be enough to remove the iconv dependency.  That degraded mode
Daniel> covers all local debugging.

The wchar_t issue comes into play because we actually do two
conversions when printing: one from the target charset to the host
wchar_t, and then a second one from the host wchar_t to the host
"narrow" charset.

This just adds a wrinkle to the implementation, though -- the general
plan still applies.  We could either pretend that wchar_t == char, or
we could make an iconv that uses the mb* functions.
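
As a rough sketch of the second option, the two conversions can be written with the standard mb* functions alone. This assumes the target charset is the same as the host multibyte charset; the function name two_step_convert and the fixed-size intermediate buffer are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper, not taken from the patch.  Assumes the target
   charset equals the host multibyte charset, so the mb* functions can
   stand in for iconv.  Returns 0 on success, -1 if either conversion
   fails or a buffer is too small to hold a terminated result.  */
int
two_step_convert (const char *src, char *dst, size_t dst_len)
{
  wchar_t wide[256];
  const size_t wide_len = sizeof wide / sizeof wide[0];

  assert (src != NULL && dst != NULL);

  /* Step 1: "target" bytes to host wide characters.  */
  size_t nwide = mbstowcs (wide, src, wide_len);
  if (nwide == (size_t) -1 || nwide == wide_len)
    return -1;

  /* Step 2: host wide characters to the host narrow charset.  */
  size_t nbytes = wcstombs (dst, wide, dst_len);
  if (nbytes == (size_t) -1 || nbytes == dst_len)
    return -1;

  return 0;
}
```

The printability check on each wchar_t would sit between the two steps.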

I can implement this, but I'd rather do it only if it is truly needed.
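
For reference, the dummy iconv_open plus pass-through iconv that Daniel describes might look roughly like the sketch below. The names dummy_iconv_open and dummy_iconv and the descriptor type are hypothetical, and the exact string comparison is a simplification (real charset names are usually matched case-insensitively and may have aliases).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; not GDB or libiconv
   code.  */
typedef int dummy_iconv_t;

/* Succeed only when both charset names are identical.  */
dummy_iconv_t
dummy_iconv_open (const char *to, const char *from)
{
  assert (to != NULL && from != NULL);
  if (strcmp (to, from) != 0)
    {
      errno = EINVAL;
      return -1;
    }
  return 0;
}

/* Pass-through conversion with iconv-like argument conventions: copy
   as many bytes as fit, advance both pointers, and report E2BIG if
   the output buffer fills before the input is consumed.  */
size_t
dummy_iconv (dummy_iconv_t cd, const char **inbuf, size_t *inbytesleft,
             char **outbuf, size_t *outbytesleft)
{
  (void) cd;
  size_t n = *inbytesleft < *outbytesleft ? *inbytesleft : *outbytesleft;

  memcpy (*outbuf, *inbuf, n);
  *inbuf += n;
  *inbytesleft -= n;
  *outbuf += n;
  *outbytesleft -= n;

  if (*inbytesleft > 0)
    {
      errno = E2BIG;
      return (size_t) -1;
    }
  return 0;
}
```

This covers only the narrow-to-narrow case; the wchar_t wrinkle described above would still need one of the two approaches mentioned.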

How are you planning to handle this for Code Sourcery?  Really I would
like to hear the answer to this from anybody shipping a gdb
executable.

I suppose my recommendation would be to put GNU libiconv into your
local tree, with some configury tweaks to make it build a static
library.  This does not seem very hard, though I suppose it is only
suitable if you are not too concerned about the resulting executable
size.

Another portability question is whether there is a platform that does
not have iconv at all.  If every host we care about has some form of
iconv, even a bad one, perhaps we don't have to worry much -- users
could still have a functional-for-basic-native-debugging gdb.

Daniel> There'd need to be a little additional
Daniel> logic too, to allow you to set all the charset variables at
Daniel> once

I think "set charset" already does this.  It doesn't handle the target
wide charset, but that seems ok in the degraded functionality mode.

Tom

