This is the mail archive of the mailing list for the GDB project.

Re: printing wchar_t*

On Friday 14 April 2006 18:29, Eli Zaretskii wrote:
> > Date: Fri, 14 Apr 2006 09:43:01 -0400
> > From: Paul Koning <>
> > Cc:,
> >
> > If you have 16 bit wide chars, it seems possible that those might
> > contain UTF-16 encoding of full (beyond BMP) Unicode characters.
> You could use wchar_t arrays for that, but then not every array
> element will be a full character, and you will not be able to access
> individual characters by their positional index.

So what? Even if wchar_t is 32 bits wide, the element at position 'i' can be
a combining character that modifies another character and is of little use by
itself.
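
To illustrate the point, here is a minimal sketch, assuming 32-bit code
points stored in a uint32_t array. The names is_combining and
count_base_characters are hypothetical, and the range check is a deliberate
simplification: it covers only the Combining Diacritical Marks block
(U+0300..U+036F), not every combining character in Unicode.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified test: true only for the Combining Diacritical Marks block
   (U+0300..U+036F).  A real implementation would consult the Unicode
   character database instead.  */
static int
is_combining (uint32_t cp)
{
  return cp >= 0x0300 && cp <= 0x036F;
}

/* Count the elements of CPS that are not combining marks.  */
static size_t
count_base_characters (const uint32_t *cps, size_t len)
{
  size_t n = 0;

  for (size_t i = 0; i < len; i++)
    if (!is_combining (cps[i]))
      n++;
  return n;
}
```

For the three-element array { 0x0065, 0x0301, 0x0078 } ("e", combining acute
accent, "x") the element at index 1 is not a character a user would recognize
on its own, even though every element is a full 32-bit code point.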

> In other words, in this case each element of the wchar_t array is no
> longer a ``wide character'', but one of the few shorts that encode a
> character.
> If we want to support wchar_t arrays that store UTF-16, we will need
> to add a feature to GDB to convert UTF-16 to the full UCS-4
> codepoints, and output those.  

That's what I mentioned in a reply to Jim: since the current string-printing
code operates "one wchar_t at a time", it's not suitable for outputting
UTF-16-encoded wchar_t values to the user.
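
For reference, a rough sketch of the kind of conversion this would need,
assuming the 16-bit units are available as a uint16_t array. This is not GDB
code; utf16_next and utf16_to_ucs4 are hypothetical names, and replacing an
unpaired surrogate with U+FFFD is just one possible policy.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one code point from UNITS (LEN units long) starting at *POS,
   and advance *POS past the units consumed.  An unpaired surrogate is
   returned as U+FFFD.  */
static uint32_t
utf16_next (const uint16_t *units, size_t len, size_t *pos)
{
  uint16_t hi = units[(*pos)++];

  if (hi >= 0xD800 && hi <= 0xDBFF && *pos < len)
    {
      uint16_t lo = units[*pos];

      if (lo >= 0xDC00 && lo <= 0xDFFF)
        {
          /* Valid surrogate pair: combine the two 10-bit halves.  */
          (*pos)++;
          return 0x10000 + (((uint32_t) (hi - 0xD800) << 10)
                            | (uint32_t) (lo - 0xDC00));
        }
    }
  if (hi >= 0xD800 && hi <= 0xDFFF)
    return 0xFFFD;		/* Unpaired surrogate.  */
  return hi;
}

/* Convert LEN UTF-16 units to UCS-4 code points in OUT, which must have
   room for LEN elements.  Return the number of code points written.  */
static size_t
utf16_to_ucs4 (const uint16_t *units, size_t len, uint32_t *out)
{
  size_t pos = 0, n = 0;

  while (pos < len)
    out[n++] = utf16_next (units, len, &pos);
  return n;
}
```

The printing code would have to consume the array through something like
utf16_next rather than index it element by element, because a character
beyond the BMP occupies two array elements.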

> Alternatively, the FE will have to 
> support display of UTF-16 encoded characters.

Speaking of the FE, handling UTF-16 there is trivial, so printing just the
wchar_t values will be sufficient. Some work may be necessary only if we want
to properly show UTF-16 strings to a user of console gdb.

- Volodya
