This is the mail archive of the mailing list for the GDB project.


Re: printing wchar_t*

On Friday 14 April 2006 17:59, Eli Zaretskii wrote:

> > In my original post, I asked whether gdb can print wchar_t just as a raw
> > sequence of values, like this:
> >
> >     0x56, 0x1456
> The answer is YES.  Use array notation, and add a feature to report
> the length of a wchar_t array.
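
To make the requested output format concrete, here is a minimal sketch that formats a NUL-terminated buffer of wide-character units as raw hex values. It is not gdb code: the function name dump_wide_units and its parameters are hypothetical, and the unit size and byte order are assumptions the caller must supply, since wchar_t differs between platforms.

```python
import struct


def dump_wide_units(raw: bytes, unit_size: int = 2, byteorder: str = "little") -> str:
    """Format a buffer of fixed-size units as comma-separated hex values.

    Stops at the first zero unit, mimicking a NUL-terminated wchar_t string.
    unit_size must be 2 or 4 (an assumption about the target's wchar_t).
    """
    fmt_char = {2: "H", 4: "I"}[unit_size]
    prefix = "<" if byteorder == "little" else ">"
    count = len(raw) // unit_size
    units = struct.unpack(f"{prefix}{count}{fmt_char}", raw[: count * unit_size])

    shown = []
    for unit in units:
        if unit == 0:  # terminator reached
            break
        shown.append(f"0x{unit:x}")
    return ", ".join(shown)


if __name__ == "__main__":
    # Two 16-bit little-endian units followed by a terminator.
    print(dump_wide_units(b"\x56\x00\x56\x14\x00\x00"))
```

No interpretation of the values as characters is attempted here; the units are shown exactly as stored.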


> Now, the same letter ``small a'' can be encoded in several other ways:
> for example, its ISO-2022-7bit encoding is 0x1B 0x24 0x2C 0x31 0x28
> 0x50, its KOI8-r encoding is 0xC1, its ISO-8859-5 encoding is 0xD0,
> etc.  It should be obvious that, of all the encodings, only the
> fixed-length ones can be used in a wchar_t array (because wchar_t
> arrays are stateless, 

I don't think this statement is backed up by anything.
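
For reference, the single-byte values quoted above can be checked with Python's standard codecs. This is only a sketch, and it assumes that the "small a" in question is CYRILLIC SMALL LETTER A (U+0430), which is what the KOI8-R and ISO-8859-5 values suggest. The ISO-2022 sequence is not checked here.

```python
# Assumption: "small a" means U+0430, CYRILLIC SMALL LETTER A.
SMALL_A = "\u0430"


def encoded_bytes(text: str, encoding: str) -> list:
    """Return the bytes of `text` in `encoding` as a list of integers."""
    return list(text.encode(encoding))


if __name__ == "__main__":
    for name in ("koi8_r", "iso8859_5", "utf-8", "utf-16-le"):
        values = ", ".join(f"0x{b:02X}" for b in encoded_bytes(SMALL_A, name))
        print(f"{name}: {values}")
```

The same character yields different byte sequences, of different lengths, depending on the encoding chosen.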

> This is why I said that wchar_t is not used for an encoding (such as
> ISO-8859-5 or UTF-8 or UTF-16), but for characters' codepoints.  It is
> nowadays almost universally accepted that wchar_t is a Unicode
> codepoint, 

Again, can you provide any specific pointers to support that view?

> the only difference between applications being whether only 
> the first 64K characters (the so-called BMP) are supported by 16-bit
> wchar_t, or the entire 21-bit range is supported by a 32-bit wchar_t.

I believe that on Windows:

- wchar_t is 16-bit
- wchar_t* values are supposed to be in UTF-16 encoding

Do you disagree with either of the above statements? If not, then it directly
follows that a given wchar_t is not a Unicode code point, but a code unit in
a specific representation (UTF-16), and that a given code point takes either
one or two code units, that is, either one or two wchar_t values. This
contradicts your statement that a wchar_t is a single code point.
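
The one-or-two code units point can be illustrated with a short sketch of the standard UTF-16 encoding rule. The function name utf16_code_units is hypothetical, and the sample code points are just examples picked for illustration.

```python
from typing import List


def utf16_code_units(code_point: int) -> List[int]:
    """Split one Unicode code point into its UTF-16 code units."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values are not encodable code points")

    # Code points in the BMP fit in a single 16-bit unit.
    if code_point < 0x10000:
        return [code_point]

    # Everything else is split into a high and a low surrogate.
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


if __name__ == "__main__":
    for cp in (0x0430, 0x1D11E):
        units = ", ".join(f"0x{u:04X}" for u in utf16_code_units(cp))
        print(f"U+{cp:04X} -> {units}")
```

With a 16-bit wchar_t, U+0430 occupies one array element, while U+1D11E occupies two (0xD834, 0xDD1E), so the number of elements need not equal the number of code points.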

Anyway, this is quickly getting off-topic for the gdb list, so maybe we should
take it somewhere else.

- Volodya
