This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.



[Bug localedata/22073] charmaps/UTF-8: wcwidth of U+00AD (soft hyphen): 0 or 1 ?


https://sourceware.org/bugzilla/show_bug.cgi?id=22073

--- Comment #6 from Thorsten Glaser <tg at mirbsd dot de> ---
Unicode does NOT define the column width of a char in the terminal. This shows
in all those mailing list threads, in which they basically assume all fonts to
be proportional.

wcwidth(), however, basically *is* the column width of a char in the terminal in
a fixed-width cell layout.
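
To make the "column width in a fixed-width cell layout" reading concrete, here is a minimal sketch that sums wcwidth() over a wide string to get the number of terminal cells it would occupy. The helper name cells_needed() is made up for illustration; the result for anything beyond printable ASCII depends on the active locale and the libc's width tables.

```c
#define _XOPEN_SOURCE 700   /* exposes wcwidth() in <wchar.h> */
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: number of terminal cells occupied by s,
 * or -1 if any character is non-printable in the current locale. */
int cells_needed(const wchar_t *s)
{
    int total = 0;

    for (size_t i = 0; s[i] != L'\0'; i++) {
        int w = wcwidth(s[i]);   /* 0, 1, 2, or -1 per character */
        if (w < 0)
            return -1;
        total += w;
    }
    return total;
}
```

A caller that wants non-ASCII results would first call setlocale(LC_CTYPE, "") with a UTF-8 locale; what cells_needed() then reports for a string containing U+00AD is exactly the value this bug is about.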

The cōnsēnsus seems to be to ask _users_ to avoid U+00AD because of its two
different historical interpretations, and to use something else for each
purpose. That still leaves us needing a definition for this char *should* it
appear anywhere.

I’m arguing for 1 because:

• 0 is for combining characters and NUL only
• the “possible soft hyphen” reading of U+00AD is not a combining character
• compatibility with previous/older/other wcwidth() implementations, most
importantly
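
For illustration only, an application that wants the width-1 behaviour regardless of what its libc returns could wrap wcwidth() as sketched below. The name wcwidth_shy1() is hypothetical and this is not glibc code; it only shows the policy argued for here, with every other code point deferred to the system tables.

```c
#define _XOPEN_SOURCE 700   /* exposes wcwidth() in <wchar.h> */
#include <wchar.h>

/* Hypothetical wrapper: force U+00AD (soft hyphen) to occupy one cell
 * and defer to the system wcwidth() for everything else. */
int wcwidth_shy1(wchar_t wc)
{
    if (wc == 0x00AD)
        return 1;            /* assumption: soft hyphen takes a cell */
    return wcwidth(wc);      /* NUL still yields 0 here */
}
```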

The 0 fraction should not be at a loss here because:

• The char should be avoided already *anyway*
• Terminal emulators never implement wrapping at a “possible soft hyphen”, only
at the end of the line
• Unicode data is still available elsewhere; this bug report is precisely about
wcwidth(), which only “almost” aligns with the various Unicode datas (yes, I
know, wrong plural, but I can’t think of anything better to express what I
mean, right now)

