This is the mail archive of the cygwin-apps mailing list for the Cygwin project.
Re: ITP: rxvt-unicode-X
- From: Charles Wilson <cygwin at cwilson dot fastmail dot fm>
- To: Mailing List: CygWin-Apps <cygwin-apps at cygwin dot com>
- Date: Thu, 11 May 2006 00:48:23 -0400
- Subject: Re: ITP: rxvt-unicode-X
- References: <441F9CCD.70400@cwilson.fastmail.fm> <200603211640.k2LGecHO013433@ns-srv-2.bln1.siemens.de> <1142978051.5960.257188645@webmail.messagingengine.com> <0ML25U-1FdvLA31wO-0008Nb@mrelayeu.kundenserver.de>
Thomas Wolff wrote:
Sorry for the very late response, but I've finally successfully
persuaded rxvt-unicode to actually support Unicode on cygwin,
and I'd like to suggest including that in the package.
That's great, thank you very much. I received your other emails and
will take a look as soon as possible. However, I'll let the brand new
(not even announced yet) rxvt-unicode-X package stay as-is for a while
to give folks a chance to try it out before incorporating any new
features/changes.
Some general remarks:
Depending on the application, Unicode may be triggered either
1) explicitly or
2) using the locale mechanism (which is bogus on cygwin).
It should be noted that the set of locale variables (LC_* and LANG)
is not identical to the locale mechanism, which needs additional
library support.
1) For example, xterm has an explicit command line option:
xterm -u8
which invokes xterm in UTF-8 mode. Additional configuration is
needed to use Unicode fonts. And LC_* variables are unfortunately
not set implicitly in this invocation mode which confuses many
applications.
My package mined includes a script uterm which invokes xterm in a
suitable mode, including font setup. Cygwin/X does include some
Unicode fonts, but apparently a very outdated version of them with
a very limited character range. I would offer to maintain a package
of Unicode X fonts if that helps.
2) Rxvt insists on locale configuration to provide desired encodings.
This means you would have to invoke rxvt like this:
LC_CTYPE=en_US.UTF-8 rxvt
or
LC_ALL=vi_VN rxvt
(Note: vi_VN is one of the UTF-8 locales that lack the usual
indication suffix.)
And rxvt would run in UTF-8 mode where the locale mechanism
works (which it doesn't on cygwin).
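For readers following along, the variable precedence at play here can be sketched in C. This is only an illustration of the usual POSIX rule (LC_ALL overrides LC_CTYPE, which overrides LANG), not code from rxvt-unicode or from the patch under discussion. The function names are made up for this example, and the suffix check deliberately does not recognise locales such as vi_VN that imply UTF-8 without saying so.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Pick the value that governs LC_CTYPE: LC_ALL, then LC_CTYPE, then LANG.
 * Unset (NULL) and empty values are skipped. Returns NULL if none applies.
 * (Hypothetical helper, named for this sketch only.) */
const char *pick_ctype_locale(const char *lc_all, const char *lc_ctype,
                              const char *lang)
{
    const char *candidates[3] = { lc_all, lc_ctype, lang };
    for (size_t i = 0; i < 3; i++)
        if (candidates[i] != NULL && candidates[i][0] != '\0')
            return candidates[i];
    return NULL;
}

/* Convenience wrapper that reads the real environment. */
const char *ctype_locale_from_env(void)
{
    return pick_ctype_locale(getenv("LC_ALL"), getenv("LC_CTYPE"),
                             getenv("LANG"));
}

/* Does the locale name carry an explicit UTF-8 codeset, e.g. ".UTF-8" or
 * ".utf8"? Case and '-' are ignored; an "@modifier" ends the codeset.
 * Assumption: suffix-less UTF-8 locales (vi_VN and friends) are NOT
 * detected here; that would need a table or a working setlocale(). */
int locale_name_says_utf8(const char *name)
{
    if (name == NULL)
        return 0;
    const char *dot = strchr(name, '.');
    if (dot == NULL)
        return 0;
    const char *want = "utf8";
    for (const char *p = dot + 1; *p != '\0' && *p != '@'; p++) {
        if (*p == '-')
            continue;
        if (*want == '\0' || tolower((unsigned char)*p) != *want)
            return 0;
        want++;
    }
    return *want == '\0';
}
```

With LC_CTYPE=en_US.UTF-8 and nothing else set, the first helper returns "en_US.UTF-8" and the second reports UTF-8. With LC_ALL=vi_VN the name wins the precedence race but the suffix test says no, which is exactly the case the note about vi_VN warns about.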
So, you're saying that rxvt-unicode doesn't have an explicit switch, but
relies on pre-existing env vars. This is good, because the apps one
runs IN the terminal will need those env vars too, something a command
line switch won't set for you properly anyway.
BUT...
The reason why I couldn't trick out rxvt before by just setting the
variables was that it also depends on the wide character library
functions which in turn depend on a working locale mechanism.
If the wide char library functions don't exist, then rxvt ignores the LC
vars anyway. Gotcha.
I have now replaced those functions (well, the subset of them needed
by rxvt) with substitutes that either operate in UTF-8 mode, or
delegate to the system functions, depending on the setting of the
locale variables, and it works.
Shims -- that's a reasonable approach. (I'd prefer if unicode/locale
support were added to cygwin's version of newlib but that might be
Augean Stables-level of effort.) OTOH, I *really* prefer
things-that-work, sooner rather than later -- so this is good.
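To make the shim idea concrete, here is a rough sketch of what one such substitute might look like. It is not Thomas's actual code (which I am not reproducing here); the names utf8_decode and shim_decode are invented for illustration, and the result is stored in a uint32_t rather than wchar_t on the assumption that wchar_t may be too narrow for characters outside the BMP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Decode one UTF-8 sequence from s (at most n bytes) into *out.
 * Returns the number of bytes consumed (1..4), or -1 if the input is
 * invalid or truncated. */
int utf8_decode(uint32_t *out, const unsigned char *s, size_t n)
{
    /* Smallest code point that legitimately needs each sequence length. */
    static const uint32_t min_for_len[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    uint32_t cp;
    int len;

    assert(out != NULL && s != NULL);
    if (n == 0)
        return -1;

    if (s[0] < 0x80)                { cp = s[0];        len = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; len = 4; }
    else
        return -1;                  /* stray continuation or bad lead byte */

    if ((size_t)len > n)
        return -1;                  /* truncated sequence */
    for (int i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;              /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min_for_len[len])
        return -1;                  /* overlong encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return -1;                  /* UTF-16 surrogate range */
    if (cp > 0x10FFFF)
        return -1;                  /* beyond Unicode */
    *out = cp;
    return len;
}

/* Hypothetical shim in the spirit of mbtowc(): decode UTF-8 ourselves when
 * utf8_mode is set, otherwise delegate to the system function. */
int shim_decode(uint32_t *out, const char *s, size_t n, int utf8_mode)
{
    if (utf8_mode)
        return utf8_decode(out, (const unsigned char *)s, n);

    wchar_t wc;
    int r = mbtowc(&wc, s, n);
    if (r < 0)
        return -1;
    *out = (uint32_t)wc;
    return r == 0 ? 1 : r;          /* mbtowc returns 0 for the NUL byte */
}
```

The point of the two-way split is that nothing changes for people on a platform where the locale machinery works: with utf8_mode off, the system mbtowc() still does all the work.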
At least it does so for display,
although it suppresses 8-bit input for some obscure reason still to be
found.
I'm just guessing, but this could be related to the configure settings
in my build script, if that's what you were using:
--enable-shared --enable-utmp --enable-wtmp --enable-lastlog \
--enable-xft --enable-font-styles --disable-xim --enable-combining \
--enable-fallback=Rxvt --with-res-name=urxvt --with-res-class=URxvt \
--program-suffix=-X \
--enable-xpm-background --enable-menubar --enable-rxvt-scroll \
--enable-next-scroll --enable-xterm-scroll --enable-plain-scroll \
--enable-transparency --enable-tinting --enable-fading \
--enable-frills --enable-smart-resize --enable-pointer-blank \
--enable-mousewheel --enable-slipwheeling --enable-keepscrolling \
--enable-old-selection --disable-perl \
--with-xpm-includes=/usr/X11R6/include \
--with-xpm-library=/usr/X11R6/lib \
--x-libraries=/usr/X11R6/lib
Note: --disable-xim as well as not specifying --enable-8bitctrls
Now, the latter is "not recommended" and its only effect is the
following block of code in the input-processing loop:
#ifdef EIGHT_BIT_CONTROLS
// 8-bit controls
case 0x90: /* DCS */
process_dcs_seq ();
break;
case 0x9b: /* CSI */
process_csi_seq ();
break;
case 0x9d: /* OSC */
process_osc_seq ();
break;
#endif
So, I don't think that's it.
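For anyone unfamiliar with what those three cases match: each C1 control byte in the range 0x80..0x9F is the single-byte form of a two-byte sequence, ESC followed by the byte minus 0x40. A small sketch of that mapping, with a function name invented for this example:

```c
#include <assert.h>
#include <stddef.h>

#define ESC 0x1B

/* Write the 7-bit equivalent of a C1 control (0x80..0x9F) into out:
 * ESC followed by (c1 - 0x40). Returns 2 on success, or 0 if c1 is not
 * a C1 control. (c1_to_7bit is a hypothetical name, not an rxvt function.) */
size_t c1_to_7bit(unsigned char c1, unsigned char out[2])
{
    assert(out != NULL);
    if (c1 < 0x80 || c1 > 0x9F)
        return 0;
    out[0] = ESC;
    out[1] = (unsigned char)(c1 - 0x40);
    return 2;
}
```

So 0x90 (DCS) corresponds to ESC P, 0x9b (CSI) to ESC [, and 0x9d (OSC) to ESC ]. The 7-bit forms are handled regardless of this option, which is why leaving it off only affects programs that emit the single-byte forms.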
=====
While 8bit input != xim, there are two things I've discovered about the
rxvt-unicode sourcecode:
(1) very little testing is done in non-default configurations (and
--enable-xim is the default)
(2) some #define macros turn on/turn off more than their simple names
and descriptions might suggest -- and the code often makes unwarranted
assumptions (e.g. see earlier thread about an unwarranted linkage
between transparency and XPM support)
So, it's possible that --disable-xim turns off some non-XIM input
support needed for 8bit entry.
Try: --enable-xim.
=====
Also, try the iso14755 support (CTRL-SHFT-key). Maybe that helps?
=====
Finally, input is a cooperative affair between the terminal, the shell,
and for X11 terminals, the Xserver. In the case of bash, that also
includes readline. How's your ~/.inputrc set up?
# don't strip characters to 7 bits when reading
set input-meta on
# allow iso-latin1 characters to be inserted rather
# than converted to prefix-meta sequences
set convert-meta off
# display characters with the eighth bit set directly
# rather than as meta-prefixed characters
set output-meta on
Also, are you sure that the "meta" key is what you think it is? You can
force it by using the -mod cmdline option of rxvt-unicode (see the
urxvt manpage). I think the cygwin Xserver defaults to using Alt.
And then, there's the -meta8 cmdline option to rxvt-unicode:
meta8: boolean
True: handle Meta (Alt) + keypress to set the 8th bit.
False: handle Meta (Alt) + keypress as an escape prefix
[False is default].
Maybe you want True?
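To spell out the difference between the two settings, here is a toy encoder for a Meta+key press. It only illustrates the behaviour the manpage text describes; encode_meta_key is a made-up name and this is not how urxvt structures its key handling.

```c
#include <assert.h>
#include <stddef.h>

/* Encode Meta+key for a 7-bit key code.
 * meta8 true:  one byte, the key with its 8th bit set.
 * meta8 false: two bytes, ESC followed by the key.
 * Returns the number of bytes written to out. */
size_t encode_meta_key(unsigned char key, int meta8, unsigned char out[2])
{
    assert(out != NULL && key < 0x80);
    if (meta8) {
        out[0] = (unsigned char)(key | 0x80);
        return 1;
    }
    out[0] = 0x1B;   /* ESC prefix */
    out[1] = key;
    return 2;
}
```

Meta+a comes out as the single byte 0xE1 with meta8 on, and as ESC a with it off. Note that 0xE1 is a-acute in ISO-8859-1 but is not a complete UTF-8 sequence on its own, so which setting is useful depends on what the shell and readline expect to receive.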
I will send the files to you (Charles Wilson) directly and would
appreciate if you confirm the solution.
Quick perusal looks pretty good. I like the caching of is_u_utf8_mode,
but you should watch out: --enable-frills turns on
'locale switching escape sequence'
so you might need to add a hook in that handler to "un-cache".
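Roughly what I mean by a hook, as a sketch only. I have not looked at how the cached flag is stored in the patch, so everything here (active_locale, cached_utf8_mode, on_locale_switch, the strstr probe) is a stand-in chosen for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char active_locale[64] = "C"; /* stand-in for the terminal's locale */
static int cached_mode = -1;         /* -1 means "not computed yet" */
int probe_calls = 0;                 /* only here to show the cache working */

/* The expensive question we want to ask as rarely as possible. */
static int probe_utf8_mode(void)
{
    probe_calls++;
    return strstr(active_locale, "UTF-8") != NULL;
}

/* Cached accessor: probes once, then reuses the answer. */
int cached_utf8_mode(void)
{
    if (cached_mode < 0)
        cached_mode = probe_utf8_mode();
    return cached_mode;
}

/* Hook to call from whatever handles the locale-switching escape
 * sequence: record the new locale and drop the cached answer. */
void on_locale_switch(const char *new_locale)
{
    assert(new_locale != NULL);
    snprintf(active_locale, sizeof active_locale, "%s", new_locale);
    cached_mode = -1;
}
```

Without the reset in on_locale_switch, the accessor would keep returning the answer computed for the old locale.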
--
Chuck