This is the mail archive of the cygwin-apps mailing list for the Cygwin project.



Re: ITP: rxvt-unicode-X


Thomas Wolff wrote:
Sorry for the very late response, but I've finally successfully persuaded rxvt-unicode to actually support Unicode on cygwin, and I'd like to suggest including that in the package.

That's great, thank you very much. I received your other emails and will take a look as soon as possible. However, I'll let the brand new (not even announced yet) rxvt-unicode-X package stay as-is for a while to give folks a chance to try it out before incorporating any new features/changes.


Some general remarks:
Depending on the application, Unicode may be triggered either
1) explicitly or
2) using the locale mechanism (which is bogus on cygwin).
It should be noted that the set of locale variables (LC_* and LANG) is not identical to the locale mechanism, which needs additional library support.


1) For example, xterm has an explicit command line option:
xterm -u8
which invokes xterm in UTF-8 mode. Additional configuration is needed to use Unicode fonts. Unfortunately, the LC_* variables are not set implicitly in this invocation mode, which confuses many applications.


My package mined includes a script uterm which invokes xterm in a suitable mode, including font setup. Cygwin/X does include some Unicode fonts, but apparently a very outdated version of them with a very limited character range. I would offer to maintain a package of Unicode X fonts if that helps.

2) Rxvt insists on locale configuration to provide desired encodings.
This means you would have to invoke rxvt like this:
LC_CTYPE=en_US.UTF-8 rxvt
or
LC_ALL=vi_VN rxvt
(Note: vi_VN is one of the UTF-8 locales that lack the usual indication suffix.)
rxvt would then run in UTF-8 mode on systems where the locale mechanism works (which it doesn't on cygwin).

So, you're saying that rxvt-unicode doesn't have an explicit switch, but relies on pre-existing env vars. This is good, because the apps one runs IN the terminal will need those env vars too, something a command line switch won't set for you properly anyway.
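
To make the "relies on env vars" point concrete, here is a minimal C sketch of how a program might decide from the environment alone whether UTF-8 is requested. It follows the POSIX precedence LC_ALL, then LC_CTYPE, then LANG. The function names are hypothetical and this is not taken from rxvt-unicode. It only recognises an explicit codeset suffix, so a suffix-less UTF-8 locale such as vi_VN would not be detected this way.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Effective LC_CTYPE setting from the environment, using the POSIX
 * precedence LC_ALL > LC_CTYPE > LANG. Empty values count as unset. */
const char *effective_ctype_locale(void)
{
    static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };
    for (size_t i = 0; i < sizeof vars / sizeof vars[0]; i++) {
        const char *v = getenv(vars[i]);
        if (v != NULL && *v != '\0')
            return v;
    }
    return "C";
}

/* Hypothetical helper: nonzero if the locale name carries a UTF-8 codeset
 * suffix (".UTF-8", ".utf8", any case). A trailing "@modifier" is ignored.
 * Names without a codeset suffix (e.g. vi_VN) are NOT recognised. */
int locale_name_is_utf8(const char *name)
{
    const char *p = strchr(name, '.');
    const char *want = "utf8";

    if (p == NULL)
        return 0;
    for (p++; *p != '\0' && *p != '@'; p++) {
        if (*p == '-')
            continue;                       /* treat "UTF-8" like "utf8" */
        if (*want == '\0' || tolower((unsigned char)*p) != *want)
            return 0;
        want++;
    }
    return *want == '\0';
}
```

A caller would combine the two as locale_name_is_utf8(effective_ctype_locale()).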


BUT...

The reason why I couldn't trick out rxvt before by just setting the variables was that it also depends on the wide character library functions which in turn depend on a working locale mechanism.

If the wide char library functions don't exist, then rxvt ignores the LC vars anyway. Gotcha.


I have now replaced those functions (well, the subset of them needed by rxvt) with substitutes that either operate in UTF-8 mode, or delegate to the system functions, depending on the setting of the locale variables, and it works.

Shims -- that's a reasonable approach. (I'd prefer if unicode/locale support were added to cygwin's version of newlib but that might be Augean Stables-level of effort.) OTOH, I *really* prefer things-that-work, sooner rather than later -- so this is good.
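
For readers following along, a shim of this kind might look roughly like the sketch below. This is only an illustration of the idea (decode UTF-8 by hand when in UTF-8 mode, otherwise delegate to the system function), not the actual patch. The name shim_mbtowc and the explicit utf8_mode parameter are assumptions made for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical substitute for mbtowc(). With utf8_mode nonzero it decodes
 * one UTF-8 sequence by hand; otherwise it delegates to the system mbtowc().
 * Returns the number of bytes consumed, 0 for a NUL character, and -1 for
 * invalid or incomplete input.
 * Caveat: where wchar_t is 16 bits wide, code points above U+FFFF do not
 * fit and would need surrogate handling, which this sketch omits. */
int shim_mbtowc(wchar_t *pwc, const char *s, size_t n, int utf8_mode)
{
    const unsigned char *p = (const unsigned char *)s;
    unsigned long cp;
    int len;

    if (!utf8_mode)
        return mbtowc(pwc, s, n);
    if (s == NULL)
        return 0;                           /* UTF-8 is stateless */
    if (n == 0)
        return -1;

    if (p[0] < 0x80)                        { cp = p[0];        len = 1; }
    else if (p[0] >= 0xC2 && p[0] <= 0xDF)  { cp = p[0] & 0x1F; len = 2; }
    else if ((p[0] & 0xF0) == 0xE0)         { cp = p[0] & 0x0F; len = 3; }
    else if (p[0] >= 0xF0 && p[0] <= 0xF4)  { cp = p[0] & 0x07; len = 4; }
    else
        return -1;                          /* stray continuation or bad lead byte */

    if ((size_t)len > n)
        return -1;                          /* sequence is truncated */
    for (int i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return -1;                      /* not a continuation byte */
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values beyond U+10FFFF. */
    if ((len == 3 && cp < 0x800) || (len == 4 && cp < 0x10000) ||
        (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return -1;

    if (pwc != NULL)
        *pwc = (wchar_t)cp;
    return cp == 0 ? 0 : len;
}
```

Passing the mode in as a parameter keeps the sketch testable; a real shim would more likely consult a cached flag derived from the locale variables.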


At least it does so for display, although it suppresses 8-bit input for some obscure reason still to be found.

I'm just guessing, but this could be related to the configure settings in my build script, if that's what you were using:


--enable-shared --enable-utmp --enable-wtmp --enable-lastlog \
--enable-xft --enable-font-styles --disable-xim --enable-combining \
--enable-fallback=Rxvt --with-res-name=urxvt --with-res-class=URxvt \
--program-suffix=-X \
--enable-xpm-background --enable-menubar --enable-rxvt-scroll \
--enable-next-scroll --enable-xterm-scroll --enable-plain-scroll \
--enable-transparency --enable-tinting --enable-fading \
--enable-frills --enable-smart-resize --enable-pointer-blank \
--enable-mousewheel --enable-slipwheeling --enable-keepscrolling \
--enable-old-selection --disable-perl \
--with-xpm-includes=/usr/X11R6/include --with-xpm-library=/usr/X11R6/lib \
--x-libraries=/usr/X11R6/lib



Note: this uses --disable-xim and does not specify --enable-8bitctrls.


Now, the latter is "not recommended" and its only effect is the following block of code in the input-processing loop:

#ifdef EIGHT_BIT_CONTROLS
      // 8-bit controls
      case 0x90:        /* DCS */
        process_dcs_seq ();
        break;
      case 0x9b:        /* CSI */
        process_csi_seq ();
        break;
      case 0x9d:        /* OSC */
        process_osc_seq ();
        break;
#endif

So, I don't think that's it.
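
To illustrate why that block concerns control sequences only and not printable 8-bit input: under ECMA-48, each C1 control byte (0x80..0x9F) is equivalent to the two-byte 7-bit form ESC followed by the byte minus 0x40. The small sketch below computes that mapping. The function name is hypothetical and the code is not from rxvt-unicode.

```c
/* For a C1 control byte (0x80..0x9F), return the final byte F of the
 * equivalent 7-bit sequence "ESC F" as defined by ECMA-48.
 * Returns 0 if c is not a C1 control. */
unsigned char c1_to_esc_final(unsigned char c)
{
    if (c < 0x80 || c > 0x9F)
        return 0;
    return (unsigned char)(c - 0x40);
}
```

With this mapping, 0x90 (DCS) corresponds to ESC P, 0x9b (CSI) to ESC [, and 0x9d (OSC) to ESC ], which are exactly the three cases handled above.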

=====

While 8bit input != xim, there are two things I've discovered about the rxvt-unicode source code:
(1) very little testing is done in non-default configurations (and --enable-xim is the default)
(2) some #define macros turn on/turn off more than their simple names and descriptions might suggest -- and the code often makes unwarranted assumptions (e.g. see earlier thread about an unwarranted linkage between transparency and XPM support)


So, it's possible that --disable-xim turns off some non-XIM input support needed for 8bit entry.

Try: --enable-xim.
=====

Also, try the iso14755 support (Ctrl-Shift-key). Maybe that helps?

=====

Finally, input is a cooperative affair between the terminal, the shell, and for X11 terminals, the Xserver. In the case of bash, that also includes readline. How's your ~/.inputrc set up?

     # don't strip characters to 7 bits when reading
     set input-meta on

     # allow iso-latin1 characters to be inserted rather
     # than converted to prefix-meta sequences
     set convert-meta off

     # display characters with the eighth bit set directly
     # rather than as meta-prefixed characters
     set output-meta on

Also, are you sure that the "meta" key is what you think it is? You can force it by using the -mod cmdline option of rxvt-unicode (see the urxvt manpage). I think the cygwin Xserver defaults to using Alt.

And then, there's the -meta8 cmdline option to rxvt-unicode:

     meta8: boolean
          True: handle Meta (Alt) + keypress to set the 8th bit.
          False: handle Meta (Alt) + keypress as an escape prefix
           [False is default].

Maybe you want True?
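
In case the difference between the two settings isn't obvious, here is a small C sketch of what each one sends to the application for Meta+key. It is a simplified model of the behaviour described in the manpage excerpt above, not rxvt-unicode's actual code, and the function name is made up.

```c
#include <stddef.h>

/* Encode a Meta+key press into out[] and return the number of bytes written.
 * meta8 != 0: one byte, the key with its 8th bit set.
 * meta8 == 0: two bytes, ESC (0x1B) followed by the unmodified key. */
size_t encode_meta_key(unsigned char key, int meta8, unsigned char out[2])
{
    if (meta8) {
        out[0] = (unsigned char)(key | 0x80);
        return 1;
    }
    out[0] = 0x1B;
    out[1] = key;
    return 2;
}
```

For example, Meta+a yields the single byte 0xE1 with meta8 set to True, and the pair 0x1B 0x61 with the default False. Note that in a UTF-8 session a lone 0xE1 byte is not a complete character, so the escape-prefix form may be the safer choice there.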

I will send the files to you (Charles Wilson) directly and would appreciate it if you could confirm the solution.

Quick perusal looks pretty good. I like the caching of is_u_utf8_mode, but you should watch out: --enable-frills turns on
'locale switching escape sequence'
so you might need to add a hook in that handler to "un-cache".
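
Roughly what I have in mind, as a sketch only: the cached flag gets reset whenever the locale is switched at runtime, so the next query recomputes it. All names here are hypothetical (they are not the identifiers in the patch or in rxvt-unicode), and the detection is reduced to a simple substring check for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical terminal state: the locale name currently in effect. */
static const char *current_locale = "C";

/* Cached result: -1 means "not computed yet". */
static int cached_utf8_mode = -1;

/* Return nonzero if the current locale names a UTF-8 codeset.
 * The (simplified) detection runs only when the cache is empty. */
int is_utf8_mode(void)
{
    if (cached_utf8_mode < 0)
        cached_utf8_mode = strstr(current_locale, "UTF-8") != NULL ||
                           strstr(current_locale, "utf8") != NULL;
    return cached_utf8_mode;
}

/* Hook to call from the locale-switching escape sequence handler.
 * The caller must keep new_locale alive; only the pointer is stored. */
void on_locale_switch(const char *new_locale)
{
    current_locale = new_locale;
    cached_utf8_mode = -1;          /* "un-cache": force re-detection */
}
```

Without the reset in on_locale_switch(), is_utf8_mode() would keep returning the value computed for the old locale.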


--
Chuck

