This is the mail archive of the cygwin mailing list for the Cygwin project.



utf/codepage-conversion on cmd-window


I came across some surprising behaviour when a Cygwin process prints to a
Windows console. In Cygwin 1.7 (on XP) I don't think any conversion took
place, whereas now in Cygwin 2 (on Windows 10) Cygwin's UTF-8 output gets
converted into the (suitable?) Windows encoding used in the cmd window.

I have a file containing German umlaut characters, once encoded in UTF-8
and once in CP850.

When I'm in the cmd-window:

C:\bat>type cmduml.txt
├û├ä├£├Â├ñ├╝├ƒ       -> some cp850-characters with high-bit set
ÖÄÜöäüß              -> correct output

C:\bat>\cygwin\bin\cat cmduml.txt
ÖÄÜöäüß               -> utf converted to cp850
(grey rectangles)                  -> ???

C:\bat>chcp
Aktive Codepage: 850.

The way it works now makes things easier, but I could not find it
documented. I also wonder about Cygwin's output of the CP850 characters:
I'd expect them to be printed unchanged, but instead I only see grey
rectangles.
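
One possible reading of the behaviour above, sketched in Python. This is a model and not Cygwin's actual code: it assumes the output is decoded as UTF-8 and re-encoded for the active code page (850). The function name to_console is hypothetical, and the byte strings are copied from the od dump in this message.

```python
# Bytes of the two lines in cmduml.txt, copied from the od dump in this message.
UTF8_LINE = b"\xc3\x96\xc3\x84\xc3\x9c\xc3\xb6\xc3\xa4\xc3\xbc\xc3\x9f"
CP850_LINE = b"\x99\x8e\x9a\x94\x84\x81\xe1"


def to_console(raw, source="utf-8", console="cp850"):
    """Hypothetical model: decode `raw` as `source`, re-encode for the console.

    Bytes that are not valid in `source` become U+FFFD, and characters the
    console code page cannot represent become '?'.
    """
    text = raw.decode(source, errors="replace")
    return text.encode(console, errors="replace")


# Under this assumption the UTF-8 line comes out as exactly the CP850 bytes.
print(to_console(UTF8_LINE) == CP850_LINE)  # True

# The CP850 line is not valid UTF-8, so a strict decode fails ...
try:
    CP850_LINE.decode("utf-8")
except UnicodeDecodeError as exc:
    print("not valid UTF-8:", exc.reason)

# ... and a lenient decode yields only replacement characters.
print(CP850_LINE.decode("utf-8", errors="replace"))
```

If that assumption holds, the second line could never pass through unchanged, which would fit the grey rectangles. Whether Cygwin really works this way is not confirmed here.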

In the Cygwin window, the file looks like this:

/c/bat|17:05:30#od -x cmduml.txt
0000000 96c3 84c3 9cc3 b6c3 a4c3 bcc3 9fc3 990a
0000020 9a8e 8494 e181 0a0d
0000030
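
To read the dump: od -x prints 16-bit words, and on a little-endian machine the two bytes of each word appear swapped. A minimal Python sketch (the helper name od_words_to_bytes is made up for illustration) that undoes the swap and decodes each line with the encoding it appears to use:

```python
import struct

# The 16-bit words exactly as printed by `od -x` above (offsets left out).
OD_WORDS = """
96c3 84c3 9cc3 b6c3 a4c3 bcc3 9fc3 990a
9a8e 8494 e181 0a0d
""".split()


def od_words_to_bytes(words):
    """Turn od -x words back into the original byte order (little-endian)."""
    return b"".join(struct.pack("<H", int(word, 16)) for word in words)


data = od_words_to_bytes(OD_WORDS)

# The file holds two lines; the first ends in LF, the second in CR LF.
utf8_line, cp850_line = data.splitlines()

print(len(data))                    # 24 bytes, i.e. octal 30 as in the last offset
print(utf8_line.decode("utf-8"))    # first line read as UTF-8
print(cp850_line.decode("cp850"))   # second line read as CP850
print(utf8_line.decode("cp850"))    # first line as `type` shows it under CP850
```

Both lines decode to the same seven characters, so the file holds the umlauts once per encoding.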

I've also attached it.

I hope all is displayed correctly, but it should be easy to reproduce.

-Helmut

Attachment: cmduml.txt
Description: Binary data

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
