This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Extra spaces in text files in cygwin


On 2008-06-11, gmarsha11 wrote:
> Gary Johnson wrote:
> > 
> > On 2008-06-10, gmarsha11 wrote:
> >> 
> >> Does this mean it's necessary to change the encoding for any files I
> >> might need to cat, grep, awk, etc.?
> > 
> > I'm no expert on any of this, but as far as I know, all traditional 
> > Unix tools that deal with strings consider a string to be a sequence 
> > of 8-bit characters.  So the simple answer is yes.  The more 
> > complete answer is that it depends on what you're using those files 
> > for and what other programs need to read and/or write those files.
> > 
> 
> The files are being created by HP Data Protector (backup management
> software).  After I changed the file, I realized that the next time DP
> modifies it, it will change the encoding.  DP can read the file when it is
> ANSI encoded, but will always write in Unicode -- unless I can find out how
> to change the encoding it uses.

Bummer.  If you don't need to use grep, etc., with these files very 
often, you might just prefix those Unix commands with iconv, as 
someone else suggested, e.g.,

   iconv -f csunicode abc.txt | grep abc
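
If typing that pipeline every time gets tedious, it could be wrapped in a
small shell function. This is only a sketch: the name u16grep is made up,
and it assumes the files are UTF-16 with a byte-order mark (which is what
Windows tools usually mean by "Unicode") and that your iconv accepts the
encoding name UTF-16.

```shell
# Hypothetical helper (the name u16grep is an assumption, not a real tool):
# grep a UTF-16 file by converting it to UTF-8 on the fly.
u16grep() {
    pattern=$1
    file=$2
    # -f UTF-16 lets iconv use the byte-order mark to pick the endianness.
    # tr removes the carriage returns left over from CRLF line endings.
    iconv -f UTF-16 -t UTF-8 "$file" | tr -d '\r' | grep -- "$pattern"
}
```

Usage would then be `u16grep abc abc.txt`. The original file is never
modified, so Data Protector still sees the encoding it wrote.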

Note that files that I save in Unicode from Notepad do not have an 
EOL sequence after the last line.  If HP Data Protector does the 
same, that might cause a problem with some tools.
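
One way to check whether a given file ends with an EOL is to look at the
last byte of the converted text. Again just a sketch under the same
assumptions (UTF-16 with a byte-order mark; the function name is made up):

```shell
# Hypothetical helper: succeed if the converted text ends with a newline.
u16_ends_with_newline() {
    # Command substitution strips a trailing newline, so $last is empty
    # exactly when the final byte is "\n" (or when the file is empty).
    last=$(iconv -f UTF-16 -t UTF-8 "$1" | tail -c 1)
    [ -z "$last" ]
}
```

For example, `u16_ends_with_newline abc.txt || echo "no final EOL"`.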

HTH,
Gary



