This is the mail archive of the cygwin@sourceware.cygnus.com mailing list for the Cygwin project.



Re: ASCII and BINARY files. Why?


Jeffrey C. Fried wrote:
> 
> Grant,
> 
> If i've been following this thread correctly,

I don't think you have been, since you are responding to my comments,
not Grant's.

> i have to disagree with you.
> Over the years i've worked for prolonged periods on 5 different operating
> systems (Unix, Primos, VMS, Windows-NT/95/3.1, Exec-8).  Each had its own
> way of handling text vs. binary issues.  On 4 of those operating systems i
> had access to many, if not all, of the usual Unix text processing tools.
> In all cases these tools were adapted so that they utilized the text mode
> of the host machine, at least until now with Cygwin-32.  At no time in all
> of those cases was my work in any way impeded by the fact that these tools
> were only consistently useful on text files.  In fact i know of no one at
> any of the very large companies at which i worked who felt any loss because
> of this arrangement.  Consistent with that philosophy, please note that GNU
> emacs has always worked in the text mode of the operating system under
> which it ran (VMS, DOS, Unix, Primos).  And i think that is the approach
> which should be taken with respect to Cygwin-32.

There are some points that you are missing.  On VMS at least, files have
types.  The normal "text" files are record oriented, and are marked as
such in the file header.  They must be read as "text" files since they
contain record lengths instead of newlines.  But VMS also has "stream"
files, which are also marked as such, so under either IS/WB (which I
wrote and designed major portions of) or Eunice, the two major "unix
lookalikes" under VMS, the tools work perfectly well on all kinds of
files.  The same cannot be said for DOS text/binary files, which *cannot
be distinguished*.  The text/binary transformation in the IS/WB tools was
exact; you could cat any number of text or binary files together, write
the result as either a text or binary file, and always get the exact
same data back when you read the file.  Nothing of the sort is possible
here.
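
To make the loss concrete, here is a minimal Python sketch of a DOS-style
text-mode translation.  The helper names text_read and text_write are
hypothetical, and the model is a simplification (CR LF mapped to LF on
read, Ctrl-Z treated as end of file, LF expanded to CR LF on write); it
is not the code of any particular C library.

```python
def text_read(raw: bytes) -> bytes:
    """Simplified model of a DOS text-mode read (an assumption, not a real API)."""
    end = raw.find(b"\x1a")          # Ctrl-Z is treated as end of file
    if end != -1:
        raw = raw[:end]
    return raw.replace(b"\r\n", b"\n")


def text_write(data: bytes) -> bytes:
    """Simplified model of a DOS text-mode write: every LF becomes CR LF."""
    return data.replace(b"\n", b"\r\n")


unix_file = b"one\ntwo\n"
dos_file = b"one\r\ntwo\r\n"
blob = b"\x00\r\n\x1a\xff\n"         # arbitrary binary data

# Two different files are indistinguishable once read in text mode.
same_after_read = text_read(unix_file) == text_read(dos_file)

# Reading and then writing in text mode does not give the original bytes back.
unix_round_trip = text_write(text_read(unix_file)) == unix_file
blob_round_trip = text_write(text_read(blob)) == blob
```

Under this model same_after_read is true and both round-trip checks are
false: nothing in the file records which translation was applied, so
the original bytes cannot be recovered.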

emacs is primarily a text-editing tool; that simply is not the case for
cat, tr, etc.  It is not a filter; it is not an appropriate example.  Of
course one can point to less or emacs or vi and say "hey, don't need to
work with binary files", but that's a specious argument.  But even so,
part of emacs' philosophy is that everything is accessible by the user,
so it can be used for binary editing, and many versions have facilities
such as a binary-open flag or a way to mark certain file names (e.g.,
*.exe) as binary, or a mechanism that checks whether input lines end
with CRNL and writes them out appropriately.  That makes it look like
they work on "native" files, while they still work on (most) unix-like
files.  And emacs does all its own I/O, opens all its files in *raw*
mode, and makes its own decisions.  That is nothing like having the
library force all programs into "text" mode "by default".

Many of the tools are filters that work perfectly well when in "binary"
mode with any sort of file.  cat in binary mode will work just fine with
any file, as will tr or wc or head or sort, etc. etc.  The same can
hardly be said of "text" mode.  binary mode is only a problem for
programs that generate original lines.  Such programs can be passed
through filters or be modified to produce CRNL's or to explicitly open
the output in "text" mode.
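
As a rough illustration, the Python sketch below shows two filters that
treat their input as plain bytes, plus a small filter that adds CRs for
a program that generates bare newlines.  The function names cat, wc and
add_cr only mimic the tools discussed here; they are hypothetical
stand-ins, not the GNU implementations.

```python
import io
import re


def cat(sources, sink):
    """Copy each binary stream to sink unchanged, like cat in binary mode."""
    for src in sources:
        while chunk := src.read(65536):
            sink.write(chunk)


def wc(data: bytes):
    """Return (lines, bytes), counting LF bytes as line terminators."""
    return data.count(b"\n"), len(data)


def add_cr(data: bytes) -> bytes:
    """Turn each bare LF into CR LF; an LF already preceded by CR is left alone."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


parts = [b"\x00\x1a\r\n\xff", b"plain text\n"]
out = io.BytesIO()
cat([io.BytesIO(p) for p in parts], out)
joined = out.getvalue()              # byte-for-byte concatenation of parts
```

Because nothing is translated, cat and wc give sensible answers for any
input, text or not.  Only the program that originates lines needs
something like add_cr, and only when its consumer insists on CRNL.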

Which brings up the point that you worked with commercial tool
collections that were modified to adapt to the native environment,
whereas one of the goals of GNU-win32 has been for GNU code to work "out
of the box".  The best way to achieve that is to *default* to unix
semantics.  Given that you are correct that people did not experience
problems with the tools, there is a good chance that they were modified
as necessary to work with arbitrary files.  That is certainly the case
for the VMS systems I worked with.  The devil is in the details, and you
haven't indicated that you are familiar with them.

Finally, your experience is not universal; your needs are not my needs.
And the argument that you "know of no one" is quite fallacious, an
argumentum ad ignorantiam.  There are a lot of things that you don't
know about your cow orkers, and are not likely to know unless you go out
of your way to find out.  Remember, I *developed* one of those cross
systems, and I know that *customers* were upset when programs failed to
work as expected, *either* with native features (we had to add VMS
record entries to ar files, preserve them via cp, etc.) *or* with "unix"
features (console emulation, pipes, binary data transparency were all
major issues).  Just saying "I don't know anyone" doesn't cut it, any
more than people calling in to CSPAN who think the election was rigged
because "I don't know anyone who voted for Bill Clinton".

> It is not a matter of the "average" Win95/NT user, it is a matter for most
> of the Win95/NT developers who would like to have access to these tools.
> If they have to constantly remember to convert files between the two
> environments, they will not use the tools because of the inconvenience and
> the tendency to forget to translate the files into the "current"
> environment.

If you had been "following this thread correctly", you would know that
this is much less of a problem than it is cracked up to be.  It simply
isn't necessary to convert files if you avoid programs such as notepad
that can't handle the absence of CR's.  And most GNU tools will do quite
reasonable things with files containing CR's, and those that don't can
be fixed.
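
A tool that does not yet cope with CRs can often be fixed in its line
reader alone.  The sketch below is one hypothetical way to do that in
Python (read_lines is an invented name, not taken from any GNU tool): it
splits on LF and drops a single trailing CR, so CRNL and NL files yield
the same lines.

```python
def read_lines(raw: bytes):
    """Split on LF and strip one trailing CR from each line, if present."""
    lines = raw.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()                  # no empty "line" after a final newline
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]


dos_lines = read_lines(b"alpha\r\nbeta\r\n")
unix_lines = read_lines(b"alpha\nbeta\n")
```

Both calls produce the same list of lines, so the rest of the tool never
sees the difference.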

> While it may be important to you to be able to use these tools
> as if you were operating under Unix, i think it is far more important to
> most of us that it work within the same environment, producing and
> consuming the same files, as the host OS.

It is important to be able to do both, and self-serving claims about
"most of us" will not do.  That is why I have suggested various ways of
identifying "text" files, so that one can process in "host" mode even
when the default is "binary", and have suggested that those tools that
cannot handle CR's should be fixed.
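
Purely as an illustration of what such identification could look like,
and not a restatement of any specific proposal made in this thread, here
is a hypothetical Python heuristic.  The suffix list, the function name
looks_like_text and the content rules are all assumptions chosen for the
example.

```python
import os

# Hypothetical list of suffixes to treat as "text"; a real list would differ.
TEXT_SUFFIXES = {".txt", ".c", ".h", ".bat"}


def looks_like_text(name: str, head: bytes) -> bool:
    """Guess whether a file should be processed in "host" text mode.

    name is the file name; head is the first block of the file's bytes.
    """
    suffix = os.path.splitext(name)[1].lower()
    if suffix in TEXT_SUFFIXES:
        return True
    if b"\x00" in head:
        return False                 # NUL bytes suggest binary data
    # Otherwise require at least one line ending, all of them CR LF.
    return b"\n" in head and head.count(b"\r\n") == head.count(b"\n")
```

A guess like this can be wrong in both directions, which is why the
default still matters: a wrong guess under a "binary" default leaves the
data intact, while a wrong guess under a "text" default can alter it.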

--
<J Q B>

