This is the mail archive of the cygwin@cygwin.com mailing list for the Cygwin project.
RE: OT: possible project/research project

From: Randall R Schulz <rrschulz at cris dot com>
To: "Robert Collins" <robert dot collins at itdomain dot com dot au>, <cygwin at cygwin dot com>
Date: Wed, 20 Mar 2002 00:33:31 -0800
Subject: RE: OT: possible project/research project
Robert,

Responses interposed below.


At 22:55 2002-03-19, Robert Collins wrote:
> > -----Original Message-----
> > From: Randall R Schulz [mailto:rrschulz@cris.com]
> > Sent: Wednesday, March 20, 2002 12:15 PM
>
> >
> > Robert,
> >
> > This idea isn't really new.
>
>I don't recall claiming it as 'new' .. just an idea.  :} (ok, pedant mode 
>off).
>
> > The problem is that you're creating a huge project that
> > creates no new
> > functionality and that has horrendous maintenance issues, as you say.
>
>Yah. That's the crux. I've no interest in creating such a project.
>
> > The library conversion idea is kind of a throwback to
> > pre-Unix days or to
> > systems like VMS (if I recall and understand it properly). In
> > these systems
> > there were "blessed" commands understood by the command
> > interpreter and
> > endowed with a more direct means of invocation. Other
> > commands required
> > full sub-process creation.
>
>Well we still have that basic separation - bash's builtins, for example. If 
>it's not a builtin, it needs a sub-process.

That's not quite right. Built-ins still need sub-processes if they're going 
to operate in a pipeline or are enclosed within parentheses.
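
A minimal sketch of that point, assuming bash or another POSIX-style shell (the variable names are illustrative only):

```bash
count=0
marker=original

# Parentheses force a subshell: the assignment runs in a child
# process and is lost when that child exits.
( count=1 )
after_parens=$count

# Braces group commands in the current shell, so this assignment sticks.
{ count=2; }
after_braces=$count

# A builtin placed in a pipeline also runs in a subshell here, so the
# export never reaches the invoking shell.
export marker=changed | cat

echo "after ( ): $after_parens"
echo "after { }: $after_braces"
echo "marker:    $marker"
```

Only the brace group changes the invoking shell's state; the parenthesized assignment and the piped export both happen in a child process and vanish with it.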


> > I trust it's your intent that the user will see no obvious
> > differences in
> > invoking these programs, but you may find full transparency harder to
> > achieve than you expect. Will the full range of shell
> > features be available
> > to these specially integrated commands?
>
>That is the design goal, should such a project be attempted by me.
>
> > Will you be able to
> > pipe into and
> > out of them?
>
>Yes.
>
> > Will they work within parentheses?
>
>Yes.
>
> > In
> > procedures?
>
>Yes.
>
> > Will you
> > allow all shell features (pipes, say) to be applied to
> > arbitrary combinations
> > of conventional and integrated commands?
>
>Yes.
>
>Before you think I've got plans too big for my boots, consider that if we 
>leverage an existing shell, all those features work out-of-process now. 
>Proxying each feature into a library-capable equivalent one at a time 
>would allow a serious fallback mechanism for any functionality gaps.
>
> > In your example of a `backquote command` (which I prefer to
> > invoke via $( ... ) using BASH)
>
>Not being a shell aficionado, I'm happy to be educated: what is the key 
>difference between `...` and $(...)?

Same functionality, but $( ) constructs nest.
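
As a small illustration, assuming a POSIX-style shell: the path below is a made-up string and need not exist, since dirname and basename only manipulate text.

```bash
path=/usr/local/bin/tool

# $( ) nests directly; the inner substitution needs no escaping.
with_dollar=$(basename "$(dirname "$path")")

# Backquotes can nest only if the inner pair is backslash-escaped.
with_backquote=`basename \`dirname $path\``

echo "$with_dollar $with_backquote"
```

Both assignments yield "bin"; the difference is purely in how much escaping the nested form demands.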


> > you'd be exposed to any unintended
> > side-effects within
> > the backquote command. Side-effects like file descriptor alterations,
> > changes in signal dispositions, receipt of signals or
> > exceptions (expected
> > or the result of a programming error).
>
>Well that's the point of the 'push_context' pseudo command, to save all 
>that information, and then restore it. It may need some 'OS' co-operation 
>to fully achieve that - i.e. stdin being kept intact (although I imagine 
>that the librarised commands would be given a virtual stdin, as they are 
>sub-processes after all) - but we have the source so....

How, for example, will your magical push_context protect against wild 
pointer references?


> > The beauty of the fork/exec model with entirely separated
> > programs _is_
> > their self-containedness and the complete independence and
> > isolation each
> > of the programs gets from each other and from the program(s)
> > that invoke
> > them.
>
>The fork()/exec() model bites. Sorry, but it does. fork() based servers 
>for instance run into the thundering herd - and scale very badly. The other 
>use for fork - the fork/exec combination - is better achieved with spawn() 
>which is designed to do just that one job well. It also happens to work 
>very well on cygwin, and I see no reason to change that. So spawned apps 
>will remain completely separated and independent.

Servers are not shells. Why should they fork at all? That's what threads 
are for. It's also why CGI (without something like mod_perl) is not a good 
thing and the Java server model has significant advantages.

Are you planning on incorporating your scheme into every program that runs 
sub-processes on a regular basis? How likely is it that what works in one 
shell will work in another or in a server?

I don't know the details of spawn(). How does it accomplish I/O redirection?


> > It is also nice in that it is a very simple programming
> > model for commands, both built-in and end-user-supplied, that run
> > within it.
>
>I don't see how this idea detracts from that. Do you think that the 
>presence of a librarised 'ls' command (for instance) will prevent the user 
>adding perl to their system? Or replacing ls? Either scenario is abhorrent 
>to me.

Obviously if you add something, the old stuff isn't (necessarily) lost. I'm 
just saying that the fork/exec process model is simple, elegant, available, 
universal and fully functional in all POSIX systems. Your model is a horse 
of another color and any given command that would avail itself of the 
supposed benefits of your scheme must be recast into a library that 
conforms to the requirements of your embedded task model.


> > It is probably less platform-specific than a scheme that demands use of
> > dynamically-linked / shared libraries.
>
>Ermm, I guarantee I'll be using libtool if I do this...
>
> > The Unix shell and process model may be somewhat costly of computing
> > resources (but only marginally so), especially as I said without
> > copy-on-write behavior in the fork call, but that rather
> > modest down-side is more than made up for by independence, modularity, and
> > open-endedness of the scheme.
>
>I grant that independence, modularity and open-endedness are wonderful 
>things. Can you please describe how what I have suggested prevents any one 
>of the above? The whole point of a librarised approach is to make the 
>shell modular. That also grants open-endedness for free. As for 
>independence, if none of the libraries are available, then the whole thing 
>would run as a normal shell, with no in-process behaviour.

It doesn't prevent it, but to avail oneself of the putative benefits of 
your proposed scheme, a significantly different programming model has to be 
learned and used. All for what? A tiny incremental improvement in program 
start-up times on a single platform and one or two pre-ordained shells?


> > I can't see how all the work your idea implies just for the
> > sake of some incremental performance improvements is going to be
> > worthwhile.
>
>Well that's arguable :}. If it takes 100 mythical man-months to create 
>this beast and librarise the top 20 shell tools.... how many users can use 
>this, and how much time do they save? Let's say they save 5 mythical 
>man-minutes per month per user. Well we have ??? thousand users, so I 
>think it'd pay back its time investment quick smart.

How much time do they save? That's for you to claim and substantiate. I'm 
not trying to justify or validate your project, I'm trying to repudiate it.

But consider this: By the time you complete this task, the upward march of 
system speeds (CPU and I/O) will probably have done more to improve 
elapsed-time performance of command invocation than your improvements are 
going to achieve.

Note, too, that it's not valid to measure system resources or elapsed time 
saved by adding up that saved by each individual user. Below a threshold of 
gain _per user_ your work is for naught because it is imperceptible.

And five staff-minutes per user per month? You think that's significant? 
What would you do with those five minutes spread throughout the month? 
That's right: Nothing, 'cause you'd get it in fraction-of-a-second parcels.

Lastly, you'll have to have an ongoing effort to port changes from the 
stand-alone original versions of the commands to your embedded counterparts.


> > By the way, which shell will you do this for? BASH, TCSH,
> > Ash? More than one?
>
>I'd guess at ash, as that's the smallest shell we have, but if it's easier 
>with bash, then I see no reason not to - as this would be a /bin/sh 
>replacement - if the benefits were to be realised.

How many people use such a bare-bones shell? Unless you modify them all, 
there will be a sizeable user contingent that does not benefit from your 
efforts.


> > Please feel free to prove me wrong, of course.
>
>Well, I've got to complete my review before I decide what I think of the 
>idea. Until then it's just an idea. Also I don't feel the urge to 'prove 
>something' on this :}.

I think you need a good technical justification for the effort you'll 
expend relative to the benefits you're going to gain and the detriments 
you're going to incur.

As with all optimizations, you must measure the cost of the current code 
and that of its replacement. In this case, you could possibly mock up a test 
jig that did DLL loading and compare that with the cost of fork / exec. But 
that would not include the unknown costs of your putative push_context / 
pop_context mechanism.
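
One rough way to start on the fork / exec side of that measurement, assuming bash: the sketch below runs the same loop over the builtin "true" and over an external "true" binary. The builtin is only a stand-in for an in-process call; it does not model DLL loading or any push_context / pop_context cost, and the helper name run_loop, the iteration count and the candidate paths are all assumptions.

```bash
iterations=200

# Find an external "true" binary; these two paths are guesses.
external_true=
for candidate in /bin/true /usr/bin/true; do
    if [ -x "$candidate" ]; then
        external_true=$candidate
        break
    fi
done

# Run the given command $iterations times, leaving the count in $runs.
run_loop() {
    runs=0
    while [ "$runs" -lt "$iterations" ]; do
        "$@"
        runs=$((runs + 1))
    done
}

run_loop true                     # builtin: no new process per call
builtin_runs=$runs

external_runs=0
if [ -n "$external_true" ]; then
    run_loop "$external_true"     # external: one fork + exec per call
    external_runs=$runs
fi

echo "builtin: $builtin_runs runs, external: $external_runs runs"
```

In bash, prefixing each call with the time keyword (time run_loop true, then time run_loop "$external_true") reports the elapsed time of each loop; the difference divided by the iteration count approximates the per-invocation cost of process creation on the machine at hand.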

"The proof of the pudding is in the eating." So until you've done it, you 
won't know for an empirical fact if it's a win and if so how much of a win 
it is.


>Rob


Randall Schulz
Mountain View, CA USA

