This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: unbuffered fread() deadlock


Craig,

I am pretty sure that's where this came from.

Now, that said, why is it there? From my old IBM C/370 library days, I seem to remember this
was meant for terminal I/O. Consider the following test case:


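#include <stdio.h>

int main(void) {
   int a;
   printf("a is ");              /* no newline, so this sits in the stdout buffer */
   scanf("%d", &a);              /* input request on the terminal */
   printf("b is ");
   fprintf(stderr, "c is 5\n");  /* stderr is not buffered, so it appears at once */
   return 0;
}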

When run under newlib and glibc, this test behaves as follows (where I enter the number 3):

a is 3
c is 5
b is <prompt>

Note how the stdout buffer gets flushed before the input for a is given. If those lines are removed
you won't see the "a is" being output before the input is requested. Again, note that this behavior
is the same under glibc.


Now, should all streams be flushed? The answer is no. Only the terminal I/O.

Modifying the test case to write to a file that is line-buffered has a different result on glibc than on newlib.
Newlib flushes the file when the terminal I/O is requested, but glibc does not.


So, what I believe should be happening is that if terminal I/O is requested from the host, then all line-buffered terminal I/O should be flushed. I believe the "furthermore" clause relates to line-buffered I/O, since the previous sentence is talking about line-buffered I/O, but I can always test this against glibc.

-- Jeff J.

Howland Craig D (Craig) wrote:
Jeff:
I don't read that the furthermore says anything like what the
comment does. (I had read it while composing my response--more than
once--and it never occurred to me that it might be the source of the
refill.c comment. Perhaps you are right, and the 'furthermore' was the
source of the refill.c comment, but if it is, I don't think that it is
a correct understanding.) I'll annotate it a little as I understand it.
Original:
"Furthermore, characters are intended to be transmitted as a block to
the host environment when a buffer is filled, when input is requested
on an unbuffered stream, or when input is requested on a line buffered
stream that requires the transmission of characters from the host
environment."
Re-format their long sentence, and add 5 annotations:
"Furthermore, characters are intended to be transmitted as a block to
the host environment when[:]
[1] a buffer is filled, [or]
[2] when input is requested on an unbuffered stream, or
[3] when input is requested on a line buffered stream that requires
the transmission of characters from the host environment."
I think that what they're trying to say (not very clearly) could
perhaps be paraphrased as 'it is intended that whenever characters
do need to be moved in a stream--regardless of buffering mode--
these moves should use blocks of characters where possible for the sake
of efficiency.' (The second point might not appear to make any sense--how
can you move a block on an un-buffered stream?--but it can if the device
has a buffer. A device having its own buffer is a good reason to not have
the file stream add one. So a read on a non-stream-buffered Ethernet MAC
could get a block of bytes from the MAC's FIFO, for example.)
It appears that you're saying that the statement in point 3, "...
that requires the transmission of characters from the host environment"
means that point 3 as a whole should be understood as (in words more like
the disputed refill.c comment) something like: 'when input is requested
on any line-buffered fd, the output buffers of all fds must be flushed.'
(That is, 1) you're proposing that point 3 is the source of the comment,
and, 2) you think that it can be understood that way.)
I don't think that this follows. In addition to the arguments
already given, the last clause of the last sentence in your quote points
out that setvbuf can affect this behavior. Nothing in the description for
setvbuf says anything akin to the comment's wording, that is, that would
link one stream to another stream, or even input to output.
I actually stumbled across a document this morning that gives
rationales behind stuff in the C99 standard. (See
http://www.open-std.org/JTC1/SC22/WG14/www/docs/C99RationaleV5.10.pdf.)
In its section 7.19.3 Files, it says: "The distinction between buffered
and unbuffered streams suggests the desired interactive behavior; but an
implementation may still be conforming even if delays in a network or
terminal controller prevent output from appearing in time. It is the
intent that matters here."
The clearly-stated intents in 7.19.3 (which are all in the quote
that you supplied)--as opposed to the unclear intent in the furthermore
sentence--are all aimed at getting characters back and forth in an
expeditious manner. (When unbuffered, as soon as possible; when
fully-buffered, as soon as the buffer is full; when line-buffered, as
soon as the line is complete. And, by implication, a partial buffer as
soon as a flush of it is requested (whether directly via a user fflush
or indirectly via close or exit or abort).)
Perhaps the strongest statement in favor of my interpretation comes
from the Rationale document section 7.19.5.2 regarding fflush:
"The fflush function ensures that output has been forced out of internal
I/O buffers for a specified stream. Occasionally, however, it is
necessary to ensure that all output is forced out, and the programmer
may not conveniently be able to specify all the currently open streams,
perhaps because some streams are manipulated within library packages.[9]
To provide an implementation-independent method of flushing all output
buffers, the Standard specifies that this is the result of calling
fflush with a NULL argument. [footnote
9:] For instance, on a system (such as UNIX) which supports process
forks, it is usually necessary to flush all output buffers just prior to
the fork."
(Since there is an explicitly-provided mechanism for the "occasional" need
to flush all output streams, why would there be a bizarre implied back
door method to do so?)
Furthermore, from the rationale for fopen:
"A change of input/output direction on an update file is only allowed
following a successful fsetpos, fseek, rewind, or fflush operation,
since these are precisely the functions which assure that the I/O buffer
has been flushed."
(If they intended a read to be able to do so, they failed to mention it.)
And a final argument from the Rationale document. In their setvbuf
discussion, they say nothing of linking streams nor directions. (They
actually allow FBF to be implemented as LBF (always), or LBF as NBF for
a binary file. That is, even though 3 buffering methods are defined,
an implementation can provide fewer methods, "The general principle is
to provide portable code with a means of requesting the most appropriate
popular buffering style, but not to require an implementation to support
these styles.") Again, nothing at all that even hints at the intent
from the refill.c comment.
And if I didn't convince you yet, in the first half of the last
quoted sentence (i.e. that you quoted from C99 7.19.3), it points out that
the behavior of the different buffering modes is implementation-defined.
So even if the "furthermore" sentence were intended to be understood as
the refill.c comment says (which I dispute), the implementation can choose
to do as it sees fit. I submit that the flushing behavior as done does not
make any sense, and is inefficient. It therefore should be excised as
being not consistent with the goals of the implementation (namely small
and efficient).
I apologize for this being so lengthy.
Craig


P.S.   If I did not convince you, I submit in advance to your judgement
as the owner and will not take any more of your time with further
argument (unless you choose to extend the discussion, of course).

-----Original Message-----
From: Jeff Johnston [mailto:jjohnstn@redhat.com]
Sent: Monday, January 12, 2009 5:07 PM
To: Howland Craig D (Craig)
Cc: newlib@sourceware.org; Andre Heider
Subject: Re: unbuffered fread() deadlock


I believe the source of it is C99 7.19.3.  Note the "furthermore"
clause:

"When a stream is unbuffered, characters are intended to appear from the
source or at the destination as soon as possible. Otherwise characters
may be accumulated and transmitted to or from the host environment as a
block. When a stream is fully buffered, characters are intended to be
transmitted to or from the host environment as a block when a buffer is
filled. When a stream is line buffered, characters are intended to be
transmitted to or from the host environment as a block when a new-line
character is encountered. Furthermore, characters are intended to be
transmitted as a block to the host environment when a buffer is filled,
when input is requested on an unbuffered stream, or when input is
requested on a line buffered stream that requires the transmission of
characters from the host environment.  Support for these characteristics
is implementation-defined, and may be affected via the setbuf and
setvbuf functions."

The removal of the current stream from the list of fps to walk will solve Andre's problem. This will occur if we just remove the fp-lock/fp-unlock from fwalk since the lflush function being called for each fp won't call fflush for the current stream that is reading anyway. However, we still have the sfp-lock vs fp-lock problem looming.


A read will lock the fp and then possibly need to acquire the sfp lock.


If something else is doing an fwalk as well (e.g. fflush(NULL) at the same time), the second fwalk may wait for the read fp to be unlocked and never give up the sfp lock that the read fp sits waiting for.

-- Jeff J.

