This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH v1.1][BZ #15362] Correctly return bytes written out in _IO_new_file_write

From: Siddhesh Poyarekar <siddhesh dot poyarekar at gmail dot com>
To: Eric Biggers <ebiggers3 at gmail dot com>
Cc: Siddhesh Poyarekar <siddhesh at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>, "Carlos O'Donell" <carlos at systemhalted dot org>, "Joseph S. Myers" <joseph at codesourcery dot com>
Date: Mon, 15 Apr 2013 08:21:40 +0530
Subject: Re: [PATCH v1.1][BZ #15362] Correctly return bytes written out in _IO_new_file_write

On 15 April 2013 05:24, Eric Biggers <ebiggers3@gmail.com> wrote:
> to_do == 0 in that code block implies must_flush > 0.  And must_flush is only
> set greater than zero in conditional code executed if the FILE is line-buffered,
> the stream was already in put mode, and the data fully fits in the buffer.  So
> the EOF return only happened in that case.  Certainly, the _IO_OVERFLOW() could
> fail in cases where to_do > 0 and must_flush == 0, but then the return value
> will be (n - to_do).  If (to_do == n), then nothing has been written, not even
> to the buffer, and the return value (before your fix to 11741) was 0, which I
> believe is correct, as 0 bytes were successfully written.  Your patch to fix
> 11741 had changed the code to return EOF in this case, but it wasn't clear to me
> why this was done.

Right, I missed the fact that I had changed it.

>> It is a bug.  It should return the full count for fwrite in this case.
>
> I'm not really convinced that returning a full count in this case is the correct
> behavior for line-buffered FILEs.  If I use fwrite() to write data containing a
> newline to a line-buffered FILE, and fwrite() returns a count that indicates it
> has successfully written data up to and including a given newline character,
> then I would expect that, because the FILE is line-buffered, the corresponding
> data up to this newline character will have been flushed with sys_write().  Is
> there any prior discussion on this that could be referred to?  The only thing I
> can think of is that since fwrite() is often used for binary data and not text,
> the C library need not guarantee that line-buffering semantics are respected,
> although as a programmer I wouldn't really expect this behavior.  Also, the C99
> standard implies fwrite() is supposed to behavior consistent with calling
> fputc() for each byte.  And fputc('\n', fp) is supposed to flush a line-buffered
> stream, is it not?

Right, again I overlooked the fact that it ought to behave as if it
called fputc on each object.

> No, in the version following your patch to 11741, EOF is only returned if to_do
> == 0 or to_do == n.  If some, but not all, of the input data fits into the FILE
> buffer, then 'count' is set to the remaining space and 'to_do' will be
> decremented by 'count', leaving 'to_do' at neither 0 nor n, so upon overflow
> failure the return value will be the number of bytes buffered.  My
> interpretation is that this violates line-buffering semantics if the buffered
> data contains newlines, as the code returns that the data was successfully
> written, but the line(s) were not actually flushed with sys_write().  Here is a
> test program that returns 0 if and only if the C library handles this case the
> way I think it should:

Right.

> You may have brought up another relevant point, which is that _IO_OVERFLOW()
> only returns 0 or EOF, never a partial write, even though it may have done a
> partial write.  Let's consider what this means.  For unbuffered FILEs this is
> irrelevant because there is never any buffered data to flush.  For fully
> buffered FILEs this is also mostly irrelevant because we consider all buffered
> data to have been "written"; we only need to know whether the flush succeeded or
> not so we can correctly return a short count if there is more to write.  It may,
> however, be desirable for the FILE to adjust its internal state so that exactly
> the unwritten bytes are buffered, rather than the current behavior in
> new_do_write() of resetting the buffer even if it has not all been successfully
> flushed.  This would mean that the following program would print the buffer of
> 'A's when the final fputc() executes, rather than the current behavior of
> discarding them on the fputc() that errors.  The current behavior in this case
> may be very much intentional, but I'm just making sure.

When the write fails, it sets ERR_SEEN.  Here, a zero or partial write
is implied to be either an error or EOF, so the application is
expected to call ferror or feof to check which one it is.

> Anyway, for line buffered FILEs, I believe we do need the partial write count
> from _IO_OVERFLOW().  For example, someone could write multiple lines of data
> in one call to fputs() or fwrite().  If flushing the stream results in a partial
> write but the first line was fully written with sys_write(), then I believe the
> expected return value would be the total number of bytes successfully written
> with sys_write(), minus any preceding bytes in the first line that were already
> buffered before the fputs() or fwrite() call, plus any additional bytes that are
> still buffered and are part of the next line to be written with sys_write().  No
> version of the code we are considering appears to have this behavior, as it is
> not even supported by the semantics of _IO_OVERFLOW().

I'm not sure.  Shouldn't it just be limited to the last newline that
gets flushed and hence not include the buffer up to the next newline?
Everything after that, even if buffered, should be regarded as not
written to be consistent of our return value of 0 when a single line
failed to flush.


> I do not have a FSF copyright assignment in place yet, but I will do that if

Great.  Carlos, Joseph, could you guys please help with this?

> needed.  I think that after the revert, bug 1174 would be fixed just by
> changing PAD() in the way I suggested.  The other issues I have brought up, such

OK, I'll post the patch for the revert to close 15362 and then reopen
1174.  Once that is done, you can post an updated patch for pr#1174.

> as line buffering semantics not always being respected in error paths could be
> considered separate bugs.  But before I can change any other code I think we
> need to agree on what the behavior should be in the cases I've brought up, and
> also whether the corresponding code should be changed in the equivalent
> functions in wfileops.c and oldfileops.c.

Yes for wfileops.c but no for oldfileops.c.  oldfileops.c is there to
preserve binary compatibility for applications linked to glibc
versions.  I believe the user here is old versions of libstdc++.

Siddhesh
--
http://siddhesh.in

Follow-Ups:
- Re: [PATCH v1.1][BZ #15362] Correctly return bytes written out in _IO_new_file_write
  - From: Siddhesh Poyarekar
- Re: [PATCH v1.1][BZ #15362] Correctly return bytes written out in _IO_new_file_write
  - From: Eric Biggers

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]