This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]


> On 03/11/2016 10:52 PM, Roland McGrath wrote:
> > Justify with clear rationale.
> 
> It fixes bug 4099.  We need an arbitrary limit for that.

That is justification for imposing an arbitrary maximum on the
automatically-chosen size.  Similar logic on the other side of the coin is
justification for imposing an arbitrary minimum on the automatically-chosen
size.  Neither is justification for always using a single fixed size.
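
A minimal sketch of what applying both limits to the automatically-chosen
size could look like. The helper name and the two limit values are
hypothetical, picked only for illustration; they are not taken from glibc
or from the patch under discussion.

```c
#include <stddef.h>

/* Hypothetical limits, chosen only for illustration.  */
#define STREAM_BUF_MIN ((size_t) 4096)
#define STREAM_BUF_MAX ((size_t) 1 << 20)

/* Clamp an advised block size (e.g. st_blksize) into
   [STREAM_BUF_MIN, STREAM_BUF_MAX].  Nonpositive advice is treated
   as "no advice" and mapped to the minimum.  */
static size_t
clamp_buffer_size (long advised)
{
  if (advised <= 0 || (size_t) advised < STREAM_BUF_MIN)
    return STREAM_BUF_MIN;
  if ((size_t) advised > STREAM_BUF_MAX)
    return STREAM_BUF_MAX;
  return (size_t) advised;
}
```

Advice that already lies inside the range passes through unchanged, which
is the difference from always using a single fixed size.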

> The libstdc++ buffer size is 8192 (or 8191), so this makes buffering
> more consistent across the system.

That's an internal implementation choice in libstdc++.  There is no reason
to expect it to stay the same, nor special reason to think that just
because libstdc++ chose it that it's ideal.

> The PostgreSQL people did extensive benchmarks to determine their
> block/page size, and settled for 8192 (but they do not use stdio
> streams, for obvious reasons).

That's lovely.  They can inform the implementors of whatever filesystem(s)
they were using in their benchmarks that st_blksize=8192 is what they
should be reporting.

> <stdio.h> documents BUFSIZ as the default buffer size. The new
> implementation matches that.

It's the default in the sense that it's what setbuf uses.  So it's a
permanent part of the ABI and therefore can't be changed easily regardless
of whether it's a desirable value.  If the comments or other documentation
are unclear as to the true (very tiny) significance of BUFSIZ, they should
be fixed.
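
To illustrate where BUFSIZ actually matters: ISO C defines setbuf in terms
of setvbuf, with BUFSIZ as the size of the caller-supplied array.  A small
sketch of that equivalence follows; the helper name is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper with the effect ISO C specifies for
   setbuf (fp, buf).  When buf is non-null it must point to at least
   BUFSIZ bytes, which is why BUFSIZ is baked into callers.  */
static int
setbuf_equivalent (FILE *fp, char *buf)
{
  if (buf != NULL)
    return setvbuf (fp, buf, _IOFBF, BUFSIZ);   /* fully buffered */
  return setvbuf (fp, NULL, _IONBF, 0);         /* unbuffered */
}
```

Existing binaries pass arrays of exactly BUFSIZ bytes to setbuf, so the
value is frozen by those callers, whatever the automatic choice is.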

> Additional memory consumption is limited because file descriptors are a
> scarce resource.

There is no reason to consider file descriptors scarce.
The per-process limit is fungible.

> I can do some benchmarking, but I don't expect any compelling results.

Whatever the results, they would not IMHO be relevant here.

POSIX specifies that st_blksize is the "preferred I/O block size for this
object".  It's the kernel's responsibility to give userland good advice
through this channel.  If there are common buggy kernels that give bad
advice, that is a reason to apply upper and lower limits to the advice from
the kernel.  But the expectation should be that the kernel gets fixed to
give good advice, and the optimal thing to do with a good kernel is to
follow its advice.  
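
As an application-level sketch of following that advice (this is not
glibc's internal code, and the helper name is hypothetical), a program can
read st_blksize with fstat and hand a buffer of that size to setvbuf:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical helper: give fp a fully buffered buffer whose size is
   the kernel's st_blksize advice.  Call it before any other operation
   on fp.  Returns the buffer (to be freed by the caller after fclose)
   and stores its size in *size_out, or returns NULL on failure.  */
static char *
use_advised_buffer (FILE *fp, size_t *size_out)
{
  struct stat st;
  if (fstat (fileno (fp), &st) != 0 || st.st_blksize <= 0)
    return NULL;

  size_t size = (size_t) st.st_blksize;
  char *buf = malloc (size);
  if (buf == NULL)
    return NULL;

  if (setvbuf (fp, buf, _IOFBF, size) != 0)
    {
      free (buf);
      return NULL;
    }
  *size_out = size;
  return buf;
}
```

The buffer is supplied explicitly because an implementation is not
required to honor the size argument when setvbuf is given a null buffer.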

Since the recommended use of st_blksize in this way is a standard user
feature and not just what stdio's implementation happens to do, there is an
argument to be made that the limiting of the value should be done in the
*stat functions' reported st_blksize values rather than in stdio's use of
them.  (I'm ambivalent about this point.)


Thanks,
Roland

