This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
- From: Roland McGrath <roland at hack dot frob dot com>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Fri, 18 Mar 2016 15:52:58 -0700 (PDT)
- Subject: Re: [PATCH] libio: Always use _IO_BUFSIZE for stream buffers [BZ #4099]
- References: <56E17C8E dot 1070209 at redhat dot com> <20160311215230 dot B5AF32C3C1E at topped-with-meat dot com> <56E69B9D dot 3000808 at redhat dot com>
> On 03/11/2016 10:52 PM, Roland McGrath wrote:
> > Justify with clear rationale.
>
> It fixes bug 4099. We need an arbitrary limit for that.
That is justification for imposing an arbitrary maximum on the
automatically-chosen size. Similar logic on the other side of the coin is
justification for imposing an arbitrary minimum on the automatically-chosen
size. Neither is justification for always using a single fixed size.
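As a minimal sketch of the distinction drawn above, the following C fragment clamps an automatically chosen size into a range instead of replacing it with one fixed value. The limits and the name clamp_buffer_size are hypothetical, picked only for illustration; they are not values proposed in this thread or used by glibc.

```c
#include <stddef.h>

/* Hypothetical limits, chosen only for illustration.  */
#define STREAM_BUF_MIN ((size_t) 4096)
#define STREAM_BUF_MAX ((size_t) 65536)

/* Clamp an automatically chosen size (HINT) into the range
   [STREAM_BUF_MIN, STREAM_BUF_MAX].  A hint already inside the
   range is returned unchanged.  */
size_t
clamp_buffer_size (size_t hint)
{
  if (hint < STREAM_BUF_MIN)
    return STREAM_BUF_MIN;
  if (hint > STREAM_BUF_MAX)
    return STREAM_BUF_MAX;
  return hint;
}
```

With these assumed limits, a hint of 8192 passes through untouched, while 512 is raised to the minimum and 1 MiB is lowered to the maximum.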
> The libstdc++ buffer size is 8192 (or 8191), so this makes buffering
> more consistent across the system.
That's an internal implementation choice in libstdc++. There is no reason
to expect it to stay the same, nor any special reason to think it's ideal
just because libstdc++ chose it.
> The PostgreSQL people did extensive benchmarks to determine their
> block/page size, and settled on 8192 (but they do not use stdio
> streams, for obvious reasons).
That's lovely. They can inform the implementors of whatever filesystem(s)
they were using in their benchmarks that st_blksize=8192 is what they
should be reporting.
> <stdio.h> documents BUFSIZ as the default buffer size. The new
> implementation matches that.
It's the default in the sense that it's what setbuf uses. So it's a
permanent part of the ABI and therefore can't be changed easily regardless
of whether it's a desirable value. If the comments or other documentation
are unclear as to the true (very tiny) significance of BUFSIZ, they should
be fixed.
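To illustrate where BUFSIZ actually matters: ISO C defines setbuf (fp, buf) with a non-null buf as equivalent to setvbuf (fp, buf, _IOFBF, BUFSIZ), so any array handed to setbuf must hold at least BUFSIZ bytes. The sketch below spells out that equivalent call; the helper name and the static array are hypothetical.

```c
#include <stdio.h>

/* Storage sized the way setbuf requires: exactly BUFSIZ bytes.
   Programs compiled against an existing BUFSIZ embed this size,
   which is why the value is effectively part of the ABI.  */
static char demo_buffer[BUFSIZ];

/* Hypothetical helper: fully buffer FP using the storage above.
   This is the setvbuf call that setbuf (fp, demo_buffer) stands for.
   It must run before any other operation on FP.
   Returns 0 on success, nonzero on failure.  */
int
use_setbuf_sized_buffer (FILE *fp)
{
  return setvbuf (fp, demo_buffer, _IOFBF, BUFSIZ);
}
```

Nothing in this call pattern says that BUFSIZ is the size stdio must pick when it allocates a buffer on its own.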
> Additional memory consumption is limited because file descriptors are a
> scarce resource.
There is no reason to consider file descriptors scarce.
The per-process limit is fungible.
> I can do some benchmarking, but I don't expect any compelling results.
Whatever the results, they would not IMHO be relevant here.
POSIX specifies that st_blksize is the "preferred I/O block size for this
object". It's the kernel's responsibility to give userland good advice
through this channel. If there are common buggy kernels that give bad
advice, that is a reason to apply upper and lower limits to the advice from
the kernel. But the expectation should be that the kernel gets fixed to
give good advice, and the optimal thing to do with a good kernel is to
follow its advice.
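For reference, reading the kernel's advice looks roughly like the following sketch. The function name preferred_io_size and the fallback parameter are hypothetical; fstat, fileno, and the st_blksize member are the standard POSIX interfaces. This is not the code glibc's libio uses.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <sys/stat.h>

/* Return the preferred I/O block size reported for the descriptor
   underlying FP, or FALLBACK if fstat fails or the reported value
   is not positive.  */
size_t
preferred_io_size (FILE *fp, size_t fallback)
{
  struct stat st;

  if (fstat (fileno (fp), &st) != 0 || st.st_blksize <= 0)
    return fallback;
  return (size_t) st.st_blksize;
}
```

A caller that distrusts some kernels could bound the returned value before using it, which keeps good advice intact while containing bad advice.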
Since the recommended use of st_blksize in this way is a standard user
feature and not just what stdio's implementation happens to do, there is an
argument to be made that the limiting of the value should be done in the
st_blksize values reported by the *stat functions rather than in stdio's use of
them. (I'm ambivalent about this point.)
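A rough sketch of that alternative, with the limiting done at the stat layer so that every consumer of st_blksize sees the bounded value. The names fstat_limited and limit_blksize and both limits are hypothetical; nothing like this exists in glibc as far as this message states.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sys/stat.h>

/* Hypothetical bounds, chosen only for illustration.  */
#define BLKSIZE_MIN 4096
#define BLKSIZE_MAX 65536

/* Force ST->st_blksize into [BLKSIZE_MIN, BLKSIZE_MAX].  */
void
limit_blksize (struct stat *st)
{
  if (st->st_blksize < BLKSIZE_MIN)
    st->st_blksize = BLKSIZE_MIN;
  else if (st->st_blksize > BLKSIZE_MAX)
    st->st_blksize = BLKSIZE_MAX;
}

/* Hypothetical wrapper: behaves like fstat, but on success reports
   a bounded st_blksize.  On failure ST is left as fstat left it.  */
int
fstat_limited (int fd, struct stat *st)
{
  int ret = fstat (fd, st);

  if (ret == 0)
    limit_blksize (st);
  return ret;
}
```

The trade-off is the one noted above: stdio would need no limits of its own, but applications would no longer see the raw value the kernel reported.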
Thanks,
Roland