This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Seeking consensus on BZ 16734
- From: Florian Weimer <fweimer at redhat dot com>
- To: Rich Felker <dalias at libc dot org>
- Cc: Paul Pluzhnikov <ppluzhnikov at google dot com>, "H.J. Lu" <hjl dot tools at gmail dot com>, GLIBC Devel <libc-alpha at sourceware dot org>, Daniel Colascione <dancol at dancol dot org>, Andrew Pinski <pinskia at gmail dot com>
- Date: Wed, 11 Feb 2015 15:10:14 +0100
- Subject: Re: Seeking consensus on BZ 16734
On 02/11/2015 03:02 PM, Rich Felker wrote:
> On Wed, Feb 11, 2015 at 02:34:31PM +0100, Florian Weimer wrote:
>> On 02/02/2015 06:14 AM, Rich Felker wrote:
>>> On Sun, Feb 01, 2015 at 08:46:06PM -0800, Paul Pluzhnikov wrote:
>>>> On Sun, Feb 1, 2015 at 8:09 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>
>>>>>> Can we just do it?
>>>>>>
>>>>>
>>>>> Do we have any current performance data on this?
>>>>
>>>> I am not sure what performance data you want.
>>>>
>>>> The application CPU will go up (calloc has to zero out space), kernel
>>>> CPU will go down (kernel would not have to zero out the same space).
>>>>
>>>> It's clear that calloc()ing 8K is much cheaper than mmap()ing,
>>>> especially when there are 100s of threads.
>>>
>>> The original motivation seems to be the misguided idea that read/write
>>> should perform better with a page-aligned buffer.
>>
>> Historically, some Linux VFS read implementations could transfer the
>> data by mapping full pages (/dev/zero was one of them). I think they
>> have been gone for a long time because you need to copy lots and lots of
>> data (certainly more than 8K) before you lose against remapping and the
>> cache invalidation that comes with it.
>
> This seems like it would break horribly when the destination is
> anything but anonymous memory (presumably they checked that) and would
> perform atrociously (cost of locking vmas, possible TLB
> invalidation, etc.) especially when the size of the read is at most a
> few pages (which is the case for FILE buffers under normal usage). So
> it seems like, even if such a hack were possible, it was terribly
> misguided and would have pessimized performance for stdio.
I don't doubt that at all. But perhaps that was the reason to
page-align the buffer.

You can see the details here:

<https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=557ed1fa2620dc119adb86b34c614e152a629a80>

(Curiously, this introduced a regression, fixed in
730c586ad5228c339949b2eb4e72b80ae167abc4.)
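
For concreteness, the two allocation strategies compared in this thread
can be sketched roughly as follows. This is an illustration only, not
glibc's actual stdio code: the helper names (buf_alloc_heap,
buf_alloc_mmap, buf_free_mmap, is_page_aligned) and the STDIO_BUF_SIZE
macro are made up for the example, and the 8K size is simply the figure
quoted above.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define STDIO_BUF_SIZE 8192      /* hypothetical: the 8K figure from the thread */

/* Heap variant: calloc returns zeroed memory; no alignment beyond
   what malloc normally provides is promised. */
static void *buf_alloc_heap(size_t size)
{
    return calloc(1, size);
}

/* mmap variant: anonymous private mapping, zero-filled by the kernel
   and page-aligned. Returns NULL on failure. */
static void *buf_alloc_mmap(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Release a buffer obtained from buf_alloc_mmap. */
static void buf_free_mmap(void *p, size_t size)
{
    if (p != NULL) {
        int rc = munmap(p, size);
        assert(rc == 0);
        (void)rc;                /* silence unused warning under NDEBUG */
    }
}

/* Nonzero if p sits on a page boundary. */
static int is_page_aligned(const void *p)
{
    long page = sysconf(_SC_PAGESIZE);
    return page > 0 && ((uintptr_t)p % (uintptr_t)page) == 0;
}
```

A buffer from buf_alloc_mmap always satisfies is_page_aligned, at the
price of an mmap/munmap system call pair per stream; one from
buf_alloc_heap is released with plain free() and may or may not happen
to be page-aligned.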
--
Florian Weimer / Red Hat Product Security