This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: Seeking consensus on BZ 16734

From: Florian Weimer <fweimer at redhat dot com>
To: Rich Felker <dalias at libc dot org>
Cc: Paul Pluzhnikov <ppluzhnikov at google dot com>, "H.J. Lu" <hjl dot tools at gmail dot com>, GLIBC Devel <libc-alpha at sourceware dot org>, Daniel Colascione <dancol at dancol dot org>, Andrew Pinski <pinskia at gmail dot com>
Date: Wed, 11 Feb 2015 15:10:14 +0100
Subject: Re: Seeking consensus on BZ 16734
Authentication-results: sourceware.org; auth=none
References: <CALoOobP_7jpdZUqSFmKCTFds6t8TTdnxfOfg2jCTr_TjvU+t2w at mail dot gmail dot com> <CAMe9rOrp6jCuPe4ZX-kdHdO_4_k-Dpf7ha-PxtCJmJLnL3K0-A at mail dot gmail dot com> <CALoOobMZFx7c+i0GCFRg1-1Z=2H3xDDH8+td-D=0k9muAFvPAA at mail dot gmail dot com> <20150202051410 dot GG23507 at brightrain dot aerifal dot cx> <54DB5A67 dot 309 at redhat dot com> <20150211140240 dot GW23507 at brightrain dot aerifal dot cx>

On 02/11/2015 03:02 PM, Rich Felker wrote:
> On Wed, Feb 11, 2015 at 02:34:31PM +0100, Florian Weimer wrote:
>> On 02/02/2015 06:14 AM, Rich Felker wrote:
>>> On Sun, Feb 01, 2015 at 08:46:06PM -0800, Paul Pluzhnikov wrote:
>>>> On Sun, Feb 1, 2015 at 8:09 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>
>>>>>> Can we just do it?
>>>>>>
>>>>>
>>>>> Do we have any current performance data on this?
>>>>
>>>> I am not sure what performance data you want.
>>>>
>>>> The application CPU will go up (calloc has to zero out space), kernel
>>>> CPU will go down (kernel would not have to zero out the same space).
>>>>
>>>> It's clear that calloc()ing 8K is much cheaper than mmap()ing,
>>>> especially when there are 100s of threads.
>>>
>>> The original idea seems to be some misguided idea that read/write
>>> should perform better with a page-aligned buffer.
>>
>> Historically, some Linux VFS read implementations could transfer the
>> data by mapping full pages (/dev/zero was one of them).  I think they
>> have been gone for a long time because you need to copy lots and lots of
>> data (certainly more than 8K) before you lose against remapping and the
>> cache invalidation that comes with it.
> 
> This seems like it would break horribly when the destination is
> anything but anonymous memory (presumably they checked that) and would
> perform atrociously bad (cost of locking vmas, possible TLB
> invalidation, etc.) especially when the size of the read is at most a
> few pages (which is the case for FILE buffers under normal usage). So
> it seems like, even if such a hack were possible, it was terribly
> misguided and would have pessimized performance for stdio.

I don't doubt that at all.  But perhaps that was the reason to
page-align the buffer.

You can see the details here:

<https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=557ed1fa2620dc119adb86b34c614e152a629a80>

(Curiously, this introduced a regression, fixed in
730c586ad5228c339949b2eb4e72b80ae167abc4.)
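
For reference, a minimal C sketch of the two allocation strategies compared in this thread. It is an illustration under stated assumptions, not glibc's actual stdio code: the 8K size is taken from the figure quoted above, and the names BUF_SIZE, buf_alloc_calloc, buf_alloc_mmap and is_page_aligned are hypothetical. It shows only the observable difference: calloc() returns memory zeroed in user space with no page-alignment guarantee, while an anonymous mmap() returns kernel-zeroed, page-aligned memory.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS on glibc */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical buffer size, matching the "8K" mentioned in the thread. */
#define BUF_SIZE 8192

/* Heap allocation: zero-filled, but not guaranteed to be page-aligned.
   Release with free(). */
static void *buf_alloc_calloc(void)
{
    return calloc(1, BUF_SIZE);
}

/* Anonymous private mapping: zero-filled by the kernel and always
   page-aligned.  Release with munmap(p, BUF_SIZE). */
static void *buf_alloc_mmap(void)
{
    void *p = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 1 if p sits on a page boundary, 0 otherwise. */
static int is_page_aligned(const void *p)
{
    long page = sysconf(_SC_PAGESIZE);
    return page > 0 && ((uintptr_t)p % (uintptr_t)page) == 0;
}
```

Calling is_page_aligned() on the result of buf_alloc_mmap() always yields 1; for buf_alloc_calloc() the result depends on the allocator. Which variant is cheaper is a separate question that this sketch does not measure.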


-- 
Florian Weimer / Red Hat Product Security

Follow-Ups:
- Re: Seeking consensus on BZ 16734
  - From: Ondřej Bílka

References:
- Seeking consensus on BZ 16734
  - From: Paul Pluzhnikov
- Re: Seeking consensus on BZ 16734
  - From: H.J. Lu
- Re: Seeking consensus on BZ 16734
  - From: Paul Pluzhnikov
- Re: Seeking consensus on BZ 16734
  - From: Rich Felker
- Re: Seeking consensus on BZ 16734
  - From: Florian Weimer
- Re: Seeking consensus on BZ 16734
  - From: Rich Felker