This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v2] Single threaded stdio optimization
On 29/06/17 13:12, Carlos O'Donell wrote:
> On 06/29/2017 08:01 AM, Siddhesh Poyarekar wrote:
>> On Thursday 29 June 2017 05:11 PM, Siddhesh Poyarekar wrote:
>>> The patch looks OK except for the duplication (and a missing comment
>>> below), which looks a bit clumsy. How about something like this instead:
>>>
>>>   bool need_lock = _IO_need_lock (fp);
>>>
>>>   if (need_lock)
>>>     _IO_flockfile (fp);
>>>   result = _IO_ferror_unlocked (fp);
>>>   if (need_lock)
>>>     _IO_funlockfile (fp);
>>>
>>>   return result;
>>>
>>> You could probably make some kind of a macro out of this, I haven't
>>> looked that hard.
>>
>> I forgot that Torvald had commented (off-list, the thread broke somehow)
>> that it would be important to try and measure how much worse this makes
>> the multi-threaded case.
>
> +1
>
> If we are going to optimize the single threaded case we need to know what
> impact this has on the multi-threaded case.
>
note that this impacts the multi-threaded case less
than the lowlevellock approach that is currently
implemented: that approach adds two checks where my
code adds one, and it loads __libc_multiple_threads
twice, whereas mine checks a flag in fp, which is in
a cache line that is most likely already accessed by
the rest of the io code.

i cannot produce numbers immediately: the last
time i measured this, adding a dummy thread via an
LD_PRELOADed lib had more effect on the timing
of the same binary than adding a branch in the
stdio code. (i'm not sure why the additional thread
affects timing so much with the current code. it
might be a cpu issue, e.g. cache aliasing caused
by a slightly different layout of the loaded libs,
but it also shows that the effect of the patch
is small.)
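
to make the "one check on a flag in fp" point concrete, here is a
minimal standalone sketch of the pattern. it is not glibc code:
struct demo_file, its need_lock field, the spin lock and the
lock_count counter are all hypothetical stand-ins for the real FILE
object, _IO_need_lock and _IO_flockfile/_IO_funlockfile, chosen only
so the example compiles on its own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a stdio stream; NOT glibc's struct _IO_FILE.  */
struct demo_file
{
  int error;         /* Sticky error indicator.  */
  bool need_lock;    /* Assumed to be set once locking becomes necessary.  */
  atomic_flag lock;  /* Trivial spin lock standing in for the stream lock.  */
  int lock_count;    /* Demo only: how many times the lock was taken.  */
};

static void
demo_lock (struct demo_file *fp)
{
  /* Spin until the flag was previously clear.  */
  while (atomic_flag_test_and_set_explicit (&fp->lock, memory_order_acquire))
    ;
  fp->lock_count++;
}

static void
demo_unlock (struct demo_file *fp)
{
  atomic_flag_clear_explicit (&fp->lock, memory_order_release);
}

static int
demo_ferror_unlocked (struct demo_file *fp)
{
  return fp->error != 0;
}

int
demo_ferror (struct demo_file *fp)
{
  assert (fp != NULL);

  /* A single load from the stream itself decides whether to lock; the
     stream is touched by the unlocked operation below anyway.  */
  bool need_lock = fp->need_lock;
  int result;

  if (need_lock)
    demo_lock (fp);
  result = demo_ferror_unlocked (fp);
  if (need_lock)
    demo_unlock (fp);

  return result;
}
```

with need_lock clear the call never touches the lock; once it is set,
every call pays for the lock plus the one extra branch. how much that
branch costs in the multi-threaded case is exactly what would still
need measuring.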