This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH 0/2] Multiarch hooks for memcpy variants


On Wed, Aug 16, 2017 at 8:28 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Zack Weinberg wrote:
>>
>> Last time we had this argument, someone (Ondrej?) claimed that the
>> overhead of going through an ifunc for intra-libc calls (specifically
>> to memcpy, IIRC) was dwarfed by the I-cache costs of having both the
>> generic and the targeted version of the function get used. I would
>> really like to see measurements addressing that specific point.
>
> I think it might be more easily measured if we make the effect much worse,
> for example by adding several KB of NOPs at entry of generic memcpy.

I think this needs to be an A/B test of the real code before and after
the real proposed change (i.e. sending intra-libc calls to memcpy
through the PLT and the ifuncs) in order to resolve the argument to
everyone's satisfaction.  `perf`, looking specifically at all levels
of cache misses, ought to be able to pick out the signal even without
an artificial penalty.
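
For anyone who wants the two call paths spelled out, here is a rough
sketch of what is being compared. It is not glibc code: copy_generic,
copy_targeted, resolve_copy, copy_direct and copy_indirect are all
hypothetical names, and a plain function pointer stands in for the
PLT/ifunc indirection. It only shows the shape of the two paths and
says nothing about which one wins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Hypothetical stand-in for the generic implementation that
   intra-libc callers bind to directly today. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical stand-in for a CPU-targeted variant; it simply defers
   to the library memcpy so the sketch stays portable. */
static void *copy_targeted(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Assumption: a real resolver would inspect CPU features here.
   This one always picks the targeted variant. */
static copy_fn resolve_copy(void)
{
    return copy_targeted;
}

/* Emulates the slot that would hold the resolved address. */
static copy_fn resolved_copy;

/* Path A: direct call, always the generic code. */
void *copy_direct(void *dst, const void *src, size_t n)
{
    return copy_generic(dst, src, n);
}

/* Path B: one extra indirection, resolved on first use, so internal
   and external callers end up running the same code. */
void *copy_indirect(void *dst, const void *src, size_t n)
{
    if (resolved_copy == NULL)
        resolved_copy = resolve_copy();
    assert(resolved_copy != NULL);
    return resolved_copy(dst, src, n);
}
```

With path A a process can end up with both copy_generic and
copy_targeted hot in the I-cache; with path B only one of them is,
at the price of an indirect call. That trade-off is what the A/B
measurement would have to settle.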

> I could easily generate a trace of internal calls to memcpy, however the key
> question is which functions in GLIBC use memcpy in performance critical
> ways and which applications make heavy use of those?

I don't know.  Maybe start with whole-program tests on big complicated
applications like Firefox and LibreOffice?  Web and database servers
might also be interesting.

zw

