This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 0/2] Multiarch hooks for memcpy variants


On Mon, Aug 14, 2017 at 6:36 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Zack Weinberg wrote:
>> On Fri, Aug 11, 2017 at 2:53 PM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
>> > On Friday 11 August 2017 11:36 PM, Zack Weinberg wrote:
>>>>> may be the generic __memcpy_chk should call the ifunced
>>>>> memcpy so it goes through an extra plt indirection, but
>>>>> at least less target specific code is needed.
>>>>
>>>> I was thinking of making this suggestion myself.  I think that would
>>>> be a better maintainability/efficiency tradeoff.  (Of course, I also
>>>> think we shouldn't bypass ifuncs for intra-libc calls.)
>>>
>>> That was my initial approach, but I was under the impression that PLTs
>>> in internal calls were frowned upon, hence the ifuncs similar to what's
>>> done in x86.  If this is acceptable, I could do more tests to check
>>> gains within the library if we were to call memcpy via ifunc.
>>
>> There's been a bunch of inconclusive arguments about this in the past.
>> If you have the time and the resources to do some thorough testing and
>> properly resolve the question, that would be really great.
>
> I don't believe you can resolve this generally, it's highly dependent on the details.
> If the generic implementation is very efficient, the possible gain of specialized
> ifuncs may be so low that it can never offset the overhead of an ifunc. Also note
> that you're always slowing down the generic case, so if that version is used in
> many cases, an ifunc wouldn't make sense.
>
> I haven't looked in detail at memcpy use in GLIBC, however if the statistics are
> similar to typical use I measured then it makes no sense to use ifuncs. Large
> copies can benefit from special tweaks, and in that case the overhead of an ifunc
> would be much smaller (both relatively and absolutely due to lower frequency), so
> that's where an ifunc might be useful.

Last time we had this argument, someone (Ondrej?) claimed that the
overhead of going through an ifunc for intra-libc calls (specifically
to memcpy, IIRC) was dwarfed by the I-cache costs of having both the
generic and the targeted version of the function get used. I would
really like to see measurements addressing that specific point.

zw


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]