This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 4/4] S390: Implement mempcpy with help of memcpy. [BZ #19765]
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Cc: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, nd <nd at arm dot com>, GNU C Library <libc-alpha at sourceware dot org>
- Date: Thu, 5 May 2016 09:36:42 -0700
- Subject: Re: [PATCH 4/4] S390: Implement mempcpy with help of memcpy. [BZ #19765]
- References: <AM3PR08MB00888058CAFD723F21D2342D837B0 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com> <572A3B9C dot 3080803 at linaro dot org> <AM3PR08MB00882DA2CEDFC95A1BB79776837B0 at AM3PR08MB0088 dot eurprd08 dot prod dot outlook dot com> <572A6271 dot 6050802 at linaro dot org> <CAMe9rOrcKzp_bk__3iYxROBdi9=DOSw3HzrZsUrEYCSXXVtCmg at mail dot gmail dot com> <572B559B dot 5080301 at linaro dot org> <CAMe9rOrJgz6Xk6Wu9EdUp4gNn7E7zWkROH-ApZb7ZXteUcMjUg at mail dot gmail dot com> <E92E18C5-BA96-4B82-924F-516B4F5768B8 at linaro dot org>
On Thu, May 5, 2016 at 9:34 AM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
>
>
>> On May 5, 2016, at 11:45, H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>> On Thu, May 5, 2016 at 7:15 AM, Adhemerval Zanella
>> <adhemerval.zanella@linaro.org> wrote:
>>>
>>>
>>>> On 05/05/2016 10:37, H.J. Lu wrote:
>>>> On Wed, May 4, 2016 at 1:58 PM, Adhemerval Zanella
>>>> <adhemerval.zanella@linaro.org> wrote:
>>>>>
>>>>>
>>>>>> On 04/05/2016 17:51, Wilco Dijkstra wrote:
>>>>>> Adhemerval Zanella wrote:
>>>>>>>
>>>>>>> But my point is that all the architectures which provide an optimized mempcpy
>>>>>>> do so by either 1. jumping directly to the optimized memcpy (the s390 case for this
>>>>>>> patchset), 2. cloning the same memcpy implementation and adjusting the pointers
>>>>>>> (x86_64), or 3. using a similar strategy for both implementations (powerpc).
>>>>>>
>>>>>> Indeed, which of those are used doesn't matter much.
>>>>>>
>>>>>>> So for the change I am proposing, compiler support won't be required because both
>>>>>>> memcpy and __mempcpy will be transformed to memcpy + s. Based on the assumption that
>>>>>>> memcpy is as fast as the mempcpy implementation, I think there is no need to add
>>>>>>> this micro-optimization to s390 only, but rather to make it general.
>>>>>>
>>>>>> GLIBC already has this optimization in the generic string header; it's just that s390 wants
>>>>>> to do something different again. As long as GCC isn't fixed, it isn't possible to support
>>>>>> s390 without this header workaround. And we need GCC to improve so things work
>>>>>> better for all the other C libraries...
>>>>>
>>>>> But the current one in string/string.h is only enabled with !defined _HAVE_STRING_ARCH_mempcpy,
>>>>> so if a port actually adds its own mempcpy, it won't be enabled. What I am trying to argue is
>>>>> to just remove the !defined _HAVE_STRING_ARCH_mempcpy check and enable it by default for all
>>>>> ports.
>>>>
>>>> Please don't enable it for x86. Calling memcpy means we have to
>>>> save and restore 2 registers for no good reason.
>>>
>>> Yes, a direct call will require saving and restoring the size for the
>>> later add, and this is true for most architectures. My question is whether
>>> this really matters for current GLIBC internal usage and for programs that
>>> might use it, compared against the burden of keeping the various
>>> string*.h headers in check for multiple architectures or of adding this
>>> logic (the mempcpy-to-memcpy transformation) to the compiler.
>>
>> What burden? There is nothing to do in glibc for x86. GCC can
>> inline mempcpy for x86.
>
> In fact I am objecting to all the bits GLIBC added to string*.h that only add complexity for some micro-optimizations. For x86 I do agree that transforming mempcpy to memcpy is not the best strategy.
>
> My rationale is to avoid adding even more arch-specific bits to installed headers for such optimizations.
I believe most of those micro-optimizations belong to GCC, not glibc.
Of course, we should keep the existing ones for older GCCs. We
should avoid adding new ones.
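
For reference, the transformation under discussion can be sketched
roughly as below. This is only an illustration, not the actual glibc
header code: the name sketch_mempcpy is hypothetical, and the real
header guards (such as _HAVE_STRING_ARCH_mempcpy) are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only; not the glibc macro.
   It expresses mempcpy as memcpy plus the byte count.  */
static inline void *
sketch_mempcpy (void *dest, const void *src, size_t n)
{
  /* memcpy returns dest, so the end pointer is dest + n.  The value of
     n has to stay live across the call to memcpy so that it can be
     added afterwards, which is the save/restore cost raised earlier
     in this thread.  */
  return (char *) memcpy (dest, src, n) + n;
}
```

With a wrapper like this, sketch_mempcpy (buf, "abc", 3) returns
buf + 3, so a caller can chain copies without recomputing the offset.
Whether the extra live value across the call matters in practice
depends on the target and on how the compiler handles the call.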
--
H.J.