This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Save and restore xmm0-xmm7 in _dl_runtime_resolve


On Tue, Jul 28, 2015 at 1:55 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Jul 27, 2015 at 6:37 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Mon, Jul 27, 2015 at 6:26 AM, Ondřej Bílka <neleai@seznam.cz> wrote:
>>> On Mon, Jul 27, 2015 at 06:14:07AM -0700, H.J. Lu wrote:
>>>> >>
>>>> >> There is a potential performance issue.  This won't change parameters
>>>> >> passed in 256-bit/512-bit vector registers because SSE load will only
>>>> >> update the lower 128 bits of 256-bit/512-bit vector registers while
>>>> >> preserving the upper bits.  But these SSE load operations may not be
>>>> >> fast on all current and future processors.  To load the entire
>>>> >> 256-bit/512-bit vector registers, we need to check CPU features on
>>>> >> each symbol lookup.  On the other hand, we can compile x86-64 ld.so
>>>> >> with -msse2.  I don't know what the final performance impact is.
>>>> >>
>>>> > Yes, these should be saved due to problems with modes. The catch
>>>> > is that saving them may take longer. You don't need to
>>>> > check CPU features on each call.
>>>> > Make _dl_runtime_resolve a function pointer and, at
>>>> > startup, initialize it to the correct variant.
>>>>
>>>> One more indirect call.
>>>>
>>> No, my proposal is different. We could do this:
>>>
>>> void *_dl_runtime_resolve;
>>> void startup (void)
>>> {
>>>   if (has_avx())
>>>     _dl_runtime_resolve = _dl_runtime_resolve_avx;
>>>   else
>>>     _dl_runtime_resolve = _dl_runtime_resolve_sse;
>>> }
>>>
>>> Then we will assign the correct variant.
>>
>> Yes, this may work for both _dl_runtime_profile and
>>  _dl_runtime_resolve.  I will see what I can do.
>>
>
> Please try hjl/pr18661 branch.  I implemented:
>
> 0000000000016fd0 t _dl_runtime_profile_avx
> 0000000000016b50 t _dl_runtime_profile_avx512
> 0000000000017450 t _dl_runtime_profile_sse
> 00000000000168d0 t _dl_runtime_resolve_avx
> 0000000000016780 t _dl_runtime_resolve_avx512
> 0000000000016a20 t _dl_runtime_resolve_sse

I enabled SSE in ld.so and it works fine.
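
For readers following the thread, the startup-time selection discussed above can be sketched in plain C. This is only an illustration under stated assumptions, not the code on the hjl/pr18661 branch: the names resolve_sse, resolve_avx, resolve_avx512, select_resolver and runtime_resolve are hypothetical stand-ins for the assembly trampolines, and CPU detection uses the GCC/Clang builtin __builtin_cpu_supports rather than whatever ld.so does internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-ISA trampolines listed above; the
   real ones are assembly that saves and restores vector registers.  */
enum variant { VARIANT_SSE, VARIANT_AVX, VARIANT_AVX512 };

static enum variant resolve_sse (void) { return VARIANT_SSE; }
static enum variant resolve_avx (void) { return VARIANT_AVX; }
static enum variant resolve_avx512 (void) { return VARIANT_AVX512; }

typedef enum variant (*resolve_fn) (void);

/* Pure selection logic, kept separate so it can be checked without
   depending on the host CPU.  Widest supported variant wins.  */
static resolve_fn
select_resolver (int has_avx512f, int has_avx)
{
  if (has_avx512f)
    return resolve_avx512;
  if (has_avx)
    return resolve_avx;
  return resolve_sse;
}

/* Set once at startup; later calls go through it with no feature test.  */
static resolve_fn runtime_resolve;

static void
startup (void)
{
  int has_avx512f = 0, has_avx = 0;
#if defined (__GNUC__) && (defined (__x86_64__) || defined (__i386__))
  /* GCC/Clang builtins (an assumption of this sketch); any other
     compiler or target falls back to the SSE variant.  */
  __builtin_cpu_init ();
  has_avx512f = __builtin_cpu_supports ("avx512f");
  has_avx = __builtin_cpu_supports ("avx");
#endif
  runtime_resolve = select_resolver (has_avx512f, has_avx);
  assert (runtime_resolve != NULL);
}
```

After startup () has run once, calling runtime_resolve () dispatches straight to the selected variant, so the feature check is paid a single time rather than on each symbol lookup.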


-- 
H.J.

