This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Save and restore xmm0-xmm7 in _dl_runtime_resolve


On Sat, Jul 11, 2015 at 4:50 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sat, Jul 11, 2015 at 01:27:42PM -0700, H.J. Lu wrote:
>> On Sat, Jul 11, 2015 at 12:46:54PM +0200, Ondřej Bílka wrote:
>> > On Thu, Jul 09, 2015 at 09:07:24AM -0700, H.J. Lu wrote:
>> > > On Thu, Jul 9, 2015 at 7:28 AM, Ondřej Bílka <neleai@seznam.cz> wrote:
>> > > > On Thu, Jul 09, 2015 at 07:12:24AM -0700, H.J. Lu wrote:
>> > > >> On Thu, Jul 9, 2015 at 6:37 AM, Zamyatin, Igor <igor.zamyatin@intel.com> wrote:
>> > > >> >> On Wed, Jul 8, 2015 at 8:56 AM, Zamyatin, Igor <igor.zamyatin@intel.com>
>> > > >> >> wrote:
>> > > >> >> > Fixed in the attached patch
>> > > >> >> >
>> > > >> >>
>> > > >> >> I fixed some typos and updated sysdeps/i386/configure for
>> > > >> >> HAVE_MPX_SUPPORT.  Please verify both with HAVE_MPX_SUPPORT and
>> > > >> >> without on i386 and x86-64.
>> > > >> >
>> > > >> > Done, all works fine
>> > > >> >
>> > > >>
>> > > >> I checked it in for you.
>> > > >>
>> > > > These are nice, but you could have the same problem with lazy TLS
>> > > > allocation. I wrote a patch to merge the trampolines, which now
>> > > > conflicts. Could you write a similar patch to solve that? The original
>> > > > purpose was to always save the xmm registers so we could use SSE2
>> > > > routines, which speed up lookup time.
>> > >
>> > > So we will preserve only xmm0 to xmm7 in _dl_runtime_resolve? How
>> > > much gain will it give us?
>> > >
>> > I couldn't measure that without the patch. The gain now would be big,
>> > as we currently use a byte-by-byte loop to check the symbol name, which
>> > is slow, especially with C++ name mangling. Would the following benchmark
>> > be good for measuring the speedup, or do I need to measure startup time,
>> > which is a bit harder?
>> >
>>
>> Please try this.
>>
>
> We have to use movups instead of movaps due to
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066
>

It won't work, since the stack may be misaligned when SSE2 functions
are called.  Please try the glibc hjl/pr18661 branch, where I fixed a few
stack misalignment bugs in x86-64 assembly code and save/restore
xmm0-xmm7, together with GCC hjl/pr58066/gcc-5-branch.


-- 
H.J.
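
Illustration (not code from the patch or from either branch): the
movups/movaps point above comes down to whether the save slot is
16-byte aligned.  movaps faults on a memory operand that is not
16-byte aligned, while movups accepts any address.  The sketch below
is a portable C stand-in, assuming a 16-byte register image; the names
vec128, is_aligned16, align_up16 and roundtrip_at_offset are
hypothetical, and memcpy plays the role of the unaligned move.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 128-bit contents of one xmm register. */
typedef struct { unsigned char bytes[16]; } vec128;

/* Nonzero if p meets the 16-byte alignment that movaps requires. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Round p up to the next 16-byte boundary (p itself if already aligned). */
static unsigned char *align_up16(unsigned char *p)
{
    return p + ((16u - ((uintptr_t)p & 15u)) & 15u);
}

/* Save a 16-byte value into a slot that starts `off` bytes past an
   aligned base, then restore it.  memcpy has no alignment requirement,
   so this works for any offset, as an unaligned move would.
   Returns 1 if the restored value matches the original. */
static int roundtrip_at_offset(size_t off)
{
    unsigned char raw[48];              /* room for alignment slack + slot */
    unsigned char *base = align_up16(raw);
    vec128 in, out;

    assert(off <= 16);
    for (size_t i = 0; i < sizeof in.bytes; i++)
        in.bytes[i] = (unsigned char)(i * 7 + 1);
    memset(&out, 0, sizeof out);

    memcpy(base + off, &in, sizeof in);   /* "save" */
    memcpy(&out, base + off, sizeof out); /* "restore" */
    return memcmp(&in, &out, sizeof in) == 0;
}
```

An offset of 8 models a stack pointer that is 8 bytes off a 16-byte
boundary: is_aligned16 reports the slot as misaligned, yet the
round trip still succeeds because no aligned access is assumed.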

