This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 09/10] i386: Replace assembly versions of e_log2f with generic e_log2f.c
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>, nd <nd at arm dot com>
- Date: Fri, 20 Oct 2017 08:00:08 -0700
- References: <20171019173159.21402-1-hjl.tools@gmail.com> <20171019173159.21402-10-hjl.tools@gmail.com> <CAMe9rOqwmBj8zFd1FiVqQR=+sOz8qQwz3RhnDpnmPEZxiw290Q@mail.gmail.com> <59EA0E54.5080204@arm.com>
On Fri, Oct 20, 2017 at 7:55 AM, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> On 19/10/17 20:51, H.J. Lu wrote:
>> On Thu, Oct 19, 2017 at 10:31 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> This patch replaces i386 assembly versions of e_log2f with generic
>>> e_log2f.c. For workload-spec2017.wrf, on Nehalem, it improves
>>> performance by:
>>>
>>>                        Before    After    Improvement
>>> reciprocal-throughput  92.3845   30.8752  199%
>>> latency                112.855   54.8645  105%
>>>
>>> On Skylake, it improves performance by:
>>>
>>>                        Before    After    Improvement
>>> reciprocal-throughput  98.7488   22.7507  334%
>>> latency                118.01    51.6083  128%
>>
>> On IvyBridge with --disable-multi-arch, it improves performance by:
>>
>>                        Before    After    Improvement
>> reciprocal-throughput  106.635   28.8596  269%
>> latency                129.888   56.9187  128%
>>
>
> is this comparing x87 c code with x87 asm?
Yes. I double checked. There are no SSE instructions in libm.so.
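
The message does not say how the check was done. One plausible way to
repeat it, sketched here under assumptions: disassemble the library with
GNU objdump (assumed to be on PATH) and look for %xmm register operands.
The helper names are hypothetical, and the scan is only a heuristic: it
misses SSE instructions that take no %xmm operand, such as ldmxcsr.

```python
import re
import subprocess

# Matches SSE register operands such as %xmm0 in AT&T-syntax disassembly.
XMM_RE = re.compile(r"%xmm\d+")


def uses_sse(disassembly: str) -> bool:
    """Return True if the disassembly text references any %xmm register."""
    return XMM_RE.search(disassembly) is not None


def disassemble(path: str) -> str:
    """Disassemble the executable sections of `path` with `objdump -d`.

    Assumes GNU objdump (binutils) is installed and on PATH.
    """
    result = subprocess.run(
        ["objdump", "-d", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

With a built library at hand, uses_sse(disassemble("libm.so")) would be
expected to return False for an x87-only build; the path "libm.so" is a
placeholder for wherever the build puts the library.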
> or did the toolchain have -mfpmath=sse?
>
> i would not expect that much speedup on i386
>
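
For readers checking the quoted tables: the Improvement column is
consistent with (Before / After - 1) * 100, with the result apparently
truncated rather than rounded (the Nehalem latency row works out to about
105.7 and is listed as 105%). A minimal sketch of that arithmetic follows;
the function name is hypothetical and the formula is inferred from the
numbers, not stated in the thread.

```python
def improvement_percent(before: float, after: float) -> int:
    """Speedup as a truncated percentage: (before / after - 1) * 100."""
    return int((before / after - 1.0) * 100.0)


# (before, after) timings copied from the quoted tables.
rows = {
    "Nehalem reciprocal-throughput": (92.3845, 30.8752),
    "Nehalem latency": (112.855, 54.8645),
    "Skylake reciprocal-throughput": (98.7488, 22.7507),
    "Skylake latency": (118.01, 51.6083),
    "IvyBridge reciprocal-throughput": (106.635, 28.8596),
    "IvyBridge latency": (129.888, 56.9187),
}

for name, (before, after) in rows.items():
    print(f"{name}: {improvement_percent(before, after)}%")
```

Both columns are timings where lower is better, so a 199% improvement
means the new code takes roughly one third of the original time.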
--
H.J.