This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: Massive performance regression of glibc string functions
On Sat, Nov 7, 2009 at 12:58 AM, Petr Baudis <pasky@suse.cz> wrote:
> On Fri, Nov 06, 2009 at 10:20:41AM -0700, H.J. Lu wrote:
>> I am using the rdtsc timing in glibc string tests. Here is strlen data on
>>
>> Intel(R) Xeon(R) CPU X3350 @ 2.66GHz
> ..snip..
>>
>> Data on memcmp and strcmp show similar results. The new ones
>> in glibc 2.11 are much better than the old ones in glibc 2.9.
>
> I think the one you have shown exactly matches my findings - I also
> think strlen() in glibc-2.11 is much better than in glibc-2.9 (except on
> AMD and very small strings). But that is the only one of these I tested;
> could you please post the same numbers for e.g. memcmp()?
memcmp_2_11 memcmp 2.9
LAT: Len 1, alignment 13/13: 8 16
LAT: Len 1, alignment 13/13: 8 16
LAT: Len 1, alignment 13/13: 8 16
LAT: Len 2, alignment 12/12: 16 24
LAT: Len 2, alignment 12/12: 16 24
LAT: Len 2, alignment 12/12: 16 24
LAT: Len 3, alignment 10/10: 16 24
LAT: Len 3, alignment 10/10: 24 24
LAT: Len 3, alignment 10/10: 24 24
LAT: Len 4, alignment 8/ 8: 16 24
LAT: Len 4, alignment 8/ 8: 16 24
LAT: Len 4, alignment 8/ 8: 16 24
LAT: Len 5, alignment 6/ 6: 16 32
LAT: Len 5, alignment 6/ 6: 24 24
LAT: Len 5, alignment 6/ 6: 24 24
LAT: Len 6, alignment 4/ 4: 16 32
LAT: Len 6, alignment 4/ 4: 24 32
LAT: Len 6, alignment 4/ 4: 24 32
LAT: Len 7, alignment 2/ 2: 16 32
LAT: Len 7, alignment 2/ 2: 24 32
LAT: Len 7, alignment 2/ 2: 24 32
LAT: Len 8, alignment 0/ 0: 16 40
LAT: Len 8, alignment 0/ 0: 24 32
LAT: Len 8, alignment 0/ 0: 24 32
LAT: Len 9, alignment 14/14: 16 56
LAT: Len 9, alignment 14/14: 24 32
LAT: Len 9, alignment 14/14: 24 32
LAT: Len 10, alignment 12/12: 16 40
LAT: Len 10, alignment 12/12: 24 40
LAT: Len 10, alignment 12/12: 24 40
LAT: Len 11, alignment 10/10: 24 48
LAT: Len 11, alignment 10/10: 24 40
LAT: Len 11, alignment 10/10: 24 40
LAT: Len 12, alignment 8/ 8: 16 48
LAT: Len 12, alignment 8/ 8: 24 40
LAT: Len 12, alignment 8/ 8: 24 40
LAT: Len 13, alignment 6/ 6: 24 48
LAT: Len 13, alignment 6/ 6: 24 40
LAT: Len 13, alignment 6/ 6: 24 40
LAT: Len 14, alignment 4/ 4: 24 56
LAT: Len 14, alignment 4/ 4: 24 48
LAT: Len 14, alignment 4/ 4: 24 48
LAT: Len 15, alignment 2/ 2: 24 56
LAT: Len 15, alignment 2/ 2: 24 48
LAT: Len 15, alignment 2/ 2: 24 48
LAT: Len 1, alignment 0/ 0: 8 16
LAT: Len 1, alignment 0/ 0: 8 16
LAT: Len 1, alignment 0/ 0: 8 16
LAT: Len 2, alignment 0/ 0: 16 24
LAT: Len 2, alignment 0/ 0: 16 24
LAT: Len 2, alignment 0/ 0: 16 24
LAT: Len 3, alignment 0/ 0: 16 24
LAT: Len 3, alignment 0/ 0: 24 24
LAT: Len 3, alignment 0/ 0: 24 24
LAT: Len 4, alignment 0/ 0: 16 24
LAT: Len 4, alignment 0/ 0: 16 24
LAT: Len 4, alignment 0/ 0: 16 24
LAT: Len 5, alignment 0/ 0: 16 32
LAT: Len 5, alignment 0/ 0: 24 24
LAT: Len 5, alignment 0/ 0: 24 24
LAT: Len 6, alignment 0/ 0: 16 32
LAT: Len 6, alignment 0/ 0: 24 32
LAT: Len 6, alignment 0/ 0: 24 32
LAT: Len 7, alignment 0/ 0: 16 32
LAT: Len 7, alignment 0/ 0: 24 32
LAT: Len 7, alignment 0/ 0: 24 32
LAT: Len 8, alignment 0/ 0: 16 40
LAT: Len 8, alignment 0/ 0: 24 32
LAT: Len 8, alignment 0/ 0: 24 32
LAT: Len 9, alignment 0/ 0: 16 56
LAT: Len 9, alignment 0/ 0: 24 32
LAT: Len 9, alignment 0/ 0: 24 32
LAT: Len 10, alignment 0/ 0: 16 40
LAT: Len 10, alignment 0/ 0: 24 40
LAT: Len 10, alignment 0/ 0: 24 40
LAT: Len 11, alignment 0/ 0: 24 48
LAT: Len 11, alignment 0/ 0: 24 40
LAT: Len 11, alignment 0/ 0: 24 40
LAT: Len 12, alignment 0/ 0: 16 48
LAT: Len 12, alignment 0/ 0: 24 40
LAT: Len 12, alignment 0/ 0: 24 40
LAT: Len 13, alignment 0/ 0: 24 48
LAT: Len 13, alignment 0/ 0: 24 40
LAT: Len 13, alignment 0/ 0: 24 40
LAT: Len 14, alignment 0/ 0: 24 56
LAT: Len 14, alignment 0/ 0: 24 48
LAT: Len 14, alignment 0/ 0: 24 48
LAT: Len 15, alignment 0/ 0: 24 56
LAT: Len 15, alignment 0/ 0: 24 48
LAT: Len 15, alignment 0/ 0: 24 48
LAT: Len 4, alignment 0/ 0: 16 24
LAT: Len 4, alignment 0/ 0: 16 24
LAT: Len 4, alignment 0/ 0: 16 24
LAT: Len 32, alignment 0/ 0: 32 32
LAT: Len 32, alignment 13/14: 40 64
LAT: Len 32, alignment 0/ 0: 32 64
LAT: Len 32, alignment 0/ 0: 32 64
LAT: Len 8, alignment 0/ 0: 16 40
LAT: Len 8, alignment 0/ 0: 24 32
LAT: Len 8, alignment 0/ 0: 24 32
LAT: Len 64, alignment 0/ 0: 40 40
LAT: Len 64, alignment 14/12: 112 88
LAT: Len 64, alignment 0/ 0: 32 72
LAT: Len 64, alignment 0/ 0: 32 104
LAT: Len 16, alignment 0/ 0: 24 32
LAT: Len 16, alignment 0/ 0: 24 56
LAT: Len 16, alignment 0/ 0: 24 56
LAT: Len 128, alignment 0/ 0: 48 56
LAT: Len 128, alignment 14/12: 144 120
LAT: Len 128, alignment 0/ 0: 40 88
LAT: Len 128, alignment 0/ 0: 40 88
>> If you believe there is a regression, please provide length as well
>> as alignments on input data. I will take a look.
>
> The lengths are the numbers after function names - i.e. I'm testing with
> 4, 8, 32 and 128. All the values are 8-aligned, I can test misaligned
> strings too if you think 2.11 will do better there.
>
Your test compares timings of 2 implementations in 2 C libraries on
2 sets of random data. You should compare 2 implementations on the
same set of data linked against the same C library.
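
To make that concrete, here is a minimal sketch of such a comparison. It is
not the glibc string test harness. memcmp_bytewise is a hypothetical stand-in
for whichever second implementation is being measured, and it uses the
portable clock() in place of rdtsc, so the results are nanoseconds of CPU
time rather than cycles. All helper names and the constants are assumptions
made for illustration. Both candidates live in one binary, are linked against
one C library, and are handed the same two buffers at the same alignments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef int (*memcmp_fn)(const void *, const void *, size_t);

/* Hypothetical stand-in for the second implementation under test. */
static int memcmp_bytewise(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a, *q = b;
    for (size_t i = 0; i < n; i++)
        if (p[i] != q[i])
            return p[i] < q[i] ? -1 : 1;
    return 0;
}

/* The C library's memcmp, wrapped so both candidates share one signature. */
static int memcmp_libc(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n);
}

enum { ITERS = 200000, ROUNDS = 5, MAX_ALIGN = 16 };

static volatile int sink; /* keeps the timed calls from being optimised away */

/* Round a pointer up to the next MAX_ALIGN boundary. */
static unsigned char *align_up(unsigned char *p)
{
    uintptr_t u = (uintptr_t)p;
    return p + ((MAX_ALIGN - u % MAX_ALIGN) % MAX_ALIGN);
}

/* Best-of-ROUNDS average cost of one call, in nanoseconds of CPU time. */
static double time_one(memcmp_fn fn, const unsigned char *a,
                       const unsigned char *b, size_t len)
{
    memcmp_fn volatile call = fn; /* force a real indirect call */
    double best = -1.0;
    for (int r = 0; r < ROUNDS; r++) {
        clock_t t0 = clock();
        for (int i = 0; i < ITERS; i++)
            sink += call(a, b, len);
        double ns = (double)(clock() - t0) / CLOCKS_PER_SEC * 1e9 / ITERS;
        if (best < 0.0 || ns < best)
            best = ns;
    }
    return best;
}

/* Time both candidates on the SAME buffers and alignments.
   Returns 0 on success, -1 on a bad alignment or allocation failure. */
int compare_on_same_data(size_t len, size_t align_a, size_t align_b,
                         double *ns_libc, double *ns_other)
{
    if (align_a >= MAX_ALIGN || align_b >= MAX_ALIGN)
        return -1;
    unsigned char *raw_a = malloc(len + 2 * MAX_ALIGN);
    unsigned char *raw_b = malloc(len + 2 * MAX_ALIGN);
    if (!raw_a || !raw_b) {
        free(raw_a);
        free(raw_b);
        return -1;
    }
    unsigned char *a = align_up(raw_a) + align_a;
    unsigned char *b = align_up(raw_b) + align_b;

    srand(12345); /* fixed seed: every run sees identical data */
    for (size_t i = 0; i < len; i++)
        a[i] = (unsigned char)(rand() & 0xff);
    memcpy(b, a, len); /* equal inputs, so all len bytes are compared */

    /* Both candidates must agree before their timings mean anything. */
    assert(memcmp_libc(a, b, len) == 0 && memcmp_bytewise(a, b, len) == 0);

    *ns_libc = time_one(memcmp_libc, a, b, len);
    *ns_other = time_one(memcmp_bytewise, a, b, len);
    free(raw_a);
    free(raw_b);
    return 0;
}
```

Calling compare_on_same_data(32, 13, 14, &t_libc, &t_other) would time both
routines on one 32-byte input at alignments 13/14. Any difference between
the two numbers then comes from the implementations, not from differing
random data or a different C library.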
--
H.J.