This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC] Porting string performance tests into benchtests
- From: David Miller <davem at davemloft dot net>
- To: roland at hack dot frob dot com
- Cc: siddhesh at redhat dot com, libc-alpha at sourceware dot org
- Date: Thu, 11 Apr 2013 18:46:41 -0400 (EDT)
- Subject: Re: [RFC] Porting string performance tests into benchtests
- References: <20130411205335 dot 32E3D2C091 at topped-with-meat dot com> <20130411 dot 174115 dot 1039950146327400402 dot davem at davemloft dot net> <20130411222904 dot 0ABD32C08F at topped-with-meat dot com>
From: Roland McGrath <roland@hack.frob.com>
Date: Thu, 11 Apr 2013 15:29:04 -0700 (PDT)
>> If you use CLOCK_THREAD_CPUTIME_ID, the cost of the measurement
>> exceeds the cost of the thing that you're measuring.
>
> It's only appropriate for an aggregate measurement of a large number of
> iterations. That doesn't make it useless.
And if I'm trying to measure whether loop setup instructions consume 3
vs. 4 cycles, how do I distinguish that from the highly variable cost
of the measurement call itself, which is several orders of magnitude
larger than what I'm measuring?
Such a call is not appropriate for cycle-level performance analysis of
assembler, and that's what people use, for example, the string
performance numbers for.
I also keep hearing all of this noise about how scheduling disturbs
the numbers, but in practice I never see this. That's because nothing
else is running on my system when I do performance measurements.
And that's something you need to make sure of anyway to get clean
numbers, otherwise the kernel could run threads on the CPU that shares
a core with the CPU you're doing the measurements on.
So there is really no argument for not using a cycle counter as long
as it is stable.
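
A minimal sketch of cycle-counter timing, under stated assumptions: on
x86 it reads the time stamp counter through the __rdtsc() intrinsic
from <x86intrin.h> (GCC and Clang), and on other architectures it falls
back to clock_gettime(CLOCK_MONOTONIC), which is not a cycle counter at
all. The names read_ticks, min_read_overhead and min_ticks are
hypothetical. Note that on many x86 parts the TSC ticks at a fixed
reference rate rather than at the core clock, and __rdtsc() is not a
serializing instruction, so this is an illustration and not a
calibrated harness.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* __rdtsc() on GCC and Clang */
#endif

/* Hypothetical helper: read a raw tick counter. */
static inline uint64_t read_ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    /* Assumed fallback: nanoseconds, not cycles. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}

/* Smallest delta between two back-to-back reads over `samples`
   tries; approximates the fixed cost of the timing itself. */
uint64_t min_read_overhead(int samples)
{
    uint64_t best = UINT64_MAX;

    for (int i = 0; i < samples; i++) {
        uint64_t start = read_ticks();
        uint64_t end = read_ticks();
        if (end - start < best)
            best = end - start;
    }
    return best;
}

/* Smallest tick count observed for fn() over `samples` runs, with
   the read overhead subtracted (clamped at zero). */
uint64_t min_ticks(void (*fn)(void), int samples)
{
    uint64_t overhead = min_read_overhead(samples);
    uint64_t best = UINT64_MAX;

    for (int i = 0; i < samples; i++) {
        uint64_t start = read_ticks();
        fn();
        uint64_t end = read_ticks();
        if (end - start < best)
            best = end - start;
    }
    return best > overhead ? best - overhead : 0;
}
```

Taking the minimum over many samples on an otherwise idle machine is
one common way to filter out interrupts and other disturbances; it
reports the best case, not a distribution.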