This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC] Faster memset.
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Carlos O'Donell <carlos at redhat dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Mon, 15 Apr 2013 19:56:28 +0200
- Subject: Re: [RFC] Faster memset.
- References: <20130323145420 dot GA18058 at domone dot kolej dot mff dot cuni dot cz> <20130326172514 dot GA14436 at domone dot kolej dot mff dot cuni dot cz> <5164A07F dot 9090606 at redhat dot com> <20130410062509 dot GA5995 at domone dot kolej dot mff dot cuni dot cz> <5165B7C6 dot 1070803 at redhat dot com>
On Wed, Apr 10, 2013 at 03:04:38PM -0400, Carlos O'Donell wrote:
> On 04/10/2013 02:25 AM, Ondřej Bílka wrote:
> >>> On the previous test my implementation gains mostly when the current
> >>> implementation's computed jump is not in cache, and this benchmark
> >>> underestimates this factor.
> >>
> >> So a win for one and a loss for the other.
> >>
> >> How much of a win and how much of a loss?
> >>
> > When I did profiling, it supported the theory that cache cost dominates
> > and that my implementation is faster. The result is here.
> >
> > http://kam.mff.cuni.cz/~ondra/benchmark_string/memset_profile/result.html
>
> Nice results.
>
> > Results are slower in the random test, I am not sure why.
>
> I think we need to answer that before we can move forward.
>
> I don't have any interesting ideas though.
>
I am now about 90% sure it was because of minor page faults. When I
filtered out calls that took more than 2000 cycles, the graph looks
mostly as in the random test.
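
The filtering step can be sketched roughly as follows. This is only an
illustration, not the code used for the profile: the names
filter_slow_calls, mean_cycles and CYCLE_CUTOFF are made up here, and
it assumes the per-call cycle counts have already been collected into
an array by whatever timer the profiler uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cutoff from the message above.  Calls slower than this are assumed
   (not proven) to have been hit by a minor page fault.  */
#define CYCLE_CUTOFF 2000

/* Copy every sample that took at most CUTOFF cycles into OUT, which
   must have room for N entries.  Returns the number of samples kept.  */
size_t
filter_slow_calls (const uint64_t *cycles, size_t n, uint64_t cutoff,
                   uint64_t *out)
{
  assert (cycles != NULL && out != NULL);
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    if (cycles[i] <= cutoff)
      out[kept++] = cycles[i];
  return kept;
}

/* Arithmetic mean of N samples; returns 0.0 for an empty set.  */
double
mean_cycles (const uint64_t *cycles, size_t n)
{
  if (n == 0)
    return 0.0;
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += (double) cycles[i];
  return sum / (double) n;
}
```

Comparing mean_cycles before and after filter_slow_calls shows how
much a handful of very slow calls can shift the average.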