This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] avoid initialization overhead in strcoll


On Thu, May 28, 2015 at 05:39:01PM +0200, Leonhard Holz wrote:
> This patch removes some initialization overhead in the hot path of strcoll. It
> improves the file listing benchmark by 20% on x86.

You've removed a memset call which seems to be the reason for the 20%
overhead; I can't see anything else that could have done this, but I
haven't looked hard enough.

This change however should be split up to only include the useful
stuff.

> diff --git a/string/strcoll_l.c b/string/strcoll_l.c
> index 0fa005f..14bfb8b 100644
> --- a/string/strcoll_l.c
> +++ b/string/strcoll_l.c
> @@ -62,7 +62,6 @@ typedef struct
>    int len;			/* Length of the current sequence.  */
>    size_t val;			/* Position of the sequence relative to the
>  				   previous non-ignored sequence.  */
> -  size_t idxnow;		/* Current index in sequences.  */

This should go in as a separate change since it is valid without any
performance justification.  idxnow is unused.  Please post this as a
separate patch.

>    size_t idxmax;		/* Maximum index in sequences.  */
>    size_t idxcnt;		/* Current count of indices.  */
>    size_t backw;			/* Current Backward sequence index.  */
> @@ -313,20 +312,22 @@ STRCOLL (const STRING_TYPE *s1, const STRING_TYPE *s2,
> __locale_t l)
>    assert (((uintptr_t) extra) % __alignof__ (extra[0]) == 0);
>    assert (((uintptr_t) indirect) % __alignof__ (indirect[0]) == 0);
> 
> -  int result = 0, rule = 0;
> -
> +  int result = 0;
>    coll_seq seq1, seq2;
> -  memset (&seq1, 0, sizeof (seq1));
> -  seq2 = seq1;
> +  seq1.len = 0;
> +  seq1.idxmax = 0;
> +  seq1.rule = 0;
> +  seq2.len = 0;
> +  seq2.idxmax = 0;

I'm surprised that this actually makes a difference.  The memset
should have been optimized out and replaced with movs.  The change to
remove the RULE variable seems unrelated to me and should just be
avoided if it does not result in any interesting code change..

Can you please post a more detailed justification of what caused the
performance improvement?

Siddhesh

Attachment: pgpja21Juw9Ix.pgp
Description: PGP signature


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]