This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 2/4] Improve generic strspn performance
- From: "Tulio Magno Quites Machado Filho" <tuliom at linux dot vnet dot ibm dot com>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>, libc-alpha at sourceware dot org
- Cc:
- Date: Tue, 29 Mar 2016 17:32:07 -0300
- Subject: Re: [PATCH 2/4] Improve generic strspn performance
- Authentication-results: sourceware.org; auth=none
- References: <1459178389-14133-1-git-send-email-adhemerval dot zanella at linaro dot org> <1459178389-14133-3-git-send-email-adhemerval dot zanella at linaro dot org>
Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:
> As for strcspn, this patch improves strspn performance using a much
> faster algorithm. It first constructs a 256-entry table based on
> the accept string and then uses it as a lookup table for the
> input string. As for strcspn optimization, it is generally at least
> 10 times faster than the existing implementation on bench-strspn
> on a few AArch64 implementations.
>
> Also the string/bits/string2.h inlines make no longer sense, as current
> implementation will already implement most of the optimizations.
>
> Tested on x86_64, i686, and aarch64.
And on powerpc64 and powerpc64le.
Git reported some trailing whitespaces in this patch.
> diff --git a/string/strspn.c b/string/strspn.c
> index f0635c1..0547f41 100644
> --- a/string/strspn.c
> +++ b/string/strspn.c
> @@ -25,23 +25,49 @@
> /* Return the length of the maximum initial segment
> of S which contains only characters in ACCEPT. */
> size_t
> -STRSPN (const char *s, const char *accept)
> +STRSPN (const char *str, const char *accept)
> {
> - const char *p;
> - const char *a;
> - size_t count = 0;
> -
> - for (p = s; *p != '\0'; ++p)
> - {
> - for (a = accept; *a != '\0'; ++a)
> - if (*p == *a)
> - break;
> - if (*a == '\0')
> - return count;
> - else
> - ++count;
> + if (accept[0] == '\0')
> + return 0;
> + if (accept[1] == '\0')
> + {
Here.
> + const char *a = str;
> + for (; *str == *accept; str++);
> + return str - a;
> }
>
> - return count;
> + /* Use multiple small memsets to enable inlining on most targets. */
> + unsigned char table[256];
> + unsigned char *p = memset (table, 0, 64);
> + memset (p + 64, 0, 64);
> + memset (p + 128, 0, 64);
> + memset (p + 192, 0, 64);
> +
> + unsigned char *s = (unsigned char*) accept;
> + /* Different from strcspn it does not add the NULL on the table
> + so can avoid check if str[i] is NULL, since table['\0'] will
> + be 0 and thus stopping the loop check. */
> + do
> + p[*s++] = 1;
> + while (*s);
> +
> + s = (unsigned char*) str;
> + if (!p[s[0]]) return 0;
> + if (!p[s[1]]) return 1;
> + if (!p[s[2]]) return 2;
> + if (!p[s[3]]) return 3;
> +
Here.
> + s = (unsigned char *) ((size_t)(s) & ~3);
> + unsigned int c0, c1, c2, c3;
And here.
--
Tulio Magno