This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] More speedups, this time in build_trtable


> > After Jakub's patch that disabled transit_state_sb, build_trtable
>
> transit_state_sb removal has nothing to do with this FYI.

You're right, I was going by memory.

> > The result is a 20% improvement in the time needed to run
> > the sed testsuite on my machine, with a top improvement of 2x on
> > madding.sed (which is a search for a very long string).

Ehm, in cut&paste I lost that the 20% improvements are in the time spent in
build_trtable, while the total improvement is around 5%.  Instead for
madding.sed build_trtable is cut by 8 times and 2x is the total improvement.

> In what locale?

POSIX, it_IT@euro.ISO-8859-1 and en_US.UTF-8 all give a similar improvement,
both for madding.sed and in general.

> Will not your patch set word_trtable = 1 even when without the patch it
> would be clear?  E.g. with a(\b[abc]|[def]).

Yes, indeed I assumed that dests_ch was never empty.  But I consider this
regex as broken as a\bc and I'd rather not care much about its performance.

> Perhaps it would be enough if just
> group_nodes_into_DFAstates never created empty dests_ch[]
> bitsets.

Does it?  There are three places where it sets a dests_ch.  New items are
set from "remains", and old items can be changed to "intersec", but these
two are done only if both of "not_subset" (the OR of all the items in
"remains") and "has_intersec" (the OR of all items in INTERSEC) are
non-zero.  So the main for loop never makes zero dests_ch bitsets.

The third bitset_copy is from "accepts" if j == ndests, but this implies
that "not_consumed" (the OR of all the items in "accepts" is non-zero,
otherwise it would have done a "break" out of the loop and j would have been
< ndests.  So this seems non-zero to me as well; but maybe I'm missing
something in the code.

Thanks,

Paolo



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]