This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: [regex] Fix BZ429
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Paolo Bonzini <paolo dot bonzini at lu dot unisi dot ch>
- Cc: libc-alpha at sources dot redhat dot com
- Date: Wed, 10 Nov 2004 12:18:23 +0100
- Subject: Re: [regex] Fix BZ429
- References: <4191F700.2030707@lu.unisi.ch>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Wed, Nov 10, 2004 at 12:09:52PM +0100, Paolo Bonzini wrote:
> To fix BZ429, it is enough to do some kind of caching and cut the number
> of invocations of calc_dst_limits_pos. Its current implementation is
> naive: Complexity is exponential in N because every epsilon closure
> visits N backreferences: this without any hope of succeeding, because no
> OP_{OPEN,CLOSE}_SUBEXP node is reachable from the OP_BACK_REF nodes.
Thanks.
> The tests in bug-regex11.c now complete in a sane amount of time (11
> seconds each on a G4 PowerMac), but still not exactly small so I've not
> enabled them. Unluckily this does not speed up other testcases, but the
> profiles for these pathologic regexes look more similar to the common
> one (sift_states_bkref is at the top).
I think we should enable that test (maybe test-skeleton.c'ize it and
add some bigger TIMEOUT, like 30 or 60).
> --- orig/lib/regex_internal.h
> +++ mod/lib/regex_internal.h
> @@ -547,9 +547,9 @@ struct re_backref_cache_entry
> int str_idx;
> int subexp_from;
> int subexp_to;
> - /* We need only one byte from the following field. If other small
> - fields are added the type could be changed to 'char'. */
> - int more;
> + char more;
> + char unused;
> + short eps_reachable_subexps_map;
Wouldn't it be better to use unsigned short here when you use it as
a bitfield?
Jakub