This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Bug runtime/15065] regular expressions: subexpression capture support


https://sourceware.org/bugzilla/show_bug.cgi?id=15065

Serhei Makarov <serhei.public at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |serhei.public at gmail dot com

--- Comment #1 from Serhei Makarov <serhei.public at gmail dot com> ---
Just a heads up that I'm working on this feature, with my current code at
https://github.com/serhei/stap-experiments/commits/serhei/tnfa. I wrote a
solution based on Laurikari's TNFA algorithm.

The testsuite
(https://github.com/serhei/stap-experiments/blob/serhei/tnfa/testsuite/runok/regex_grouping.stp)
needs to be much more complete before I can have confidence in this
feature. Current results look like this:

regex PASS: #1: aaa =~ a* with 1 groups 'aaa' 
regex FAIL (grouping): #2: abab =~ (ab)* with 2 groups '' '' 
regex PASS: #3: cabab =~ c(ab)* with 2 groups 'cabab' 'ab' 
regex PASS: #4: aaa =~ (a*)a*a with 2 groups 'aaa' 'aa' 
regex PASS: #5: regex =~ re(gex) with 2 groups 'regex' 'gex' 
regex PASS: #6: longer =~ (long|longer) with 2 groups 'longer' 'longer' 
regex PASS: #7: unrelated !~ regex
regex PASS: #8: \ =~ \\ with 1 groups '\' 
regex PASS: #9: xabcy =~ abc with 1 groups 'abc' 
regex PASS: #10: abbbbc =~ ab*bc with 1 groups 'abbbbc' 
regex PASS: #11: abbc =~ ab?bc with 1 groups 'abbc' 
regex PASS: #12: abcc !~ ^abc$
regex PASS: #13: abd !~ a[b-d]e
regex PASS: #14: ace =~ a[b-d]e with 1 groups 'ace' 
regex PASS: #15: ab =~ a\(*b with 1 groups 'ab' 
regex PASS: #16: a((b =~ a\(*b with 1 groups 'a((b' 
regex PASS: #17: ab =~ (a+|b)* with 2 groups 'ab' 'b' 
regex PASS: #18: ab =~ (a+|b)+ with 2 groups 'ab' 'b' 
regex PASS: #19: abbbcd =~ ([abc])*d with 2 groups 'abbbcd' 'c' 
regex PASS: #20: abcde !~ ^(ab|cd)e
regex PASS: #21: abcde =~ (ab|cd)e with 2 groups 'cde' 'cd' 
regex PASS: #22: abcde =~ (ab|cd)e$ with 2 groups 'cde' 'cd' 
regex PASS: #23: alpha =~ [A-Za-z_][A-Za-z0-9_]* with 1 groups 'alpha' 
regex PASS: #24: ij =~ (bc+d$|ef*g.|h?i(j|k)) with 3 groups 'ij' 'ij' 'j' 
regex PASS: #25: effg !~ (bc+d$|ef*g.|h?i(j|k))
regex PASS: #26: 00effg12 =~ (bc+d$|ef*g.|h?i(j|k)) with 3 groups 'effg1' 'effg1' '' 
regex PASS: #27: bcccd =~ (bc+d$|ef*g.|h?i(j|k)) with 3 groups 'bcccd' 'bcccd' '' 
regex PASS: #28: a =~ (((((((((a))))))))) with 10 groups 'a' 'a' 'a' 'a' 'a' 'a' 'a' 'a' 'a' 'a' 
regex PASS: #29: (.*)\) !~ \((.*),
regex PASS: #30: ab !~ [k]
regex PASS: #31: abcd =~ abcd with 1 groups 'abcd' 
regex PASS: #32: abcd =~ a(bc)d with 2 groups 'abcd' 'bc' 

regex total PASS: 31, FAIL: 1
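
For comparison, the failing case #2 can be cross-checked against another regex engine. The sketch below uses Python's re module, which is not part of systemtap or of the test suite above. The helper name captures() is hypothetical, and the assumption is that an unanchored search matches what =~ does here (case #9 suggests so). Python uses backtracking with leftmost-first alternation, so it would report 'long' rather than 'longer' for case #6. The comparison is therefore only meaningful for patterns without that kind of ambiguity, such as the ones listed.

```python
import re

# Cases copied from the results above: (subject, pattern).
cases = [
    ("abab", r"(ab)*"),     # case 2, currently failing
    ("cabab", r"c(ab)*"),   # case 3
    ("aaa", r"(a*)a*a"),    # case 4
    ("regex", r"re(gex)"),  # case 5
]

def captures(subject, pattern):
    """Return [whole match, group 1, ...], or None if nothing matches."""
    m = re.search(pattern, subject)
    if m is None:
        return None
    # Unset groups are shown as '' to mirror the output format above.
    return [m.group(0)] + [g if g is not None else "" for g in m.groups()]

for subject, pattern in cases:
    print(subject, "=~", pattern, captures(subject, pattern))
```

Under these assumptions Python reports 'abab' and 'ab' for case #2, where the current code reports two empty strings. Whether that is the result the test suite expects is not stated above.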

