After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 745542 - Error in Regexp evaluation with branch and repetition
Error in Regexp evaluation with branch and repetition
Status: RESOLVED OBSOLETE
Product: libxml2
Classification: Platform
Component: regexp
git master
Other All
: Normal normal
: ---
Assigned To: Daniel Veillard
libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2015-03-03 16:41 UTC by Arne Becker
Modified: 2021-07-05 13:24 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch for correct treatment of quantifiers on atoms (1.04 KB, patch)
2015-03-31 14:07 UTC, Arne Becker
none Details | Review
Patch to fix saving state with counted transitions (1.70 KB, patch)
2015-03-31 14:09 UTC, Arne Becker
none Details | Review
Additions to regex test suite (1.00 KB, patch)
2015-03-31 14:14 UTC, Arne Becker
none Details | Review

Description Arne Becker 2015-03-03 16:41:08 UTC
This regexp with a branch and a repetition fails (but should succeed):

./testRegexp "(\d{0,2}x|1\d{0,2}x){2}" 2x100x

while this line succeeds:

./testRegexp "(\d{0,1}x|1\d{0,2}x){2}" 2x100x

Maybe related: #622513
Less likely related: #558551, #425542

Output on success:
Testing (\d{0,1}x|1\d{0,2}x){2}:
saving state: 0:3:idx 0: 2x100x
Saving on nd transition atom 0 for 2 at 0
saving state: 0:3:idx 0: 2x100x
entering state 3
entering state 4
testing count 0: val 0, min 1, max 1
Increasing count 0
saving state: 4:3:idx 2: 100x
Decreasing count 0
Saving on nd transition atom 0 for 1 at 2
saving state: 4:3:idx 2: 100x
Increasing count 0
entering state 3
rollback from state 3 on 48:0
restored state: 4:3:idx 2: 100x
Increasing count 0
entering state 5
saving state: 6:0:idx 4: 0x
entering state 6
entering state 4
testing count 0: val 1, min 1, max 1
resetting count 0 on transition
entering state 7
2x100x: Ok

Output on fail:
Testing (\d{0,2}x|1\d{0,2}x){2}:
saving state: 0:3:idx 0: 2x100x
saving state: 3:0:idx 1: x100x
rollback from state 0 on 120:x
restored state: 3:0:idx 1: x100x
entering state 4
testing count 0: val 0, min 1, max 1
Increasing count 0
saving state: 4:3:idx 2: 100x
saving state: 3:0:idx 3: 00x
Decreasing count 0
Saving on nd transition atom 0 for 0 at 3
saving state: 4:3:idx 3: 00x
Increasing count 0
entering state 3
rollback from state 3 on 48:0
restored state: 4:3:idx 3: 00x
rollback from state 4 on 48:0
restored state: 3:0:idx 3: 00x
rollback from state 3 on 48:0
restored state: 4:3:idx 2: 100x
rollback from state 4 on 49:1
restored state: 0:3:idx 0: 2x100x
rollback from state 0 on 50:2
rollback failed on empty stack
2x100x: Fail
Comment 1 Arne Becker 2015-03-25 09:05:30 UTC
The bug seems to be in xmlFAReduceEpsilonTransitions.
I commented out the call and removed the error condition in the main loop when encountering transitions with NULL atoms. Afterwards, it works correctly.
With xmlFAReduceEpsilonTransitions, it looks like a counter is incorrectly added to one transition. The counter later prevents the transition from matching.
Comment 2 Arne Becker 2015-03-31 14:07:20 UTC
Created attachment 300665 [details] [review]
Patch for correct treatment of quantifiers on atoms

Atoms 1{0,n} and 2{0,n} do intersect
    
The QUANT_RANGE was ignored when comparing atoms but the information if atoms
intersect is important. This adds that test.
Fixes this Regex: (1{0,3}x|2{0,3}x|2x|2){6} for this data: 222x111x2x22x111x
Comment 3 Arne Becker 2015-03-31 14:09:22 UTC
Created attachment 300666 [details] [review]
Patch to fix saving state with counted transitions

Save state on counted transitions before increment
    
The state was unconditionally saved when a transition succeded and there ware
more states. That save happened after incrementing the counter for counted
transitions if there was one which was incorrect. Now the state is saved before
incrementing the counter.
Regex that didnt't work before: (\d{0,2}x|100x){2} with data 2x100x
Comment 4 Arne Becker 2015-03-31 14:14:01 UTC
Created attachment 300667 [details] [review]
Additions to regex test suite

Some test cases for the issues in this bug.
Comment 5 Arne Becker 2015-03-31 14:20:20 UTC
So the bug wasn't in xmlFAReduceEpsilonTransitions after all.

The main bug was the treatment of the counter for counted transitions.
In some cases, it was incremented and the state saved afterwards. The saved counter was wrong in that case. The bug turned up after reducing, because the reducer would put the counted transition and the "counting" transitions in the same state.

The other bug likely made the reducer make some wrong decisions. It's good to have that fixed too.
Comment 6 Arne Becker 2015-04-01 07:34:54 UTC
I forgot to add - I ran the whole test/regexp directory, the results didn't get any worse.
Comment 7 GNOME Infrastructure Team 2021-07-05 13:24:55 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxml2/-/issues/

Thank you for your understanding and your help.