GNOME Bugzilla – Bug 588217
g_match_info_fetch_named does not return empty string as expected
Last modified: 2018-05-24 11:54:59 UTC
I'm matching "uint32_t *i_n" against the following pattern (compile_flags: G_REGEX_NO_AUTO_CAPTURE | G_REGEX_OPTIMIZE | G_REGEX_DUPNAMES, match_flags: 0):

^\s*+(?:(?P<type>[_a-zA-Z][_a-zA-Z0-9]*+)|(?:const\s+)(?P<type>[_a-zA-Z][_a-zA-Z0-9]*+)|(?P<type>[_a-zA-Z][_a-zA-Z0-9]*+)(?:\s+const))(?:\s*+(?P<star>\*)\s*+|\s*+)((?P<nm>[nm])|(?P<stride>(?P<arg_class>[isd])s(?P<index>\d?))|(?P<array>(?P<arg_class>[isd])(?P<index>\d?)(_(?P<len1>\d+|[nm](p\d+)?)(x(?P<len2>\d+|[nm](p\d+)?))?)?))\s*+$

I then call g_match_info_fetch_named (param_info, "len2"). It returns NULL instead of an empty string, which is incorrect according to the documentation (http://library.gnome.org/devel/glib/stable/glib-Perl-compatible-regular-expressions.html#g-match-info-fetch-named).
Created attachment 138170 [details] A small demo program which can reproduce this bug
Hmm, that's quite a complicated expression. Does this problem also happen in simpler cases? It would also be very good to know whether the same problem occurs when you use pcre directly.
Created attachment 138226 [details] a simpler program that can reproduce this bug

Match "1" against (?P<len1>\d)(?P<len2>\d)?
g_match_info_fetch_named (param_info, "len2") returns NULL.
Created attachment 138424 [details] [review] a patch that might work

Can you try this patch? I'll note that we will not be able to perfectly match what the docs describe, since the docs also allow the match_info_fetch functions to be used with match_all, which can produce more matches than there are captures.
Yeah, it works. As for match_all, it should not support g_match_info_fetch_named, since match_all is not able to capture substrings.
Review of attachment 138424 [details] [review]:

Patch looks good to me. I checked:

- That capture positions are set to -1 by pcre even if they are greater than the returned number of matches (from the pcreapi docs)
- That pcre_fullinfo() is cheap and doesn't involve computation (from reading the sources)

Note that this bug doesn't just affect g_match_info_fetch_named(); it also affects g_match_info_fetch(): trailing optional matches return NULL, not "". The following test snippet currently fails:

    GRegex *re = g_regex_new ("a+(b+)?c+", 0, 0, NULL);
    GMatchInfo *info;
    if (g_regex_match (re, "aacc", 0, &info))
      {
        /* Group 1 is optional and unmatched here, so "" is expected. */
        gchar *b_str = g_match_info_fetch (info, 1);
        g_assert_cmpstr (b_str, ==, "");
        g_free (b_str);
      }
    g_match_info_free (info);
    g_regex_unref (re);
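The two points checked in the review suggest why a trailing optional group comes back as NULL: pcre reports a match count that stops at the highest group that actually matched, while the offset slots for later groups are still filled with -1. The following is a minimal sketch of that logic in plain C, without GLib or pcre. The function name fetch_group and its limit parameter are hypothetical, and the actual patch in attachment 138424 is not reproduced here. The sketch only assumes that bounding the group index by the match count gives NULL, and that bounding it by the pattern's capture count (the kind of value pcre_fullinfo() can report) gives "".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not GLib or PCRE API): copy capture group `group`
 * out of `subject`, using a pcre-style offset vector in which slots
 * 2*i and 2*i+1 hold the start and end of group i, or -1 if unset.
 * `limit` is the number of groups the caller treats as valid.
 * Returns a malloc'd string that the caller frees, or NULL if `group`
 * is out of range or allocation fails. */
char *
fetch_group (const char *subject, const int *ovector, int limit, int group)
{
  int start, end;
  size_t len;
  char *copy;

  assert (subject != NULL && ovector != NULL);

  /* This bounds check is where the two behaviours diverge. */
  if (group < 0 || group >= limit)
    return NULL;

  start = ovector[2 * group];
  end = ovector[2 * group + 1];

  /* An unset group (offsets of -1) is reported as the empty string. */
  len = (start < 0 || end < start) ? 0 : (size_t) (end - start);

  copy = malloc (len + 1);
  if (copy == NULL)
    return NULL;
  if (len > 0)
    memcpy (copy, subject + start, len);
  copy[len] = '\0';
  return copy;
}
```

For "aacc" matched against a+(b+)?c+, the offset vector would be { 0, 4, -1, -1 } and the match count 1. Calling fetch_group (subject, ovector, 1, 1) with the match count as the limit returns NULL, which is the behaviour reported in this bug. Passing 2 (the whole match plus the one capture group) as the limit returns "" instead.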
*** Bug 615706 has been marked as a duplicate of this bug. ***
The bug still happens in 2.22.x (and, according to the code, is still present in git master).
Created attachment 158674 [details] one more testcase
Is it OK to commit this, then?
Created attachment 196054 [details] [review] Improve behaviour of g_match_info_fetch_pos()

This function returns the wrong result in some cases, causing unexpected behaviour from other functions.
Created attachment 196055 [details] [review] regex testcase: test for bug #588217

Add a test for bug #588217 to the regex testcase.
Rebased the patch to apply to the current code and added the testcase from comment #9 to the test suite. Unfortunately, the change in the patch causes other parts of the test suite to fail.
Created attachment 196056 [details] [review] regex testcase: test for bug #588217

(add proper attribution to the commit message)
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/229.