After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 588217 - g_match_info_fetch_named not return empty string as expected
g_match_info_fetch_named not return empty string as expected
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: gregex
2.20.x
Other Linux
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
: 615706 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2009-07-10 05:49 UTC by cee1
Modified: 2018-05-24 11:54 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
A small demo program which can reproduce this bug (1.80 KB, text/plain)
2009-07-10 05:51 UTC, cee1
  Details
a simpler program that can reproduce this bug (1001 bytes, text/plain)
2009-07-11 04:05 UTC, cee1
  Details
a patch that might work (973 bytes, patch)
2009-07-15 05:17 UTC, Matthias Clasen
accepted-commit_now Details | Review
one more testcase (731 bytes, text/x-csrc)
2010-04-14 06:18 UTC, Marat Radchenko
  Details
Improve behaviour of g_match_info_fetch_pos() (1.30 KB, patch)
2011-09-09 00:44 UTC, Allison Karlitskaya (desrt)
none Details | Review
regex testcase: test for bug #588217 (1.59 KB, patch)
2011-09-09 00:44 UTC, Allison Karlitskaya (desrt)
none Details | Review
regex testcase: test for bug #588217 (1.65 KB, patch)
2011-09-09 00:47 UTC, Allison Karlitskaya (desrt)
none Details | Review

Description cee1 2009-07-10 05:49:39 UTC
I'm matching "uint32_t *i_n" against pattern(compile_flags: G_REGEX_NO_AUTO_CAPTURE | G_REGEX_OPTIMIZE | G_REGEX_DUPNAMES,match_flags: 0):

^\s*+(?:(?P<type>[_a-zA-Z][_a-zA-Z0-9]*+)|(?:const\s+)(?P<type>[_a-zA-Z][_a-zA-Z0-9]*+)|(?P<type>[_a-zA-Z][_a-zA-Z0-9]*+)(?:\s+const))(?:\s*+(?P<star>\*)\s*+|\s*+)((?P<nm>[nm])|(?P<stride>(?P<arg_class>[isd])s(?P<index>\d?))|(?P<array>(?P<arg_class>[isd])(?P<index>\d?)(_(?P<len1>\d+|[nm](p\d+)?)(x(?P<len2>\d+|[nm](p\d+)?))?)?))\s*+$

Then call g_match_info_fetch_named (param_info, "len2"), it returns NULL instead of an empty string, which is incorrect according to the document (http://library.gnome.org/devel/glib/stable/glib-Perl-compatible-regular-expressions.html#g-match-info-fetch-named)
Comment 1 cee1 2009-07-10 05:51:27 UTC
Created attachment 138170 [details]
A small demo program which can reproduce this bug
Comment 2 Matthias Clasen 2009-07-10 15:12:47 UTC
Hmm, thats quite a complicated expression.

Does this problem also happen in simpler cases ?

Also, it would be very good to know if the same problem occurs when you are using pcre directly.
Comment 3 cee1 2009-07-11 04:05:17 UTC
Created attachment 138226 [details]
a simpler program that can reproduce this bug

 match "1" against (?P<len1>\d)(?P<len2>\d)?
 g_match_info_fetch_named (param_info, "len2") return NULL
Comment 4 Matthias Clasen 2009-07-15 05:17:55 UTC
Created attachment 138424 [details] [review]
a patch that might work

Can you try this patch ? 

I'll note that we will not be able to perfectly match what the docs describe, since there we also allow the match_info_fetch functions to be used with
match_all, which can produce more matches than there are captures.
Comment 5 cee1 2009-07-18 07:31:11 UTC
Yeah, it works.
For match_all, it should not support "g_match_info_fetch_named", since it is not able to capture substrings.
Comment 6 Owen Taylor 2010-04-13 16:11:26 UTC
Review of attachment 138424 [details] [review]:

Patch looks good to me - I checked:

 - That capture positions are set to -1 by pcre even if they are greater than the returned number of matches (from pcreapi api docs)
 - That pcre_fullinfo() is cheap and doesn't involve computation (from reading the sources)

Note that this bug doesn't just affect g_match_info_fetch_named(), it also affects g_match_info_fetch() - trailing optional matches return NULL not "" - the following test snippet:

  GRegex *re = g_regex_new ("a+(b+)?c+", 0, 0, NULL);
  GMatchInfo *info;
  if (g_regex_match (re, "aacc", 0, &info))
    {
      const char *b_str = g_match_info_fetch (info, 1);

      g_assert_cmpstr (b_str, ==, "");
    }

fails currently.
Comment 7 Marat Radchenko 2010-04-14 06:08:36 UTC
*** Bug 615706 has been marked as a duplicate of this bug. ***
Comment 8 Marat Radchenko 2010-04-14 06:12:31 UTC
Bug still happens in 2.22.x (and, according to code, still present in git master).
Comment 9 Marat Radchenko 2010-04-14 06:18:26 UTC
Created attachment 158674 [details]
one more testcase
Comment 10 Kjartan Maraas 2010-09-07 07:22:36 UTC
Is it ok to commit this then?
Comment 11 Allison Karlitskaya (desrt) 2011-09-09 00:44:07 UTC
Created attachment 196054 [details] [review]
Improve behaviour of g_match_info_fetch_pos()

This function returns the wrong result in some cases, causing unexpected
behaviour from other functions.
Comment 12 Allison Karlitskaya (desrt) 2011-09-09 00:44:16 UTC
Created attachment 196055 [details] [review]
regex testcase: test for bug #588217

Add a test for bug #588217 to the regex testcase.
Comment 13 Allison Karlitskaya (desrt) 2011-09-09 00:45:27 UTC
Rebased the patch to apply to the current code and added the testcase from comment #9 to the test suite.

Unfortunately the change in the patch causes other parts of the testsuite to fail.
Comment 14 Allison Karlitskaya (desrt) 2011-09-09 00:47:11 UTC
Created attachment 196056 [details] [review]
regex testcase: test for bug #588217

(add proper attribution to the commit message)
Comment 15 GNOME Infrastructure Team 2018-05-24 11:54:59 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/229.