GNOME Bugzilla – Bug 672254
Problem scanning multiline "Returns:"/"Return value:" identifiers in GTK-Doc comment blocks
Last modified: 2015-02-07 16:46:04 UTC
For example, the GTK-Doc comment block for GLib's g_get_user_datadir() contains the following: * Return value: a string owned by GLib that must not be modified * or freed. Which ends up in the .gir file as: <function name="get_user_data_dir" c:identifier="g_get_user_data_dir" version="2.6"> <doc xml:whitespace="preserve">Returns a base directory in which to access application data such ...snip... what g_get_user_config_dir() returns. or freed.</doc> <return-value transfer-ownership="none"> <doc xml:whitespace="preserve">a string owned by GLib that must not be modified</doc> <type name="utf8" c:type="gchar*"/> </return-value> </function>
Created attachment 211153 [details] [review] Add comment documenting we're ignoring C++ style comments.
Created attachment 211154 [details] [review] Don't munch /* and */ comment start and end markers. In Sandro Sigala's original c-c++-grammars code, this function was known as "static void skip_comment(void)". It was built this way instead of a lex pattern because it didn't need to save the matched comment in the "yytext" buffer. This is still true as we directly assign the matched comment to the scanner object. So simply prepend and append the markers back where they belong.
Created attachment 211155 [details] [review] Make AnnotationParser._parse_comment() do what gtk-doc does.
Created attachment 211156 [details] [review] Now make AnnotationParser do what gobject-introspection needs it to do.
Created attachment 211157 [details] [review] Fix malformed GTK-Doc comment blocks: add missing colons. AnnotationParser now emmits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211158 [details] [review] Fix malformed GTK-Doc comment blocks: invalid annotations. AnnotationParser now emmits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211159 [details] [review] Fix malformed GTK-Doc comment blocks: invalid parameters and tags. AnnotationParser now emmits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211160 [details] [review] Fix malformed GTK-Doc comment blocks: correct parameter name. AnnotationParser now emmits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211161 [details] [review] Fix malformed GTK-Doc comment blocks: preserve description indentation. AnnotationParser now emmits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211162 [details] [review] Fix malformed GTK-Doc comment blocks: no description parts. AnnotationParser now emmits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211163 [details] [review] Fix malformed GTK-Doc comment blocks: comment end marker. AnnotationParser now emmits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211164 [details] [review] Fix malformed GTK-Doc comment blocks: invalid empty lines. AnnotationParser now emmits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211165 [details] [review] Fix malformed GTK-Doc comment blocks: line numbers. AnnotationParser now emmits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211166 [details] [review] update-glib-annotations.py: reduce code duplication
Created attachment 211167 [details] [review] Update GLib annotations, using the improved annotation parser.
Review of attachment 211153 [details] [review]: Looks good
Created attachment 211168 [details] [review] Fix malformed GTK-Doc comment blocks: add missing colons. Found these thanks to improved gobject-introspection GTK-Doc comment block/annotation parser from:
Created attachment 211169 [details] [review] Fix malformed GTK-Doc comment blocks: invalid parameters and tags. Found these thanks to improved gobject-introspection GTK-Doc comment block/annotation parser from:
Created attachment 211170 [details] [review] Fix malformed GTK-Doc comment blocks: unmark non GTK-Doc comment block. Found these thanks to improved gobject-introspection GTK-Doc comment block/annotation parser from:
Created attachment 211171 [details] [review] Fix malformed GTK-Doc comment blocks: mutliline annotations are invalid. Found these thanks to improved gobject-introspection GTK-Doc comment block/annotation parser from:
Created attachment 211172 [details] [review] Fix malformed GTK-Doc comment blocks: correct struct name. Found these thanks to improved gobject-introspection GTK-Doc comment block/annotation parser from:
Created attachment 211173 [details] [review] Fix malformed GTK-Doc comment blocks: don't confuse GTK-Doc parsers. Found these thanks to improved gobject-introspection GTK-Doc comment block/annotation parser from:
Created attachment 211174 [details] [review] Fix malformed GTK-Doc comment blocks: remove repeated comment blocks. Found these thanks to improved gobject-introspection GTK-Doc comment block/annotation parser from:
Review of attachment 211155 [details] [review]: This needs some more work. ::: giscanner/annotationparser.py @@ +453,3 @@ self._blocks = {} def parse(self, comments): Should probably mention somewhere that this mimics the way gtk-doc parses doc comments closely and provide references to their parsing code for future reference. @@ +513,3 @@ + if not symbol: + # maybe its not even meant to be a gtk-doc comment? + #print 'Warning: Symbol name not found at the start of the comment block.' leftover @@ +535,3 @@ + + if return_style == 'broken': + leftover @@ +542,3 @@ + + if since_desc: + tag.set_position(lineno) This seems to be a common pattern, should be refactored into a function/method. @@ +547,3 @@ + + if stability_desc: + # Add the return value description onto the end of the params. Here for instance @@ +552,3 @@ + + if deprecated_desc: + tag = DocTag(block, TAG_RETURNS) Ditto @@ +594,3 @@ + return_start = return_match.group(0) + + if deprecated_desc: Leftover @@ +602,3 @@ + return_desc = line[return_match.end(0):] + in_part = 'return' + There are way too many continue statements here, can they be refactored out so we can avoid doing so many of them? @@ +651,3 @@ + param_desc = line[match.end(0):] + + continue Left over @@ +667,3 @@ + # to the last element in @params. + if not current_param: + if return_match: Leftover, invert if.
Review of attachment 211156 [details] [review]: ::: giscanner/annotationparser.py @@ +27,3 @@ from . import message from .odict import odict +from .annotationpatterns import (COMMENT_START_RE, COMMENT_END_RE, a comes before o @@ +35,3 @@ + +# GTK-Doc comment block parts + PARAMETER_RE, DESCRIPTION_TAG_RE, TAG_RE, Just keep this as it is, as it adds unnecessary noise. @@ +503,3 @@ + line = line[result.end(0):] + + This is not really how we do comments in the rest of the code. @@ +563,3 @@ + colon_column = column_offset + colon_start + marker = ' '*colon_column + '^' + if result: Use "missing ':' at ..." to make it clearer here. @@ +579,3 @@ + marker = ' '*column_offset + '^' + message.warn('ignoring unrecognized GTK-Doc comment block, ' + + # reached). In practice, however, ignoring the block is the Ditto @@ +585,3 @@ + return False + + # pattern by accident. See comment above about doc comments @@ +600,3 @@ + column = result.start('parameter_name') + column_offset + marker = ' '*column + '^' + return False See above @@ +612,3 @@ + returns_seen = True + else: + if result: See above @@ +618,3 @@ + column = result.start('parameter_name') + column_offset + marker = ' '*column + '^' + param_description = result.group('description') See above ::: giscanner/annotationpatterns.py @@ +245,3 @@ + + +if __name__ == '__main__': Are these ran when you run make check?
Review of attachment 211157 [details] [review]: Looks great.
Review of attachment 211158 [details] [review]: Great.
Review of attachment 211159 [details] [review]: Great.
Review of attachment 211160 [details] [review]: Looks good. It's emits, not emmits though
Review of attachment 211161 [details] [review]: Missing .c/.h changes?
Review of attachment 211162 [details] [review]: Sure, if that's what gtk-doc does.
Review of attachment 211163 [details] [review]: Sure, could be squashed but fine.
Review of attachment 211164 [details] [review]: Sure, could be squashed but fine.
Review of attachment 211165 [details] [review]: Okay
Review of attachment 211166 [details] [review]: Okay
Review of attachment 211167 [details] [review]: Do this later when needed, no need to submit patches for this.
Review of attachment 211162 [details] [review]: .
GLib GTK-Doc comment block formatting fixes will go to bug 673385
Created attachment 211323 [details] [review] Add comment documenting we're ignoring C++ style comments.
Created attachment 211325 [details] [review] Add comment documenting we're ignoring C++ style comments.
Created attachment 211326 [details] [review] Don't munch /* and */ comment start and end markers. In Sandro Sigala's original c-c++-grammars code, this function was known as "static void skip_comment(void)". It was built this way instead of a lex pattern because it didn't need to save the matched comment in the "yytext" buffer. This is still true as we directly assign the matched comment to the scanner object. So simply prepend and append the markers back where they belong.
Created attachment 211327 [details] [review] Make AnnotationParser._parse_comment() do what gtk-doc does.
Created attachment 211328 [details] [review] Now make AnnotationParser do what gobject-introspection needs it to do.
Created attachment 211329 [details] [review] Fix malformed GTK-Doc comment blocks: add missing colons. AnnotationParser now emits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211330 [details] [review] Fix malformed GTK-Doc comment blocks: invalid annotations. AnnotationParser now emits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211331 [details] [review] Fix malformed GTK-Doc comment blocks: invalid parameters and tags. AnnotationParser now emits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211332 [details] [review] Fix malformed GTK-Doc comment blocks: correct parameter name. AnnotationParser now emits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211333 [details] [review] Fix malformed GTK-Doc comment blocks: preserve description indentation. AnnotationParser now emits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211334 [details] [review] Fix malformed GTK-Doc comment blocks: no description parts. AnnotationParser now emits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211335 [details] [review] Fix malformed GTK-Doc comment blocks: comment end marker. AnnotationParser now emits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211336 [details] [review] Fix malformed GTK-Doc comment blocks: invalid empty lines. AnnotationParser now emits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211337 [details] [review] Fix malformed GTK-Doc comment blocks: line numbers. AnnotationParser now emits warnings which are considered as errors by "make check" so fix those warnings...
Created attachment 211338 [details] [review] update-glib-annotations.py: reduce code duplication
Created attachment 211339 [details] [review] Update GLib annotations, using the improved annotation parser.
In reply to comment #24: > Review of attachment 211155 [details] [review]: > > This needs some more work. All leftovers from gtkdoc-mkdb code have been removed. > ::: giscanner/annotationparser.py > @@ +453,3 @@ > self._blocks = {} > > def parse(self, comments): > > Should probably mention somewhere that this mimics the way gtk-doc parses doc > comments closely and provide references to their parsing code for future > reference. Done. > @@ +542,3 @@ > + > + if since_desc: > + tag.set_position(lineno) > > This seems to be a common pattern, should be refactored into a function/method. > > @@ +602,3 @@ > + return_desc = line[return_match.end(0):] > + in_part = 'return' > + > > There are way too many continue statements here, can they be refactored out so > we can avoid doing so many of them? Both concerns are already handled in the next commit: Now make AnnotationParser do what gobject-introspection needs it to do ;) In reply to comment #25 > Review of attachment 211156 [details] [review]: > > ::: giscanner/annotationparser.py > @@ +27,3 @@ > from . import message > from .odict import odict > +from .annotationpatterns import (COMMENT_START_RE, COMMENT_END_RE, > > a comes before o Fixed. > @@ +35,3 @@ > + > +# GTK-Doc comment block parts > + PARAMETER_RE, DESCRIPTION_TAG_RE, TAG_RE, > > Just keep this as it is, as it adds unnecessary noise. Assuming this is about the alignment on the equals character, fixed. > @@ +503,3 @@ > + line = line[result.end(0):] > + > + > > This is not really how we do comments in the rest of the code. I'm fully aware of that. It's a way of saying: it is tempting to refactor the different steps of parsing a GTK-Doc comment block into methods, but keeping it analogous to how gtkdoc-mkdb's ScanSourceFile() function works is more important wrt future maintenance work. The _parse_comment_block() docstring now explains this. Suggestions on how to bring these comments more in line with the rest of the code and still make them jump out (I'd really like the different steps to be recognisable from miles away) are most welcome. Or we simply decide that being a gtkdoc-mkdb look-a-like is not that important and refactor some more. Both options are fine by me :) > @@ +563,3 @@ > + colon_column = column_offset + colon_start > + marker = ' '*colon_column + '^' > + if result: > > Use "missing ':' at ..." to make it clearer here. Replaced all instances of \' with ' and enclosed offending string literals with double quotes. > ::: giscanner/annotationpatterns.py > @@ +245,3 @@ > + > + > +if __name__ == '__main__': > > Are these ran when you run make check? They do now. Also fixed the test suite so it runs on Python 2.6 (unittest.TestCase.assertIsNone and unittest.TestCase.assertIsNotNone not yet available) and for good measure Python 3.2 (dict.iteritems() no longer available) in addition to Python 2.7. And In reply to comment #29: > Review of attachment 211160 [details] [review]: > > Looks good. > > It's emits, not emmits though Oops. Fixed in all relevant commit messages. In reply to comment #30 and #37: > Review of attachment 211161 [details] [review]: > > Missing .c/.h changes? No, the old annotationparser discarded left hand side whitespace, destroying inline example code indentation. The new parser does not touch left hand side whitespace, so the expected test output has to be changed to reflect that change. In reply to comment #36: > Review of attachment 211167 [details] [review]: > > Do this later when needed, no need to submit patches for this. Let's consider this part of the deal... Additionally, pep8.py (well, g-i's specific copy of it) compliance for annotationparser.py and annotationpatters.py has been taken care of.
Review of attachment 211325 [details] [review]: Looks good.
Review of attachment 211326 [details] [review]: Looks good.
Review of attachment 211328 [details] [review]: Looks better, can be pushed with the changed mentioned below incorporated. ::: giscanner/annotationparser.py @@ +837,2 @@ else: + result = MULTILINE_ANNOTATION_CONTINUATION_RE.search(line) This is duplicating code above, should be refactored to a function.
Review of attachment 211329 [details] [review]: Looks good. Can be squashed.
Review of attachment 211330 [details] [review]: Looks good. Can be squashed.
Review of attachment 211331 [details] [review]: Looks good. Can be squashed.
Review of attachment 211332 [details] [review]: Looks good. Can be squashed.
Review of attachment 211333 [details] [review]: Looks good. Can be squashed.
Review of attachment 211334 [details] [review]: Looks good. Can be squashed.
Review of attachment 211335 [details] [review]: Looks good. Can be squashed.
Review of attachment 211336 [details] [review]: Looks good. Can be squashed.
Review of attachment 211337 [details] [review]: Why did the line numbers change?
Review of attachment 211338 [details] [review]: Looks good.
Review of attachment 211339 [details] [review]: Looks good. Wait until this is committed to glib though.
Created attachment 211343 [details] [review] Now make AnnotationParser do what gobject-introspection needs it to do. (In reply to comment #59) > Review of attachment 211328 [details] [review]: > > Looks better, can be pushed with the changed mentioned below incorporated. > > ::: giscanner/annotationparser.py > @@ +837,2 @@ > else: > + result = MULTILINE_ANNOTATION_CONTINUATION_RE.search(line) > > This is duplicating code above, should be refactored to a function. Done
Created attachment 211344 [details] [review] Fix malformed GTK-Doc comment blocks (In reply to comment #60, #61, #62, #63, #64, #65, #66, #67) > Review of attachment 211329 [details] [review]: > > Looks good. Can be squashed. Done.
(In reply to comment #68) > Review of attachment 211337 [details] [review]: > > Why did the line numbers change? The original AnnotationParser's generated line numbers where off by 1.
Review of attachment 211343 [details] [review]: Looks great
Review of attachment 211344 [details] [review]: Looks great, awesome!
Attachment 211325 [details] pushed as 571a3ce - Add comment documenting we're ignoring C++ style comments. Attachment 211326 [details] pushed as 3288b28 - Don't munch /* and */ comment start and end markers. Attachment 211327 [details] pushed as f1d2547 - Make AnnotationParser._parse_comment() do what gtk-doc does. Attachment 211338 [details] pushed as 79c2ee4 - update-glib-annotations.py: reduce code duplication Attachment 211339 [details] pushed as 5ab6a47 - Update GLib annotations, using the improved annotation parser. Attachment 211343 [details] pushed as 182fdfe - Now make AnnotationParser do what gobject-introspection needs it to do.
I pushed these for Dieter today.
This introduced a regression that can be seen when building libgnome-keyring: GISCAN GnomeKeyring-1.0.gir gnome-keyring-memory.h:32: Warning: GnomeKeyring: ignoring unrecognized GTK-Doc comment block, identifier not found: * gnome-keyring-memory:Short_Description: ^ Traceback (most recent call last):
+ Trace 230018
sys.exit(scanner_main(sys.argv))
exported_packages, options.c_includes)
c_includes)
self._write_namespace(namespace, shlibs)
self._write_node(node)
self._write_record(node)
self._write_generic(record)
self.write_tag('attribute', [('name', key), ('value', value)])
len(prefix) + len(suffix))
make[3]: *** [GnomeKeyring-1.0.gir] Error 1
Created attachment 211432 [details] [review] Split parameter and tag storage in annotationparser parse tree (In reply to comment #78) > This introduced a regression that can be seen when building libgnome-keyring: > > GISCAN GnomeKeyring-1.0.gir > gnome-keyring-memory.h:32: Warning: GnomeKeyring: ignoring unrecognized GTK-Doc > comment block, identifier not found: > * gnome-keyring-memory:Short_Description: > ^ > Traceback (most recent call last): This patch should fix it.
Review of attachment 211432 [details] [review]: Looks great.
Huge thanks to all involved!
[Mass-moving gobject-introspection tickets to its own Bugzilla product - see bug 708029. Mass-filter your bugmail for this message: introspection20150207 ]