GNOME Bugzilla – Bug 623968
gtkdoc-mkdb generates invalid xml from sgml in inline comments
Last modified: 2010-10-20 08:57:13 UTC
I'm using SGML in inline SECTION comments. See example here: http://trac.bjourne.webfactional.com/browser/gtkimageview/src/gtkanimview.c?rev=692 gtkdoc-mkdb used to just insert that verbatim into the output xml file, but now it inserts a spurious extra <para> tag in the xml file for reasons I don't fully understand. It leads to a parser error like this "../xml/gtkanimview.xml:99: parser error : Opening and ending tag mismatch: para line 65 and refsect1" and broken documentation.
The problem is not present with gtkdoc 1.11 so the regression must be newer than that.
I made a small test:
    /**
     * bug_623968a:
     *
     * <para>
     * test
     * </para>
     * <refsect2>
     * <title>subsect</title>
     * <para>
     * test
     * </para>
     * </refsect2>
     **/
The culprit is the blank line before the doc body. The last release changed this handling to take care of some issues caused by gtk-doc inserting "</para><para>" for blank lines. I'll check the changes now.
I probably never noticed because in e.g. the gstreamer docs I do:
    /**
     * bug_623968a:
     *
     * test content. bla bla bla ....
     *
     * <refsect2>
     * <title>subsect</title>
     * <para>
     * test
     * </para>
     * </refsect2>
     **/
commit 235669e75846078b78580e90332f6d55296f8110
Author: Stefan Kost <ensonic@users.sf.net>
Date: Mon Jul 12 13:00:59 2010 +0300
    mkdb: better xml fixup after blank line expansion
    Our previous heuristics have been a bit too simple. Make it works for different refsect level and also docs starting with a para. Fixes #623968
commit af84db8455e3db55341ed0921d1083209a06c851
Author: Stefan Kost <ensonic@users.sf.net>
Date: Mon Jul 12 12:59:58 2010 +0300
    tests: new test for bug #623968
This broke the gio docs.
Ryan, can you be more specific? What's the error? Which symbol?
I end up with these errors running git master against git master:
    ../xml/gfileattribute.xml:141: parser error : Opening and ending tag mismatch: refsect1 line 48 and para
    </para>
    ^
    ../xml/gfileattribute.xml:229: parser error : Opening and ending tag mismatch: refentry line 7 and para
    </para>
    ^
    ../xml/gfileattribute.xml:230: parser error : Extra content at the end of the document
    </refsect1>
Line 141, for example, has:
    </tbody>
    </tgroup>
    </table>
    </para>
    </para> <--- this is 141
    <para>
    Please note that these are not all of the possible namespaces.
    More namespaces can be added from GIO modules or by individual applications.
    For more information about writing GIO modules, see
    <link linkend="GIOModule"><type>GIOModule</type></link>.
Of course, at the top of that paragraph there is only one <para>.
Here's the source: http://git.gnome.org/browse/glib/tree/gio/gfileattribute.c#n120
     * </tbody>
     * </tgroup>
     * </table>
     * </para>
     *  <-- there is a trailing space on this line
     * Please note that these are not all of the possible namespaces.
     * More namespaces can be added from GIO modules or by individual applications.
     * For more information about writing GIO modules, see #GIOModule.
I am not sure why this worked before, but the comment in gfileattribute.c is wrong:
     *
     * Classes that implement #GFileIface will create a #GFileAttributeInfoList and
     * install default keys and values for their given file system, architecture,
     * and other possible implementation details (e.g., on a UNIX system, a file
     * attribute key will be registered for the user id for a given file).
     *
     * <para>
     * <table>
gtk-doc will insert </para><para> for each empty line. Either you omit the <para> before the <table>, or you wrap all the non-XML paragraphs above in one big <para>....</para> and remove the blank line.
The problem is not that <para> is being nested (I think that's actually OK with docbook). The problem is rather that there is a </para> without a <para>. Regardless of what the source documentation says, as long as <para> and </para> tags are balanced, gtk-doc should not output XML that has unbalanced <para> and </para> tags.
The input is balanced. At the start:
    <para><table>
at the end:
    </table></para>
But the gtk-doc output has at the start:
    <para><table>
and at the end:
    </table></para></para>
No matter how you slice it, that's gotta be a gtk-doc bug.
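To make the balance argument concrete, here is a minimal Python sketch that counts opening and closing <para> tags in a fragment. The helper name para_balance is hypothetical and is not part of gtk-doc (which is written in Perl); it only illustrates the check being described, using regular expressions rather than a real XML parser.

```python
import re

def para_balance(xml: str) -> int:
    """Return (#opening <para>) - (#closing </para>) in a fragment.

    Hypothetical helper for illustration only; not gtk-doc code.
    0 means balanced, a negative value means a stray </para>.
    """
    # Match <para> or <para attr="..."> but not <paramdef> or <para/>.
    opens = len(re.findall(r"<para(?:\s[^>]*)?>", xml))
    closes = len(re.findall(r"</para\s*>", xml))
    return opens - closes

# The shapes quoted in the comment above, reduced to their tags.
source_fragment = "<para><table></table></para>"
output_fragment = "<para><table></table></para></para>"
```

With these inputs, the source fragment balances while the generated fragment has one closing tag too many, which is the mismatch the parser reports.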
It should be fine again since I reverted the change. The 'algorithm' is as below:
1.) take the whole doc-body
    bla1

    bla2
2.) wrap it in <para>...</para>
    <para>
    bla1

    bla2
    </para>
3.) replace blank lines with </para><para>
    <para>
    bla1
    </para><para>
    bla2
    </para>
This causes some problems:
    bla1

    <refsect2>bla2</refsect2>
becomes:
    <para>
    bla1
    </para><para>
    <refsect2>bla2</refsect2>
    </para>
which is well-formed, but invalid, as a refsect can't be part of a <para> :/
Summa summarum: previous attempts to handle it were too naive. It's a bit tough, as one can put any docbook into the docs and I can't even use an XML parser in perl without adding dependencies :/ My current plan is to wrap chunk by chunk, and to wrap only those chunks that are not already wrapped in XML tags into <para>...</para>.
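The two strategies can be compared with a small Python sketch. This is not the gtk-doc implementation (gtkdoc-mkdb is Perl); the function names naive_expand and chunkwise_expand are made up for illustration, and the "starts with <" test is an assumed stand-in for "already wrapped with xml tags".

```python
import re

# One or more blank (whitespace-only) lines between two text lines.
BLANK_LINES = r"\n(?:[ \t]*\n)+"

def naive_expand(body: str) -> str:
    """Wrap the whole body in <para>, then turn blank lines into </para><para>."""
    wrapped = "<para>\n" + body.strip("\n") + "\n</para>"
    return re.sub(BLANK_LINES, "\n</para><para>\n", wrapped)

def chunkwise_expand(body: str) -> str:
    """Split on blank lines and wrap only chunks that do not start with a tag.

    Assumption: a chunk starting with '<' is block-level markup. A chunk that
    begins with an inline tag such as <emphasis> would be left unwrapped, so
    this is a heuristic, not a parser.
    """
    chunks = [c.strip() for c in re.split(BLANK_LINES, body.strip("\n"))]
    out = []
    for chunk in chunks:
        if not chunk:
            continue
        if chunk.startswith("<"):
            out.append(chunk)  # already markup: leave as is
        else:
            out.append("<para>\n" + chunk + "\n</para>")
    return "\n".join(out)

doc_body = "bla1\n\n<refsect2>bla2</refsect2>"
naive = naive_expand(doc_body)
chunked = chunkwise_expand(doc_body)
```

For this doc body, the naive version places the <refsect2> inside a <para>, while the chunk-wise version emits it as a sibling of the wrapped text paragraph.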
Please, everybody, try the git version and let me know if you are okay with it.
Please reopen if it is still an issue. And it is a good chance to try again now, as I'd like to release 1.16.