Bug 623968 - gtkdoc-mkdb generates invalid xml from sgml in inline comments
Status: RESOLVED FIXED
Product: gtk-doc
Classification: Platform
Component: general
Version: 1.14
Hardware: Other Linux
Importance: Normal major
Target Milestone: 1.16
Assigned To: gtk-doc maintainers
QA Contact: gtk-doc maintainers
Depends on:
Blocks:
 
 
Reported: 2010-07-09 17:20 UTC by Björn Lindqvist
Modified: 2010-10-20 08:57 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Björn Lindqvist 2010-07-09 17:20:39 UTC
I'm using SGML in inline SECTION comments. See the example here: http://trac.bjourne.webfactional.com/browser/gtkimageview/src/gtkanimview.c?rev=692
gtkdoc-mkdb used to insert that verbatim into the output XML file, but now it inserts a spurious extra <para> tag, for reasons I don't fully understand. The result is broken documentation and a parser error like this:
"../xml/gtkanimview.xml:99: parser error : Opening and ending tag mismatch: para line 65 and refsect1"
Comment 1 Björn Lindqvist 2010-07-09 17:52:24 UTC
The problem is not present with gtkdoc 1.11 so the regression must be newer than that.
Comment 2 Stefan Sauer (gstreamer, gtkdoc dev) 2010-07-12 09:09:52 UTC
I made a small test:
/**
 * bug_623968a:
 * 
 * <para>
 *   test
 * </para>
 * <refsect2>
 *   <title>subsect</title>
 *   <para>
 *     test
 *   </para>  
 * </refsect2>
 **/

and the culprit is the blank line before the doc body. The last release changed how gtk-doc handles some issues caused by it inserting "</para><para>" for blank lines. I'll check those changes now.
Comment 3 Stefan Sauer (gstreamer, gtkdoc dev) 2010-07-12 10:05:17 UTC
I probably never noticed, as in e.g. the gstreamer docs I do:

/**
 * bug_623968a:
 * 
 * test content. bla bla bla ....
 *
 * <refsect2>
 *   <title>subsect</title>
 *   <para>
 *     test
 *   </para>  
 * </refsect2>
 **/


commit 235669e75846078b78580e90332f6d55296f8110
Author: Stefan Kost <ensonic@users.sf.net>
Date:   Mon Jul 12 13:00:59 2010 +0300

    mkdb: better xml fixup after blank line expansion
    
    Our previous heuristics have been a bit too simple. Make it works for different
    refsect level and also docs starting with a para.
    
    Fixes #623968

commit af84db8455e3db55341ed0921d1083209a06c851
Author: Stefan Kost <ensonic@users.sf.net>
Date:   Mon Jul 12 12:59:58 2010 +0300

    tests: new test for bug #623968
Comment 4 Allison Karlitskaya (desrt) 2010-07-12 22:25:30 UTC
This broke the gio docs.
Comment 5 Stefan Sauer (gstreamer, gtkdoc dev) 2010-07-13 07:24:48 UTC
Ryan, can you be more specific? What's the error? Which symbol?
Comment 6 Allison Karlitskaya (desrt) 2010-07-13 13:10:21 UTC
I end up with these errors running git master against git master:

../xml/gfileattribute.xml:141: parser error : Opening and ending tag mismatch: refsect1 line 48 and para
</para>
       ^
../xml/gfileattribute.xml:229: parser error : Opening and ending tag mismatch: refentry line 7 and para
</para>
       ^
../xml/gfileattribute.xml:230: parser error : Extra content at the end of the document
</refsect1>


line 141, for example has:

</tbody>
</tgroup>
</table>
</para>
</para>            <--- this is 141
<para>
Please note that these are not all of the possible namespaces.
More namespaces can be added from GIO modules or by individual applications. 
For more information about writing GIO modules, see <link linkend="GIOModule"><type>GIOModule</type></link>.


of course, at the top of that paragraph is only one <para>

here's the source:
http://git.gnome.org/browse/glib/tree/gio/gfileattribute.c#n120

 * </tbody>
 * </tgroup>
 * </table>
 * </para>
 *                 <-- there is a trailing space on this line
 * Please note that these are not all of the possible namespaces.
 * More namespaces can be added from GIO modules or by individual applications. 
 * For more information about writing GIO modules, see #GIOModule.
Comment 7 Stefan Sauer (gstreamer, gtkdoc dev) 2010-07-13 14:51:37 UTC
I am not sure why this worked before, but the comment in gfileattribute.c is wrong:

 * 
 * Classes that implement #GFileIface will create a #GFileAttributeInfoList and 
 * install default keys and values for their given file system, architecture, 
 * and other possible implementation details (e.g., on a UNIX system, a file 
 * attribute key will be registered for the user id for a given file).
 * 
 * <para>
 * <table>

gtk-doc will insert </para><para> for each empty line. Either you omit the <para> before the <table>, or you wrap all the non-XML paragraphs above in one big <para>....</para> and remove the blank line.
Comment 8 Allison Karlitskaya (desrt) 2010-07-13 15:06:28 UTC
The problem is not that <para> is being nested (I think that's actually OK with docbook).

The problem is rather that there is a </para> without a <para>.

Regardless of what the source documentation says, as long as <para> and </para> tags are balanced, gtk-doc should not output XML that has unbalanced <para> and </para> tags.

The input is balanced.  At the start, <para><table>  at the end, </table></para>

But the gtk-doc output has at the start:

<para><table>

and at the end:

</table></para></para>



No matter how you slice it, that's gotta be a gtk-doc bug.
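
The balance argument above can be checked mechanically. The following is only a sketch in Python, not part of gtk-doc; the helper name `is_well_formed` and the synthetic `<root>` wrapper are assumptions made for illustration. It tests well-formedness only, not DocBook validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a root element.

    Hypothetical helper: the <root> wrapper exists only so that a fragment
    with several top-level elements can be parsed at all.
    """
    try:
        ET.fromstring("<root>" + fragment + "</root>")
        return True
    except ET.ParseError:
        return False


# Balanced input, as in the source comment.
source_like = "<para><table></table></para>"
# Output with one extra </para>, as in the generated XML.
output_like = "<para><table></table></para></para>"

print(is_well_formed(source_like), is_well_formed(output_like))
```

The extra closing tag makes the second fragment fail to parse, which is the same class of error xmllint reports as "Opening and ending tag mismatch".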
Comment 9 Stefan Sauer (gstreamer, gtkdoc dev) 2010-07-13 20:22:12 UTC
It should be fine again since I reverted the change. The 'algorithm' is as below:

1.) take the whole doc-body

bla1

bla2

2.) wrap it in <para>...</para>
<para>
bla1

bla2
</para>

3.) replace blank lines with </para><para>
<para>
bla1
</para><para>
bla2
</para>

This causes some problems:

bla1

<refsect2>bla2</refsect2>

becomes:
<para>
bla1
</para><para>
<refsect2>bla2</refsect2>
</para>

which is well-formed, but invalid, as a <refsect2> can't be part of a <para> :/
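
For illustration, the three steps can be sketched in a few lines of Python. This is not the gtkdoc-mkdb code (which is Perl); the function name `naive_expand` and the exact definition of a "blank line" (a line holding only spaces or tabs) are assumptions.

```python
import re


def naive_expand(doc_body: str) -> str:
    """Wrap the whole body in <para>...</para>, then turn every run of
    blank lines into a paragraph break.

    Illustrative sketch only; the whitespace rule is an assumption.
    """
    # Step 2: wrap the whole doc body.
    wrapped = "<para>\n" + doc_body.strip("\n") + "\n</para>"
    # Step 3: a newline followed by one or more whitespace-only lines
    # becomes "</para><para>".
    return re.sub(r"\n(?:[ \t]*\n)+", "\n</para><para>\n", wrapped)


# Plain text paragraphs come out fine.
print(naive_expand("bla1\n\nbla2"))
# A block-level element ends up inside a <para>.
print(naive_expand("bla1\n\n<refsect2>bla2</refsect2>"))
```

In the second call the <refsect2> lands inside the second <para>, so the result parses as XML but is not valid DocBook.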

Summa summarum: previous attempts to handle it were too naive. It's a bit tough, as one can put any DocBook into the docs and I can't even use an XML parser in Perl without adding dependencies :/

My current plan is to wrap chunk by chunk and only wrap chunks not already wrapped with xml tags into <para>...</para>.
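
A rough Python sketch of that chunk-by-chunk idea follows. It is an assumption about how such a pass could look, not the change that landed in gtk-doc. The name `wrap_chunks` and the test for "already wrapped" (the chunk starts with "<" and ends with ">") are hypothetical.

```python
import re


def wrap_chunks(doc_body: str) -> str:
    """Split the body at blank lines and wrap only plain-text chunks.

    Heuristic (assumption): a chunk that starts with "<" and ends with ">"
    is taken to be markup the author already wrapped, and is left alone.
    """
    chunks = re.split(r"\n(?:[ \t]*\n)+", doc_body.strip())
    out = []
    for chunk in chunks:
        text = chunk.strip()
        if not text:
            continue
        if text.startswith("<") and text.endswith(">"):
            out.append(text)  # already wrapped in XML tags
        else:
            out.append("<para>\n" + text + "\n</para>")
    return "\n".join(out)


print(wrap_chunks("bla1\n\n<refsect2>bla2</refsect2>"))
```

With this heuristic the <refsect2> stays outside any <para>. It has known gaps: a chunk consisting only of inline markup would be left unwrapped, and an XML block that contains blank lines would be split across chunks.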
Comment 10 Stefan Sauer (gstreamer, gtkdoc dev) 2010-09-09 07:43:42 UTC
Everybody, please try the git version and let me know if you are okay with it.
Comment 11 Stefan Sauer (gstreamer, gtkdoc dev) 2010-10-20 08:57:13 UTC
Please reopen if it is still an issue. And now is a good chance to try again, as I'd like to release 1.16.