GNOME Bugzilla – Bug 746162
formulas creating invalid HTML code
Last modified: 2016-01-22 04:54:03 UTC
Created attachment 299331
doxygen project to reproduce the issue

A doxygen comment like

    /** \mainpage <center> \f$a\f$ b \f[c \f] </center> */

generates HTML code that does not validate:

    <div class="textblock"><center>
    <img class="formulaInl" alt="$a$" src="form_0.png"/> b
    </p><p class="formulaDsp">
    <img class="formulaDsp" alt="\[c \]" src="form_1.png"/>
    </p>
    </center>
    </div>

That is, after "b" is printed, doxygen adds a "</p>" although there is no opening "<p>".

If I change the code to

    /** \mainpage <center> \f$a\f$ b \f[c \f] </center> */

the generated HTML code is valid:

    <div class="textblock"><center>
    <img class="formulaInl" alt="$a$" src="form_0.png"/> b</center><center>
    <p class="formulaDsp">
    <img class="formulaDsp" alt="\[c \]" src="form_1.png"/>
    </p>
    </center>
    </div>
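To make the complaint concrete, here is a minimal sketch of a tag-balance check that flags the stray "</p>" in the first output and accepts the second. It is not a real HTML validator: the function name firstUnmatchedClose is made up for this illustration, attributes are ignored, and HTML's optional-end-tag rules are not modelled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the name of the first closing tag that does
// not match the most recently opened element, or "" if every closing tag
// matches. Tags ending in "/>" (such as <img .../>) are treated as
// self-closing and skipped.
std::string firstUnmatchedClose(const std::string &html)
{
    std::vector<std::string> open;   // stack of currently open element names
    std::size_t pos = 0;
    while ((pos = html.find('<', pos)) != std::string::npos) {
        std::size_t end = html.find('>', pos);
        if (end == std::string::npos) break;          // truncated tag: stop
        std::string tag = html.substr(pos + 1, end - pos - 1);
        pos = end + 1;
        if (tag.empty() || tag.back() == '/') continue;   // "<>" or self-closing
        bool closing = (tag[0] == '/');
        std::string name = tag.substr(closing ? 1 : 0);
        name = name.substr(0, name.find_first_of(" \t\n"));  // drop attributes
        if (!closing) {
            open.push_back(name);
        } else if (open.empty() || open.back() != name) {
            return name;                              // close without matching open
        } else {
            open.pop_back();
        }
    }
    return "";
}

// Usage sketch with abbreviated versions of the two outputs from the report.
void checkReportedOutputs()
{
    assert(firstUnmatchedClose(
        "<div><center> b </p><p> </p> </center> </div>") == "p");
    assert(firstUnmatchedClose(
        "<div><center> b</center><center> <p> </p> </center> </div>").empty());
}
```

In the first output the "</p>" arrives while "center" is the innermost open element, which is exactly the mismatch described above.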
Our workarounds for issues that seem to be related to this bug are growing. Can you give a hint where to start looking in the doxygen source if I were trying to come up with a fix for this?
The </p> is written in htmldocvisitor.cpp in the routine forceEndParagraph, which is called from HtmlDocVisitor::visit(DocFormula *f), so this might be the starting point. There is probably also a relation with void HtmlDocVisitor::visit(DocStyleChange *s), specifically the case DocStyleChange::Center.
Thanks for the hints. It's good to know where the "</p>" comes from. So, forceEndParagraph(), which writes the </p>, does so if (n->parent() && n->parent()->kind()==DocNode::Kind_Para) and certain other conditions on the children of n are satisfied. Here, n is the node for the "\f[c \f]" formula, and n->parent() is a node of kind Kind_Para that is actually processed before the DocStyleChange node that holds the "<center>" marker. So doxygen does not seem to build a hierarchy for the input "<center>\f$a\f$ b \f[c \f]</center>". I mean, why isn't the DocStyleChange node ("<center>") the parent of the nodes for "\f$a\f$", "b", and "\f[c \f]"? At the moment, it just puts in a "</p>" and does not take into account that it has opened a "<center>" before. Sorry, this isn't the "ok, I have a quick look and come up with a patch" reply that I had hoped for.
I considered making the style change nodes part of the hierarchy, but this makes the parser a lot more complex. Instead, I decided to keep the style changes as "delta" nodes, and add a bit more complexity to the HTML output to produce proper output also in this case. Although this is an improvement, it is not perfect. If you, for instance, use <b>..</b> across commands that need to be outside of a <p>..</p>, then one can still produce invalid HTML output. In the ideal situation the user would be presented with a warning from the parser, or the bold section would nicely be broken up into parts.
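The "delta node" approach can be illustrated with a small sketch. This is not doxygen's actual code: Kind, Node and render are hypothetical names, and the model assumes only one style (center) and one command that must sit outside a paragraph (a display formula). The point it shows is that the writer keeps track of what is open, so a "</p>" is only emitted for a "<p>" that was really written.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node kinds; doxygen's real DocNode hierarchy is much richer.
enum class Kind { CenterOn, CenterOff, Text, DisplayFormula };

struct Node {
    Kind kind;
    std::string text;   // used by Text and DisplayFormula only
};

// Renders a flat list in which style changes are "delta" nodes (on/off
// markers) rather than parents of the nodes they span.
std::string render(const std::vector<Node> &nodes)
{
    std::string out;
    bool paraOpen = false;     // is a <p> currently open?
    bool centerOpen = false;   // is a <center> currently open?

    // Close the paragraph only if one was actually opened.
    auto endPara = [&] { if (paraOpen) { out += "</p>"; paraOpen = false; } };

    for (const Node &n : nodes) {
        switch (n.kind) {
        case Kind::CenterOn:
            endPara();                     // <center> is block-level
            out += "<center>";
            centerOpen = true;
            break;
        case Kind::CenterOff:
            endPara();
            if (centerOpen) { out += "</center>"; centerOpen = false; }
            break;
        case Kind::Text:
            // Inside <center> the text is written without a wrapping <p>.
            if (!paraOpen && !centerOpen) { out += "<p>"; paraOpen = true; }
            out += n.text;
            break;
        case Kind::DisplayFormula:
            endPara();                     // no stray </p> if nothing is open
            out += "<p class=\"formulaDsp\">" + n.text + "</p>";
            break;
        }
    }
    endPara();
    if (centerOpen) out += "</center>";
    return out;
}

// Usage sketch: the shape of the input from the original report.
void demoCenteredFormula()
{
    std::vector<Node> nodes = {
        {Kind::CenterOn, ""}, {Kind::Text, "b "},
        {Kind::DisplayFormula, "c"}, {Kind::CenterOff, ""}};
    assert(render(nodes) == "<center>b <p class=\"formulaDsp\">c</p></center>");
}
```

A style such as <b> would need more than a flag here, because an inline element cannot simply stay open across a block-level command; that is the remaining case described above.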
This bug was previously marked ASSIGNED, which means it should be fixed in doxygen version 1.8.11. Please verify if this is indeed the case. Reopen the bug if you think it is not fixed and please include any additional information that you think can be relevant (preferably in the form of a self-contained example).
Thank you very much. From a quick look, it seems that the new doxygen has this fixed, indeed.