GNOME Bugzilla – Bug 723417
New MarkDown parser
Last modified: 2014-02-04 21:22:58 UTC
The technique used by the previous markdown parser was becoming problematic as we added more elements to it. It attempted to perform a global replace for each type of tag, which is really challenging or impossible to get right for a few reasons. One is that, in many cases, the conversion is not performed on an entire section, since the cdata/code blocks are excluded. This is made worse because we rely so heavily on lookbehind and lookahead in the regular expressions. The one-pass global replace makes it really tough to account for any kind of hierarchy. It is also really hard to keep each regex from stomping on the signatures that the others attempt to detect. A better approach is to make a pass through the text and assemble a block model from it, and then walk over that model, recursing as needed, to convert it to docbook markup.
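To make the two-step idea concrete, here is a minimal sketch of it in Python. It is not the code from the attached patches: the function names (parse_blocks, render), the dict-based block model, and the tiny supported subset (paragraphs and "-"/"*" lists with two-space continuation) are all assumptions made for illustration.

```python
import re
from xml.sax.saxutils import escape


def parse_blocks(lines):
    """First pass: group raw lines into a list of block dicts (hypothetical model)."""
    blocks = []
    current = None  # the block that following lines may still extend
    for line in lines:
        if not line.strip():
            current = None  # a blank line closes whatever block was open
            continue
        item = re.match(r"^[-*]\s+(.*)$", line)
        if item:
            # Start a new list unless we are already inside one.
            if current is None or current["type"] != "list":
                current = {"type": "list", "items": []}
                blocks.append(current)
            current["items"].append([item.group(1)])
        elif current is not None and current["type"] == "list" and line.startswith("  "):
            # Indented line: belongs to the last list item, kept unparsed for now.
            current["items"][-1].append(line[2:])
        elif current is not None and current["type"] == "paragraph":
            current["lines"].append(line.strip())
        else:
            current = {"type": "paragraph", "lines": [line.strip()]}
            blocks.append(current)
    return blocks


def render(blocks):
    """Second pass: walk the model and emit DocBook, recursing into list items."""
    out = []
    for block in blocks:
        if block["type"] == "list":
            # Each item's lines are parsed again as blocks, which gives nesting.
            items = "".join(
                "<listitem>%s</listitem>" % render(parse_blocks(item))
                for item in block["items"]
            )
            out.append("<itemizedlist>%s</itemizedlist>" % items)
        else:
            out.append("<para>%s</para>" % escape(" ".join(block["lines"])))
    return "".join(out)


if __name__ == "__main__":
    text = "Intro text\nmore\n\n- one\n- two\n  - nested\n"
    print(render(parse_blocks(text.splitlines())))
```

Because each block type is recognised once, while the model is built, no later pattern can re-match text that an earlier one already claimed, and the hierarchy falls out of the recursion instead of out of lookbehind/lookahead tricks.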
Created attachment 267774 [details] [review] tests: add a more complete test for blocks in lists
Created attachment 267775 [details] [review] New MarkDown parser
Much more robust and complete MarkDown parser inspired by ParseDown http://parsedown.org/
Created attachment 267776 [details] [review] Add support for mixed markup and markdown
Created attachment 267961 [details] [review] New MarkDown parser
Much more robust and complete MarkDown parser inspired by ParseDown http://parsedown.org/
Created attachment 267962 [details] [review] Add support for mixed markup and markdown
Created attachment 268015 [details] [review] Add support for literal and link span type markdown
One tiny difference I noticed is that for this blob:

/**
 * SECTION:gstsimsyn
 * @title: GstBtSimSyn
 * @short_description: simple monophonic audio synthesizer
 *
 * Simple monophonic audio synthesizer with a decay envelope and a
 * state-variable filter.
 *
 * <refsect2>
 * <title>Example launch line</title>
 * |[
 * gst-launch simsyn num-buffers=1000 note="c-4" ! autoaudiosink
 * ]| Render a sine wave tone.
 * </refsect2>
 */

You are creating an extra empty paragraph (<p></p>) above the <div class="refsect2">. If I switch to markup it is gone again. When you output <p></p> - can you skip it if it is empty?
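One way the request above could be handled is to check the paragraph content before wrapping it. This is only a sketch in Python, not the actual fix: wrap_paragraph is a hypothetical helper, and it assumes the content has already been converted or escaped by the time it is wrapped.

```python
def wrap_paragraph(content):
    """Wrap content in <para>, or return nothing if it is empty or whitespace only."""
    text = content.strip()
    if not text:
        return ""  # skip the element entirely instead of emitting <para></para>
    return "<para>" + text + "</para>"


if __name__ == "__main__":
    print(repr(wrap_paragraph("  \n")))
    print(wrap_paragraph(" Render a sine wave tone. "))
```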
Created attachment 268092 [details] [review] Don't attempt to markup inside <> tags
Created attachment 268093 [details] [review] Use the markdown parser as the primary way of expanding gtkdoc markup
Created attachment 268094 [details] [review] Add test case for mixed docbook sections
Created attachment 268095 [details] [review] Add support for header ids in markdown
There seem to be some extra <p> elements created, some of them wrongly :/ e.g. in index.html:

-<span class="refentrytitle"><a href="GstBtSidSyn.html">GstBtSidSyn</a></span><span class="refpurpose"> — c64 sid synthesizer</span>
+<span class="refentrytitle"><a href="GstBtSidSyn.html">GstBtSidSyn</a></span><span class="refpurpose"> — <p>c64 sid synthesizer</p>
+</span>

Here we have the title wrapped in a <p> tag, and the closing </span> should not be on a new line, so that it does not add a ' ' that could affect styling.

 <h2><span class="refentrytitle"><a name="GstBtSimSyn.top_of_page"> </a>GstBtSimSyn</span></h2>
-<p>GstBtSimSyn — simple monophonic audio synthesizer</p>
+<p>GstBtSimSyn — <p>simple monophonic audio synthesizer</p>
+</p>

Here we also have the title wrapped in a <p> tag, creating a nested <p> :/
Attachment 267774 [details] pushed as 53fee9d - tests: add a more complete test for blocks in lists
Attachment 267961 [details] pushed as 973687e - New MarkDown parser
Attachment 267962 [details] pushed as 01eea27 - Add support for mixed markup and markdown
Attachment 268015 [details] pushed as 808232d - Add support for literal and link span type markdown
Attachment 268092 [details] pushed as 758ebc5 - Don't attempt to markup inside <> tags
Attachment 268093 [details] pushed as b22c91c - Use the markdown parser as the primary way of expanding gtkdoc markup
Attachment 268094 [details] pushed as b88fae9 - Add test case for mixed docbook sections
Attachment 268095 [details] pushed as ef1cdae - Add support for header ids in markdown