Bug 723417 - New MarkDown parser
Status: RESOLVED FIXED
Product: gtk-doc
Classification: Platform
Component: general
Version: unspecified
Hardware: Other All
Importance: Normal normal
Target Milestone: 1.20
Assigned To: gtk-doc maintainers
QA Contact: gtk-doc maintainers
Depends on:
Blocks:
 
 
Reported: 2014-02-01 10:14 UTC by William Jon McCann
Modified: 2014-02-04 21:22 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
tests: add a more complete test for blocks in lists (1.04 KB, patch)
2014-02-01 10:14 UTC, William Jon McCann
committed
New MarkDown parser (17.01 KB, patch)
2014-02-01 10:14 UTC, William Jon McCann
none
Add support for mixed markup and markdown (3.25 KB, patch)
2014-02-01 10:15 UTC, William Jon McCann
none
New MarkDown parser (17.01 KB, patch)
2014-02-03 15:09 UTC, William Jon McCann
committed
Add support for mixed markup and markdown (3.58 KB, patch)
2014-02-03 15:09 UTC, William Jon McCann
committed
Add support for literal and link span type markdown (4.68 KB, patch)
2014-02-03 22:56 UTC, William Jon McCann
committed
Don't attempt to markup inside <> tags (2.23 KB, patch)
2014-02-04 20:16 UTC, William Jon McCann
committed
Use the markdown parser as the primary way of expanding gtkdoc markup (21.91 KB, patch)
2014-02-04 20:16 UTC, William Jon McCann
committed
Add test case for mixed docbook sections (783 bytes, patch)
2014-02-04 20:16 UTC, William Jon McCann
committed
Add support for header ids in markdown (3.01 KB, patch)
2014-02-04 20:17 UTC, William Jon McCann
committed

Description William Jon McCann 2014-02-01 10:14:48 UTC
The technique used by the previous markdown parser was becoming problematic
as we added more elements to it. It attempted to perform a global replace
for each type of tag, which is really challenging or impossible to get right
for a few reasons. One is that in many cases the conversion is not performed
on an entire section, since the cdata/code blocks are excluded. This is
made worse because we rely so heavily on lookbehind and lookahead in the
regular expressions. The one-pass global replace also makes it really tough
to account for any kind of hierarchy, and it is really hard to keep
each regex from stomping on the signatures that the others attempt to detect.

A better approach is to perform a pass through the text and assemble a
block model from it, and then walk over that model, recursing as needed,
to convert it to DocBook markup.
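
To illustrate the two-pass idea, here is a minimal sketch in Python. It is not the code from the attached patches: the function names parse_blocks and render are hypothetical, and only paragraphs and bullet lists are handled. The first pass builds the block model, the second walks it and recurses into list items.

```python
import re
from xml.sax.saxutils import escape

LIST_ITEM = re.compile(r"^[-*+] +(.*)$")


def parse_blocks(lines):
    """Pass 1: build a block model (a list of dicts) from raw lines."""
    blocks = []
    open_block = None  # block that may still absorb following lines
    for line in lines:
        if not line.strip():
            # A blank line ends a paragraph. A list stays open so that
            # indented text after the blank still belongs to its last item.
            if open_block and open_block["type"] == "paragraph":
                open_block = None
            elif open_block and open_block["type"] == "list":
                open_block["items"][-1].append("")
            continue
        item = LIST_ITEM.match(line)
        in_list = bool(open_block) and open_block["type"] == "list"
        if item:
            if not in_list:
                open_block = {"type": "list", "items": []}
                blocks.append(open_block)
            open_block["items"].append([item.group(1)])
        elif in_list and line.startswith("  "):
            # Indented continuation: keep the raw text for the recursive pass.
            open_block["items"][-1].append(line[2:])
        elif open_block and open_block["type"] == "paragraph":
            open_block["lines"].append(line)
        else:
            open_block = {"type": "paragraph", "lines": [line]}
            blocks.append(open_block)
    return blocks


def render(blocks):
    """Pass 2: walk the model and emit DocBook, recursing into list items."""
    out = []
    for block in blocks:
        if block["type"] == "paragraph":
            out.append("<para>%s</para>" % escape(" ".join(block["lines"])))
        elif block["type"] == "list":
            out.append("<itemizedlist>")
            for item_lines in block["items"]:
                # Each item is itself a small document: parse it again.
                inner = render(parse_blocks(item_lines))
                out.append("<listitem>%s</listitem>" % inner)
            out.append("</itemizedlist>")
    return "".join(out)
```

Because every list item is re-parsed as its own block sequence, nested lists and multi-paragraph items fall out of the recursion instead of needing a dedicated regex each.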
Comment 1 William Jon McCann 2014-02-01 10:14:50 UTC
Created attachment 267774
tests: add a more complete test for blocks in lists
Comment 2 William Jon McCann 2014-02-01 10:14:59 UTC
Created attachment 267775
New MarkDown parser

Much more robust and complete MarkDown parser inspired by
ParseDown http://parsedown.org/
Comment 3 William Jon McCann 2014-02-01 10:15:02 UTC
Created attachment 267776
Add support for mixed markup and markdown
Comment 4 William Jon McCann 2014-02-03 15:09:18 UTC
Created attachment 267961
New MarkDown parser

Much more robust and complete MarkDown parser inspired by
ParseDown http://parsedown.org/
Comment 5 William Jon McCann 2014-02-03 15:09:22 UTC
Created attachment 267962
Add support for mixed markup and markdown
Comment 6 William Jon McCann 2014-02-03 22:56:06 UTC
Created attachment 268015
Add support for literal and link span type markdown
Comment 7 Stefan Sauer (gstreamer, gtkdoc dev) 2014-02-04 08:10:14 UTC
One tiny difference I noticed is that for this blob:
/**
 * SECTION:gstsimsyn
 * @title: GstBtSimSyn
 * @short_description: simple monophonic audio synthesizer
 *
 * Simple monophonic audio synthesizer with a decay envelope and a
 * state-variable filter.
 *
 * <refsect2>
 * <title>Example launch line</title>
 * |[
 * gst-launch simsyn num-buffers=1000 note="c-4" ! autoaudiosink
 * ]| Render a sine wave tone.
 * </refsect2>
 */

You are creating an extra empty paragraph (<p></p>) above the <div class="refsect2">. If I switch to markup it is gone again. When you output <p></p>, can you skip it if it is empty?
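
A sketch of what I mean, in Python for illustration only. The helper name render_paragraph is hypothetical and not taken from the patch; it assumes the paragraph text arrives as a list of lines and that an empty <para> is what ends up as <p></p> in the HTML.

```python
def render_paragraph(lines):
    """Return a DocBook <para> for the lines, or '' when there is no text."""
    # Collapse each line's surrounding whitespace, then check what is left.
    text = " ".join(line.strip() for line in lines).strip()
    if not text:
        # Nothing but whitespace: emit no element at all.
        return ""
    return "<para>%s</para>" % text
```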
Comment 8 William Jon McCann 2014-02-04 20:16:53 UTC
Created attachment 268092
Don't attempt to markup inside <> tags
Comment 9 William Jon McCann 2014-02-04 20:16:57 UTC
Created attachment 268093
Use the markdown parser as the primary way of expanding gtkdoc markup
Comment 10 William Jon McCann 2014-02-04 20:16:59 UTC
Created attachment 268094
Add test case for mixed docbook sections
Comment 11 William Jon McCann 2014-02-04 20:17:02 UTC
Created attachment 268095
Add support for header ids in markdown
Comment 12 Stefan Sauer (gstreamer, gtkdoc dev) 2014-02-04 20:57:18 UTC
Some extra <p> elements seem to be created, and some of them are wrong :/

e.g. in index.html
-<span class="refentrytitle"><a href="GstBtSidSyn.html">GstBtSidSyn</a></span><span class="refpurpose"> — c64 sid synthesizer</span>
+<span class="refentrytitle"><a href="GstBtSidSyn.html">GstBtSidSyn</a></span><span class="refpurpose"> — <p>c64 sid synthesizer</p>
+</span>

Here we have the title wrapped in a <p> tag, and the closing span should not be on a new line, so that it does not add a ' ' to any styling.


<h2><span class="refentrytitle"><a name="GstBtSimSyn.top_of_page"> </a>GstBtSimSyn</span></h2>
-<p>GstBtSimSyn — simple monophonic audio synthesizer</p>
+<p>GstBtSimSyn — <p>simple monophonic audio synthesizer</p>
+</p>

Here we also have the title wrapped in a <p> tag, creating a nested <p> :/
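
One way to avoid both problems would be to tell the expansion step whether it is filling an inline slot or a block slot. The following Python sketch is only an illustration of that idea: expand_markup and its inline flag are hypothetical names, not the actual gtk-doc code.

```python
from xml.sax.saxutils import escape


def expand_markup(text, inline=False):
    """Expand text for DocBook; wrap in <para> only in block context."""
    body = escape(" ".join(text.split()))
    if inline or not body:
        # Inline slots get the bare text: no wrapper, no trailing newline.
        return body
    return "<para>%s</para>\n" % body


def refpurpose(short_description):
    """Build the refpurpose element from a one-line description."""
    return "<refpurpose>%s</refpurpose>" % expand_markup(
        short_description, inline=True
    )
```

With inline=True the text is returned without a wrapper or a trailing newline, so the closing tag stays on the same line.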
Comment 13 William Jon McCann 2014-02-04 21:22:35 UTC
Attachment 267774 pushed as 53fee9d - tests: add a more complete test for blocks in lists
Attachment 267961 pushed as 973687e - New MarkDown parser
Attachment 267962 pushed as 01eea27 - Add support for mixed markup and markdown
Attachment 268015 pushed as 808232d - Add support for literal and link span type markdown
Attachment 268092 pushed as 758ebc5 - Don't attempt to markup inside <> tags
Attachment 268093 pushed as b22c91c - Use the markdown parser as the primary way of expanding gtkdoc markup
Attachment 268094 pushed as b88fae9 - Add test case for mixed docbook sections
Attachment 268095 pushed as ef1cdae - Add support for header ids in markdown