Bug 337518 – GMarkup: Subparser support

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 337518 - GMarkup: Subparser support


Summary:	GMarkup: Subparser support


Status:	RESOLVED FIXED

Product:	glib
Classification:	Platform
Component:	general
Version:	2.10.x
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	gtkdev
QA Contact:	gtkdev

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2006-04-06 15:56 UTC by Johan (not receiving bugmail) Dahlin
Modified:	2008-07-10 08:45 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
patch (4.93 KB, patch) 2006-04-06 16:04 UTC, Johan (not receiving bugmail) Dahlin	none	Details \| Review
sub contexts (4.98 KB, patch) 2006-04-10 13:24 UTC, Johan (not receiving bugmail) Dahlin	none	Details \| Review
v3: use a map of stacks to store sub parsers (6.83 KB, patch) 2006-04-17 21:40 UTC, Johan (not receiving bugmail) Dahlin	none	Details \| Review
my idea (5.31 KB, text/x-csrc) 2006-04-19 02:07 UTC, Matthias Clasen		Details
add some subparser support (22.47 KB, patch) 2008-02-11 08:11 UTC, Allison Karlitskaya (desrt)	needs-work	Details \| Review
changes as requested (26.95 KB, patch) 2008-07-05 08:51 UTC, Allison Karlitskaya (desrt)	accepted-commit_now	Details \| Review
revised (25.93 KB, patch) 2008-07-10 08:44 UTC, Allison Karlitskaya (desrt)	committed	Details \| Review

Description Johan (not receiving bugmail) Dahlin 2006-04-06 15:56:35 UTC

GMarkup should provide a way of forcing passthrough so CDATA can be avoided to send fragments of a markup to another parser.

Main use case is GtkBuilder where everything between <ui>..</ui> should be sent to the parser in GtkUIManager.

Comment 1 Johan (not receiving bugmail) Dahlin 2006-04-06 16:04:12 UTC

Created attachment 62866 [details] [review]
patch

The patch is designed for the following use case:

1. force_passthrough is set in end_element
2. force_passthrough is unset in the passthrough handler

If the idea is not rejected, I'll write proper documentation.

Comment 2 Johan (not receiving bugmail) Dahlin 2006-04-06 16:08:35 UTC

Comment on attachment 62866 [details] [review]
patch

>Index: glib/gmarkup.c

>+void
>+g_markup_parse_context_force_passthrough (GMarkupParseContext *context,
>+					  gboolean             passthrough,
>+					  gboolean             remove_last)
>+{
>+  f (passthrough)

Typo, ignore this.

>+    {
>+      if (remove_last)
>+        {
>+	  g_free (context->tag_stack->data);
>+	  context->tag_stack = g_slist_delete_link (context->tag_stack,
>+						    context->tag_stack);

I just realized that this tag should probably be sent to the passthrough handler, aswell.

Comment 3 Matthias Clasen 2006-04-06 19:41:28 UTC

Can you give me a more complete example of how this would be used ?
I am not sure I get it. 

So, in the start handler for <ui>, you set force-passthrough, and then
everything up to </ui> gets pushed through the passthrough handler ?

I am not sure that I like that idea. Maybe it would be nicer to stack parsers
somehow, ie add some way to say: "when you meet a <ui> element, switch to 
parser foo" and then automatically pop the parser again when it is done 
parsing the </ui>. Not sure if thats feasible, though.

Comment 4 Johan (not receiving bugmail) Dahlin 2006-04-06 19:53:23 UTC

> Can you give me a more complete example of how this would be used ?

start_handler:
  if element == "ui"
    force_passthrough(TRUE)

passthrough_handler:
  if current_tag == "ui":
    if data.endswith("</ui>"):
      force_passthrough(FALSE)
      uimanager.add_ui_from_string (stored.data)
    else:
      stored.data += data

> So, in the start handler for <ui>, you set force-passthrough, and then
> everything up to </ui> gets pushed through the passthrough handler ?

Correct.

> I am not sure that I like that idea. Maybe it would be nicer to stack parsers
> somehow, ie add some way to say: "when you meet a <ui> element, switch to 
> parser foo" and then automatically pop the parser again when it is done 
> parsing the </ui>. Not sure if thats feasible, though.

I can have a look at that, but I wanted to try out this approach first, since 
it seemed simpler to implement.

Comment 5 Johan (not receiving bugmail) Dahlin 2006-04-10 13:24:34 UTC

Created attachment 63110 [details] [review]
sub contexts

This is another approach, it adds an api for assigning a "sub" context for a specific element. Only one sub context is supported and the original context will be restored when the end of the element is reached.

Comment 6 Matthias Clasen 2006-04-12 18:21:17 UTC

This goes in the right direction. I'd like to see some expansions here

- I think we should allow multiple subparsers, and have a map 
  tagname -> subparser. We obviously need a stack then to keep track
  of installed subparsers.
- Why does set_sub() take a context, and not just a GMarkupParser ? 
- I think set_sub() should handle overwriting (ie free an existing subparser
  first) and also allow resetting a subparser by accepting NULL

Comment 7 Johan (not receiving bugmail) Dahlin 2006-04-12 21:38:12 UTC

> - I think we should allow multiple subparsers, and have a map 
>   tagname -> subparser. We obviously need a stack then to keep track
>   of installed subparsers.

Do you want nested or multiple?
If multiple, the user is required to know in which the order of the tags show up, but not allow for nested. And if nested, there's no way of specifying a parser for tags at the same level (<a>..sub1..</a><b>..sub2..</b>).
It's quickly getting very complicated.

> - Why does set_sub() take a context, and not just a GMarkupParser ? 

It was a way of supporting nested parsers.

> - I think set_sub() should handle overwriting (ie free an existing subparser
>   first) and also allow resetting a subparser by accepting NULL

Good point, I'll address this.

Comment 8 Matthias Clasen 2006-04-13 03:54:12 UTC

I think a simple map + stack would work ok for both multiple and nesting.
The way I imagine it would work is as follows. Say you have three subparsers,
named A, B, and C, and a map as follows:

a -> A
b -> B
c -> C

then during the parsing of <a></a><b><b></b><c/></b> the stack of subparsers would 
look as follows:

1) A
2) B
3) B,B
4) C,B
5) B

ie you just blindly push and pop subparsers as they match the start and end
tags you see, and live with the fact that you may have the same parser repeatedly on the stack (this subparser idea is not really suitable for recursive elements anyway)

Comment 9 Johan (not receiving bugmail) Dahlin 2006-04-17 21:40:11 UTC

Created attachment 63742 [details] [review]
v3: use a map of stacks to store sub parsers

Comment 10 Matthias Clasen 2006-04-19 02:07:31 UTC

Created attachment 63840 [details]
my idea

Comment 11 Matthias Clasen 2006-04-19 02:11:09 UTC

I have attached a quick implementation of what I had in mind. 
The code shows that the nesting feature does not need any new
GMarkup api at all, and can easily be handled by the app - or
by GtkBuilder in this case. For GtkBuilder, the approach could
even be much simpler, since you are only interested in a single
tag, and don't need nesting/multiple subparsers at all.

I hope you are not upset that I made you write 3 different versions
of "nestable parsers" only to make up my mind and not want the feature
in GMarkup after all...

Comment 12 Johan (not receiving bugmail) Dahlin 2006-04-19 21:47:22 UTC

It's fine, I just wish you could have told me a bit earlier. Feel free to WONTFIX this bug if you want.

I know it's possible to do this entierly in the application, however the solution is not really as nice as having some limited support in GMarkup itself.
I don't really think it'll be necessary to do the whole map + stack thing for GtkBuilder, I'm a little worried about performance there for instance.

Comment 13 Matthias Clasen 2007-11-08 14:25:26 UTC

Since GtkBuilder is doing fine without this now, lets close this.

Comment 14 Allison Karlitskaya (desrt) 2008-02-11 08:10:16 UTC

after some thinking about this, i'd like to reopen the bug.

gvariant will want to have a markup subparser that various applications can invoke to parse an xml-formatted gvariant value.

these applications are going to want to have a very simple api to set this into motion.

see the attached test case for an example of how i plan to make use of this.

i've also given an example of a "replay" style handler (again, in the testcase) that passes the opening tag through to the subparser since this is more suitable for the sort of thing GtkBuilder wants to do...

cheers

Comment 15 Allison Karlitskaya (desrt) 2008-02-11 08:11:37 UTC

Created attachment 104898 [details] [review]
add some subparser support

Comment 16 Johan (not receiving bugmail) Dahlin 2008-06-02 15:52:35 UTC

This would be nice to have in GtkBuilder, to be able to remove some of the trickier parts of subparser support used there.

Comment 17 Havoc Pennington 2008-06-02 21:22:19 UTC

Comment on attachment 104898 [details] [review]
add some subparser support

My ultra-lame patch review comments:

* could cluster awaiting_pop with the other 1-bit fields and make it 1 bit

* the return value for pop() is not really documented (i.e. add "returns the user data passed to push")

Comment 18 Matthias Clasen 2008-07-03 04:36:48 UTC

Some more review (I agree with havocs points):


Is "awaiting_pop" any different from "held_user_data != NULL" ?


-                                 "to end the start tag of element '%s'"),
+                                 "to end the end tag of element '%s'"),

This is an independent bug fix, and should be committed separately.


The doc comments need Since tags.


The docs you wrote for the public api go to some length to explain how these functions must be used. But the text remains somewhat cryptic ("to implement a higher-level interface"). I strongly suggest that we need an example to show 
this in action.


The docs for push should mention pop as a way to free the user_data, where the user_data is discussed. The docs for push should probably also mention how
error handling works - do I still have to pop if an error occurred ?


+ * This call exists to collect the user_data allocated by a matching

s/call/function/



It would be nice to move the test to glib/tests and use our fancy new test framework instead of homegrown.

Comment 19 Allison Karlitskaya (desrt) 2008-07-05 08:51:26 UTC

Created attachment 114013 [details] [review]
changes as requested

Thanks for the review.

For the awaiting_pop thing: consider the most common value used for user_data: NULL.  I think in this case it still makes sense to have matching push/pop calls for consistency (ie: some rule about "you must not call pop if you gave NULL for user_data" would be fairly insane).

Comment 20 Matthias Clasen 2008-07-06 00:54:54 UTC

+  guint awaiting_pop;

You lost a : 1 here, I think.


 * <programlisting>

I've started to use the new gtk-doc shorthand |[ ]| for examples in doc comments.


 *   // else, handle other tags...

And please no C99 comments in examples.



With these changes, it looks good to commit. 

We should make GtkBuilder use it, too...

Comment 21 Allison Karlitskaya (desrt) 2008-07-06 07:31:33 UTC

Before I commit: how do I nest /* style */ comments inside of the documentation comment?  (this is why I used C99 style...)

Comment 22 Matthias Clasen 2008-07-07 14:26:21 UTC

oh, right. You can do  

/&ast;  foo  &ast;/

not exactly beautiful, I know...

Comment 23 Allison Karlitskaya (desrt) 2008-07-10 08:44:51 UTC

Created attachment 114291 [details] [review]
revised

Comment 24 Allison Karlitskaya (desrt) 2008-07-10 08:45:44 UTC

Committed revision 7174.

thanks