GNOME Bugzilla – Bug 337518
GMarkup: Subparser support
Last modified: 2008-07-10 08:45:44 UTC
GMarkup should provide a way of forcing passthrough so CDATA can be avoided to send fragments of a markup to another parser. Main use case is GtkBuilder where everything between <ui>..</ui> should be sent to the parser in GtkUIManager.
Created attachment 62866 [details] [review] patch The patch is designed for the following use case: 1. force_passthrough is set in end_element 2. force_passthrough is unset in the passthrough handler If the idea is not rejected, I'll write proper documentation.
Comment on attachment 62866 [details] [review] patch >Index: glib/gmarkup.c >+void >+g_markup_parse_context_force_passthrough (GMarkupParseContext *context, >+ gboolean passthrough, >+ gboolean remove_last) >+{ >+ f (passthrough) Typo, ignore this. >+ { >+ if (remove_last) >+ { >+ g_free (context->tag_stack->data); >+ context->tag_stack = g_slist_delete_link (context->tag_stack, >+ context->tag_stack); I just realized that this tag should probably be sent to the passthrough handler, aswell.
Can you give me a more complete example of how this would be used ? I am not sure I get it. So, in the start handler for <ui>, you set force-passthrough, and then everything up to </ui> gets pushed through the passthrough handler ? I am not sure that I like that idea. Maybe it would be nicer to stack parsers somehow, ie add some way to say: "when you meet a <ui> element, switch to parser foo" and then automatically pop the parser again when it is done parsing the </ui>. Not sure if thats feasible, though.
> Can you give me a more complete example of how this would be used ? start_handler: if element == "ui" force_passthrough(TRUE) passthrough_handler: if current_tag == "ui": if data.endswith("</ui>"): force_passthrough(FALSE) uimanager.add_ui_from_string (stored.data) else: stored.data += data > So, in the start handler for <ui>, you set force-passthrough, and then > everything up to </ui> gets pushed through the passthrough handler ? Correct. > I am not sure that I like that idea. Maybe it would be nicer to stack parsers > somehow, ie add some way to say: "when you meet a <ui> element, switch to > parser foo" and then automatically pop the parser again when it is done > parsing the </ui>. Not sure if thats feasible, though. I can have a look at that, but I wanted to try out this approach first, since it seemed simpler to implement.
Created attachment 63110 [details] [review] sub contexts This is another approach, it adds an api for assigning a "sub" context for a specific element. Only one sub context is supported and the original context will be restored when the end of the element is reached.
This goes in the right direction. I'd like to see some expansions here - I think we should allow multiple subparsers, and have a map tagname -> subparser. We obviously need a stack then to keep track of installed subparsers. - Why does set_sub() take a context, and not just a GMarkupParser ? - I think set_sub() should handle overwriting (ie free an existing subparser first) and also allow resetting a subparser by accepting NULL
> - I think we should allow multiple subparsers, and have a map > tagname -> subparser. We obviously need a stack then to keep track > of installed subparsers. Do you want nested or multiple? If multiple, the user is required to know in which the order of the tags show up, but not allow for nested. And if nested, there's no way of specifying a parser for tags at the same level (<a>..sub1..</a><b>..sub2..</b>). It's quickly getting very complicated. > - Why does set_sub() take a context, and not just a GMarkupParser ? It was a way of supporting nested parsers. > - I think set_sub() should handle overwriting (ie free an existing subparser > first) and also allow resetting a subparser by accepting NULL Good point, I'll address this.
I think a simple map + stack would work ok for both multiple and nesting. The way I imagine it would work is as follows. Say you have three subparsers, named A, B, and C, and a map as follows: a -> A b -> B c -> C then during the parsing of <a></a><b><b></b><c/></b> the stack of subparsers would look as follows: 1) A 2) B 3) B,B 4) C,B 5) B ie you just blindly push and pop subparsers as they match the start and end tags you see, and live with the fact that you may have the same parser repeatedly on the stack (this subparser idea is not really suitable for recursive elements anyway)
Created attachment 63742 [details] [review] v3: use a map of stacks to store sub parsers
Created attachment 63840 [details] my idea
I have attached a quick implementation of what I had in mind. The code shows that the nesting feature does not need any new GMarkup api at all, and can easily be handled by the app - or by GtkBuilder in this case. For GtkBuilder, the approach could even be much simpler, since you are only interested in a single tag, and don't need nesting/multiple subparsers at all. I hope you are not upset that I made you write 3 different versions of "nestable parsers" only to make up my mind and not want the feature in GMarkup after all...
It's fine, I just wish you could have told me a bit earlier. Feel free to WONTFIX this bug if you want. I know it's possible to do this entierly in the application, however the solution is not really as nice as having some limited support in GMarkup itself. I don't really think it'll be necessary to do the whole map + stack thing for GtkBuilder, I'm a little worried about performance there for instance.
Since GtkBuilder is doing fine without this now, lets close this.
after some thinking about this, i'd like to reopen the bug. gvariant will want to have a markup subparser that various applications can invoke to parse an xml-formatted gvariant value. these applications are going to want to have a very simple api to set this into motion. see the attached test case for an example of how i plan to make use of this. i've also given an example of a "replay" style handler (again, in the testcase) that passes the opening tag through to the subparser since this is more suitable for the sort of thing GtkBuilder wants to do... cheers
Created attachment 104898 [details] [review] add some subparser support
This would be nice to have in GtkBuilder, to be able to remove some of the trickier parts of subparser support used there.
Comment on attachment 104898 [details] [review] add some subparser support My ultra-lame patch review comments: * could cluster awaiting_pop with the other 1-bit fields and make it 1 bit * the return value for pop() is not really documented (i.e. add "returns the user data passed to push")
Some more review (I agree with havocs points): Is "awaiting_pop" any different from "held_user_data != NULL" ? - "to end the start tag of element '%s'"), + "to end the end tag of element '%s'"), This is an independent bug fix, and should be committed separately. The doc comments need Since tags. The docs you wrote for the public api go to some length to explain how these functions must be used. But the text remains somewhat cryptic ("to implement a higher-level interface"). I strongly suggest that we need an example to show this in action. The docs for push should mention pop as a way to free the user_data, where the user_data is discussed. The docs for push should probably also mention how error handling works - do I still have to pop if an error occurred ? + * This call exists to collect the user_data allocated by a matching s/call/function/ It would be nice to move the test to glib/tests and use our fancy new test framework instead of homegrown.
Created attachment 114013 [details] [review] changes as requested Thanks for the review. For the awaiting_pop thing: consider the most common value used for user_data: NULL. I think in this case it still makes sense to have matching push/pop calls for consistency (ie: some rule about "you must not call pop if you gave NULL for user_data" would be fairly insane).
+ guint awaiting_pop; You lost a : 1 here, I think. * <programlisting> I've started to use the new gtk-doc shorthand |[ ]| for examples in doc comments. * // else, handle other tags... And please no C99 comments in examples. With these changes, it looks good to commit. We should make GtkBuilder use it, too...
Before I commit: how do I nest /* style */ comments inside of the documentation comment? (this is why I used C99 style...)
oh, right. You can do /* foo */ not exactly beautiful, I know...
Created attachment 114291 [details] [review] revised
Committed revision 7174. thanks