GNOME Bugzilla – Bug 665634
g_dbus_node_info_new_for_xml() errors on unknown attributes in XML files
Last modified: 2013-10-29 16:37:29 UTC
Running g_dbus_node_info_new_for_xml() on a D-Bus introspection XML file from the Telepathy specification gives errors such as: attribute 'tp:name-for-bindings' invalid for element 'method' The Telepathy specification files are all valid XML (since the extra attributes are namespaced correctly), so g_dbus_node_info_new_for_xml() should parse them correctly. Looking at gdbusintrospection.c, the offender is g_markup_collect_attributes() which errors when it encounters an unexpected attribute. I see several possible solutions, and none of them feel great: 1. Hackily add “tp:type” and “tp:name-for-bindings” to the list of attributes passed to g_markup_collect_attributes() in gdbusintrospection.c. This will work around the problem for the Telepathy specification, but is definitely a hack. 2. Stop using g_markup_collect_attributes() in gdbusintrospection.c, and manually inspect the attribute_names and attribute_values string arrays instead. This will end up duplicating a lot of the code in g_markup_collect_attributes(). 3. Add a new g_markup_collect_known_attributes() variant of g_markup_collect_attributes() which doesn't error when it encounters an unknown attribute, and use that in gdbusintrospection.c instead. This feels like the best solution to me. Any thoughts? If not, I'll try and put together a patch implementing (3).
(In reply to comment #0) > 3. Add a new g_markup_collect_known_attributes() variant of > g_markup_collect_attributes() which doesn't error when it encounters an unknown > attribute, and use that in gdbusintrospection.c instead. This feels like the > best solution to me. > > Any thoughts? If not, I'll try and put together a patch implementing (3). Yup, I agree extending GMarkup is the way forward.
For a number of reasons, I'm not sure I totally agree: 1) GMarkup isn't XML 2) Even if it were, this sort of attribute adding is supposed to be done in a different xml namespace (which we don't support).
Being wellformed XML is not the same thing as being valid according to a certain document type. I don't think we need to eat any kind of wellformed xml that might be interpretable as introspection xml.
(In reply to comment #2) > For a number of reasons, I'm not sure I totally agree: > > 1) GMarkup isn't XML By the same reasoning, then, GMarkup shouldn't be used to parse introspection XML because introspection XML _is_ XML. :-( > 2) Even if it were, this sort of attribute adding is supposed to be done in a > different xml namespace (which we don't support). To a first approximation, we could just ignore unknown attributes whose names contain a colon, then.
Note that gdbus-codegen handles correctly the telepathy spec by ignoring tags in the tp namespace.
(In reply to comment #5) > Note that gdbus-codegen handles correctly the telepathy spec by ignoring tags > in the tp namespace. That's because gdbus-codegen(1) is using one of the Python XML parsers. My view on the whole thing is rather simple - If it's not too much effort making GMarkup accept even more valid XML then we should make it. Let's be practical about it. - It's bad form if an app returns XML in Introspectable.Introspect() that has a lot of extra stuff that is not in the D-Bus spec. It's simply not the place for the app to return such information. Now, that's not to say you can't extend Introspection XML in your app - but if you do, please ensure your D-Bus binding is sanitizing if using it in Introspectable.Introspect(). (Yes, this is debatable and I'm sure some people will disagree. But at least it's my view.) - We actually don't use the D-Bus introspection XML parsing routines anywhere except for completion in the gdbus(1) tool (we do use the data structures to be able to build and return some XML). We should probably add a comment to the docs for g_dbus_node_info_new_for_xml() saying that it's using a GMarkup-based parser and as such may not parse all valid XML.
(In reply to comment #6) > - It's bad form if an app returns XML in Introspectable.Introspect() > that has a lot of extra stuff that is not in the D-Bus spec. It's simply > not the place for the app to return such information. > > Now, that's not to say you can't extend Introspection XML in your app - but > if you do, please ensure your D-Bus binding is sanitizing if using it in > Introspectable.Introspect(). Forgot to mention that gdbus-codegen(1) takes this approach - e.g. the XML you pass to gdbus-codegen(1) is not the same as returned in Introspectable.Introspect() - the XML returned is much simpler.
(In reply to comment #6) > - We actually don't use the D-Bus introspection XML parsing routines anywhere > except for completion in the gdbus(1) tool (we do use the data structures > to be able to build and return some XML). We should probably add a comment > to the docs for g_dbus_node_info_new_for_xml() saying that it's > using a GMarkup-based parser and as such may not parse all valid XML. I have added a comment, see http://git.gnome.org/browse/glib/commit/?id=990af4b3725faba230abc6c2e68c112da6f13b41
(In reply to comment #6) > My view on the whole thing is rather simple > > - If it's not too much effort making GMarkup accept even more valid XML > then we should make it. Let's be practical about it. This should be fairly simple, but I don't want to put a patch together until there's some sort of consensus on whether it's the right approach. As I've said, I think it is; and I think you've articulated my opinion better than I did.
Created attachment 205047 [details] [review] gmarkup: Add g_markup_collect_known_attributes() to ignore unknown attributes This patch takes approach #3. It adds a new g_markup_collect_known_attributes() (which shares almost all of its code with the original g_markup_collect_attributes()), and modifies the markup-collect test to check both functions. A second commit modifies the GDBus introspection code to use g_markup_collect_known_attributes().
I've reverted the patches for now. I'm almost certain I prefer a better solution to this problem (ie: fixing the other function to do the right thing). I think the right thing is to ignore attributes if they have ':' in the name, but it's clear that we need to have a discussion on this topic...
(In reply to comment #11) > I'm almost certain I prefer a better solution to this problem (ie: fixing the > other function to do the right thing). Care to elaborate? Note that in any case, the GDBus introspection code should probably ignore unrecognised attributes (whichever namespace they’re in) just for forwards-compatibility reasons.
My current line of thinking is that we always ignore (unknown) attributes with ':' in the name and provide an option to ignore all unknown attributes.
Created attachment 255039 [details] [review] Updated patch against current git I am not the original submitter, but the issue affects me as well, and I'd like to get the proposed patch accepted into glib. I am attaching a version of the patch updated against the current git. The changes to the original version are: * Don't touch glib/glib.symbols, as it's no longer there. * Put GLIB_AVAILABLE_IN_ALL into the declaration of g_markup_collect_known_attributes().
Created attachment 258260 [details] [review] GMarkup collect: allow unknown attributes with ':' If we have an unknown attribute with ':' in the name then don't complain. This will allow people to extend things like GSettings schemas and DBus introspection XML with custom data. I think it's probably sufficient to just always allow extra attributes with ':' in the name -- the intention of the user is pretty clear here.
Review of attachment 258260 [details] [review]: This looks fine to me, but note that it doesn’t address the issue of forwards-compatibility, where parsing will still fail on unknown attributes in the default namespace if an older parser is used to parse a newer document (which has new, extra attributes). ::: glib/gmarkup.c @@ +2816,3 @@ if (i == j) + { + /* Silently ignore attributes with ':' in the name */ It would be helpful to expand this comment to say _why_ unknown attributes are ignored. e.g. “To allow unrecognised and custom namespaces to be used.”
I actually have a better idea... Instead of shoving this down everyone's throats without asking (since we don't have a flags API here) we could add a GMarkupParseFlags entry for "don't report attributes with : in the name". I think I'd prefer that, because it would be strictly opt-in (and could also be useful to people who do not use the collect API).
Created attachment 258349 [details] [review] GMarkup: share some common code for closing tags The code for dealing with </foo> and the second half of <foo/> was largely duplicated. We can share a lot of it by using a common function. This slightly changes the behaviour of the parser under error circumstances: previously the parser would deal with '<foo/}' by first issuing the end_element callback and then flagging the error due to the unexpected character. Now we will flag the unexpected character error first, skipping the callback. This behaviour change required modifying the testsuite.
Created attachment 258350 [details] [review] GMarkup: add G_MARKUP_IGNORE_QUALIFIED Add a flag to GMarkupParserFlags to ignore qualified tags (along with their contents) and attributes. This will provide a nice way for some of our parsers (GDBus introspection, GSettings schema, etc) to ignore additional tags that users have added to their files, under a different namespace.
Created attachment 258351 [details] [review] tests: add a ignore-qualified markup-collect case Add a case to markup-collect that exercises the new IGNORE_QUALIFIED flag.
Review of attachment 258349 [details] [review]: Looks fine to me. Thanks for running make check proactively !
Review of attachment 258350 [details] [review]: ::: glib/gmarkup.c @@ +1019,3 @@ + if (context->flags & G_MARKUP_IGNORE_QUALIFIED && strchr (current_element (context), ':')) + { + static const GMarkupParser ignore_parser; Does the static serve any purpose here ? I fear it might cost us relocations.. although, given that this is all zeros, maybe not.
Review of attachment 258351 [details] [review]: ::: glib/tests/markup-collect.c @@ +101,1 @@ { "<bool ob='y'/>", "<bool(0) 0 0 -1>", Would be good to have a testcase with character data and non-qualified tags inside an ignored element, too.
Attachment 258349 [details] pushed as cbccbae - GMarkup: share some common code for closing tags Attachment 258350 [details] pushed as 474d915 - GMarkup: add G_MARKUP_IGNORE_QUALIFIED Attachment 258351 [details] pushed as 1bc9883 - tests: add a ignore-qualified markup-collect case
Added an extra testcase with nested tags and text. There is no relocation here because no pointers are being resolved and it does have to be static because that's how GMarkupParsers work... GDBus patch coming soon.
Created attachment 258357 [details] [review] GDBus: ignore qualified XML tags and attributes Ignore qualified (in the XML namespace sense) tags and attributes when parsing D-Bus introspection XML. This will allow people to add custom tags and attributes to their D-Bus interfaces without tripping up GDBus.
Attachment 258357 [details] pushed as 9933a9f - GDBus: ignore qualified XML tags and attributes
This commit causes PackageKit to segfault on startup. Looks like the parser isn't handling tested qualified tags correctly.
Created attachment 258445 [details] [review] 0001-tests-gdbus-introspection-Add-a-less-trivial-test-ca.patch
Created attachment 258452 [details] [review] GMarkup: clear attributes on ignorned tags Make sure that if we ignore a tag then we also clear the attributes that we already collected so that they don't end up on the next unignored tag opening. Also add some extra brackets for clarity (it doesn't make any difference -- I just think it reads nicer this way).
Review of attachment 258452 [details] [review]: Just a typo ignorned -> ignored in commit message. Can you push my test case after or squash it, up to you?
Attachment 258452 [details] pushed as daff84e - GMarkup: clear attributes on ignorned tags