GNOME Bugzilla – Bug 93597
GMarkup should parse plain text not just beginning-with-tag
Last modified: 2004-12-22 21:47:04 UTC
Package: glib
Severity: major
Version: GNOME2.0.1 2.0.6
Synopsis: GMarkup should parse plain text not just beginning-with-tag
Bugzilla-Product: glib
Bugzilla-Component: general

Description:
GMarkupParseContext returns TRUE for a not-well-formed series of tags such as:

<font color=xxx><font size=2>plain_text

but the 'text' event for "plain_text" is never delivered. It returns FALSE for plain text alone, such as:

another_plain_text

without calling any event handler.

I think GMarkup should parse plain text even when there is no tag. If the string ends with plain text and no closing tag, GMarkup should call the 'text' event for that text, as with "plain_text" in the first example.

GMarkup is yet another SAX parser, not a DOM for XML or some subset of it. SAX has no rule for tag pairing; it just calls start_elem, end_elem, text, etc. for each event. GMarkup would be more useful if it handled plain text gracefully.

------- Bug moved to this database by unknown@bugzilla.gnome.org 2002-09-18 15:27 -------

Reassigning to the default owner of the component, gtkdev@gtk.org.
I think it's very valuable to keep GMarkup as a strict XML subset. No conforming XML parser can accept unpaired begin-end tags. Yes, some "XML" SAX parsers can be abused in this fashion, but that doesn't mean that it's a correct usage. If you want to parse, say, HTML, use an HTML parser. (I'll let Havoc make the final decision, but, IMO, it's a WONTFIX.)
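To make the point about conforming parsers concrete, here is a minimal sketch that feeds the two inputs from the report to Python's expat bindings. This is a stand-in for "a conforming XML parser", not GMarkup itself, so it says nothing about what GMarkup returns. The helper name check_well_formed is hypothetical, and the attribute values are quoted (unlike in the report) so that the missing close tags are the only defect in the first sample.

```python
from xml.parsers import expat


def check_well_formed(document):
    """Return (ok, message) after feeding a complete document to expat."""
    parser = expat.ParserCreate()
    try:
        # The second argument marks this chunk as the end of the input,
        # so unclosed elements are reported instead of waiting for more data.
        parser.Parse(document, True)
    except expat.ExpatError as exc:
        return False, str(exc)
    return True, ""


samples = [
    "<font color='xxx'><font size='2'>plain_text",                # tags never closed
    "another_plain_text",                                         # no element at all
    "<font color='xxx'><font size='2'>plain_text</font></font>",  # well-formed
]

for sample in samples:
    ok, message = check_well_formed(sample)
    print(repr(sample), "->", "accepted" if ok else "rejected: " + message)
```

Only the third sample is accepted; the first two are rejected with a parse error.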
As far as I understand, there's no definitive reason for GMarkup to be as strict as full-scale XML. The reference documentation describes GMarkup as a simplification of XML, which means it is just a minimal markup language: opening/closing tags, name=value pairs for attributes, etc.
I don't think it's valid to rely on gmarkup handling this text. I think it would be fine if gmarkup happened to call the text handler prior to returning an error, but don't consider it a bug that it doesn't. If gmarkup returns an error you have a malformed document.
I've talked to another local hacker about this post. He agrees with me: there's a lot of potential if not-static parsing is allowed. What about making it optional?
GMarkup is only useful if it stays a subset of XML. The world has enough homegrown pseudo-XML parsers already.
Why should GMarkup be a strict XML parser?
a) This ensures that you can use other XML tools to process the same data.
b) It is documented as such.
a) The reference says: "It should not be used if you expect to interoperate with other applications generating full-scale XML."
b) My a) contradicts your b), and I think b) could be changed if a) is true.
A) You can parse GMarkup data files with an XML parser
B) You can parse any XML file with GMarkup

are *different*. Just because B) isn't true doesn't mean that we should break A). Plus, we might eventually want to replace the GMarkup internals with a real XML parser.
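As a rough illustration of property A), the sketch below reads a small data file with a stock XML tool, Python's xml.etree.ElementTree. The file contents are hypothetical and assume an application that writes only paired tags and quoted name=value attributes. It does not exercise GMarkup and shows nothing about B).

```python
import xml.etree.ElementTree as ET

# Hypothetical data file restricted to paired tags and quoted attributes.
DATA = """<settings>
  <font color="red" size="2">plain_text</font>
</settings>"""

root = ET.fromstring(DATA)   # raises ET.ParseError if DATA is not well-formed
font = root.find("font")     # first <font> child of <settings>
print(font.attrib, font.text)
```

If the writer emitted unpaired tags or bare top-level text, ET.fromstring would raise ET.ParseError instead, which is the breakage of A) the comment above warns about.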
The URL field has been removed from bugzilla.gnome.org. This URL was in the old URL field, and is being added as a comment so that the data is not lost. Please email bugmaster@gnome.org if you have any questions. URL: http://mail.gnome.org/archives/gtk-app-devel-list/2002-September/msg00261.html