Bug 93597 - GMarkup should parse plain text not just beginning-with-tag
Status: RESOLVED WONTFIX
Product: glib
Classification: Platform
Component: general
Version: unspecified
Hardware: Other other
Importance: Normal enhancement
Target Milestone: ---
Assigned To: gtkdev
Depends on:
Blocks:
 
 
Reported: 2002-09-18 19:33 UTC by kz
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description kz 2002-09-18 19:27:08 UTC
Package: glib
Severity: major
Version: GNOME2.0.1 2.0.6
Synopsis: GMarkup should parse plain text not just beginning-with-tag
Bugzilla-Product: glib
Bugzilla-Component: general
Description:
GMarkupParseContext returns TRUE for a not-well-formed series of tags,
such as:
<font color=xxx><font size=2>plain_text
but the 'text' event for "plain_text" is never emitted.

It returns FALSE for plain text alone, such as:
another_plain_text
without calling any event handler.

I think GMarkup should parse plain text even when there is no tag.
If the string ends with plain text and no closing tag,
GMarkup should call the 'text' event for that text, as with "plain_text"
in the first example.

GMarkup is just another SAX-style parser, not a DOM for XML or some subset of it.
SAX has no rule about tag pairing;
it just calls start_elem, end_elem, text, etc. for each event.

GMarkup would be more useful if it handled plain text gracefully.
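
The SAX-style event model described above can be illustrated with a minimal sketch. It uses Python's standard xml.sax module as a stand-in, not GMarkup itself, so the class and function names (EventRecorder, record_events) are hypothetical, and the attribute values are quoted and the tags closed so that the input is well-formed XML.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Record SAX callbacks as (kind, value) tuples (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # A parser may deliver one run of text in several chunks; merge them.
        if self.events and self.events[-1][0] == "text":
            self.events[-1] = ("text", self.events[-1][1] + content)
        else:
            self.events.append(("text", content))


def record_events(document):
    """Parse a document string and return the recorded event list."""
    recorder = EventRecorder()
    xml.sax.parseString(document.encode("utf-8"), recorder)
    return recorder.events


# Well-formed variant of the first example: quoted attributes, closed tags.
SAMPLE = '<font color="xxx"><font size="2">plain_text</font></font>'

for kind, value in record_events(SAMPLE):
    print(kind, value)
```

With the closing tags present, the text callback fires for "plain_text" between the start and end events of the inner element.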




------- Bug moved to this database by unknown@bugzilla.gnome.org 2002-09-18 15:27 -------

Reassigning to the default owner of the component, gtkdev@gtk.org.

Comment 1 Owen Taylor 2002-09-19 16:58:47 UTC
I think it's very valuable to keep GMarkup as a strict XML
subset. No conforming XML parser can accept unpaired
begin-end tags.

Yes, some "XML" SAX parsers can be abused in this fashion,
but that doesn't mean that it's a correct usage.

If you want to parse, say, HTML, use an HTML parser.

(I'll let Havoc make the final decision, but, IMO,
it's a WONTFIX.)
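
As a rough illustration of the strictness argument, the sketch below feeds both of the reporter's inputs to a conforming XML parser. It assumes Python's standard xml.sax module (backed by expat) as the strict parser; it is not GMarkup, and is_well_formed is a hypothetical helper. Attribute values are quoted so that the only defect in the first input is the missing closing tags.

```python
import xml.sax


def is_well_formed(document):
    """Return True if a strict XML parser accepts the document string."""
    try:
        xml.sax.parseString(document.encode("utf-8"), xml.sax.ContentHandler())
    except xml.sax.SAXParseException:
        # The default error handler raises on any fatal (well-formedness) error.
        return False
    return True


UNCLOSED = '<font color="xxx"><font size="2">plain_text'
PLAIN = "another_plain_text"
CLOSED = '<font color="xxx"><font size="2">plain_text</font></font>'

for label, doc in (("unclosed", UNCLOSED), ("plain", PLAIN), ("closed", CLOSED)):
    print(label, is_well_formed(doc))
```

Only the variant with matching closing tags is accepted; the unclosed tags and the bare text are both reported as errors.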
Comment 2 kz 2002-09-19 17:14:48 UTC
As far as I understand, there's no definitive reason for GMarkup to be as
strict as full-scale XML. The reference documentation describes GMarkup as a
simplification of XML, which means it is just a minimal
markup language: opening/closing tags, name=value pairs for
attributes, etc.
Comment 3 Havoc Pennington 2002-09-19 18:41:09 UTC
I don't think it's valid to rely on gmarkup handling this text.
I think it would be fine if gmarkup happened to call the text handler
prior to returning an error, but don't consider it a bug that it
doesn't. If gmarkup returns an error you have a malformed document.
Comment 4 kz 2002-09-19 18:49:13 UTC
I've told another local hacker about this post.
He said that he agrees with me; there's a lot of potential if
not-static parsing is allowed.
What about making it optional?
Comment 5 Matthias Clasen 2002-09-19 18:58:59 UTC
GMarkup is only useful if it stays a subset of XML.
The world has enough homegrown pseudo-XML parsers already.
Comment 6 kz 2002-09-19 19:03:46 UTC
why should GMarkup be a strict XML parser?
Comment 7 Matthias Clasen 2002-09-19 19:19:35 UTC
a) this ensures that you can use other XML tools to process
the same data

b) it is documented as such
Comment 8 kz 2002-09-19 19:49:08 UTC
a) The reference says: "It should not be used if you expect to interoperate
with other applications generating full-scale XML."

b) a) contradicts b), and I think b) could possibly be changed if a) is true.
Comment 9 Owen Taylor 2002-09-19 20:38:34 UTC
A) You can parse GMarkup data files with an XML parser
B) You can parse any XML file with GMarkup 

Are *different*. Just because B) isn't true doesn't mean
that we should break A). Plus, we might eventually want to 
replace the GMarkup internals with a real XML parser.


Comment 10 Bugzilla Maintainers 2004-04-01 23:44:57 UTC
The URL field has been removed from bugzilla.gnome.org. This URL was in the old URL field, and is being added as a comment so that the data is not lost. Please email bugmaster@gnome.org if you have any questions.

URL: 
http://mail.gnome.org/archives/gtk-app-devel-list/2002-September/msg00261.html