After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 386558 - ASX parser shouldn't use libxml
ASX parser shouldn't use libxml
Status: RESOLVED FIXED
Product: totem-pl-parser
Classification: Core
Component: General
Old
Other Linux
: Normal normal
: ---
Assigned To: General Totem maintainer(s)
General Totem maintainer(s)
Depends on:
Blocks: 361531
 
 
Reported: 2006-12-16 17:27 UTC by Reinout van Schouwen
Modified: 2008-06-03 17:40 UTC
See Also:
GNOME target: ---
GNOME version: 2.17/2.18


Attachments
first attempt to replace libxml (22.63 KB, application/x-gzip)
2007-01-07 20:26 UTC, Patrick Ammann
  Details
totem-use-xine-xml-parser.patch (35.93 KB, patch)
2007-01-17 11:39 UTC, Bastien Nocera
none Details | Review
xine-lib-xml-parser-copy.patch (2.10 KB, patch)
2007-01-17 11:40 UTC, Bastien Nocera
none Details | Review
test.asx (1.01 KB, text/plain)
2007-01-17 11:42 UTC, Bastien Nocera
  Details
totem-use-xine-xml-parser-2.patch (36.89 KB, patch)
2007-01-17 12:30 UTC, Bastien Nocera
none Details | Review
xine-lib-xml-parser-copy-2.patch (3.37 KB, patch)
2007-01-17 12:32 UTC, Bastien Nocera
none Details | Review
totem-use-xine-xml-parser-3.patch (40.19 KB, patch)
2007-01-18 00:02 UTC, Bastien Nocera
none Details | Review

Description Reinout van Schouwen 2006-12-16 17:27:38 UTC
Go to http://www.radio2.nl/page/live and select either the 'smalband' or 'breedband' link.

Totem will try to play the stream but fails with a "the connection with the server is refused" error.

This seems to be caused by the fact that the xml file is invalid; the URL in the playlist file is http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44&a=.asx while the plparser thinks it is http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44=.asx
Comment 1 Christian Persch 2006-12-16 17:29:50 UTC
Confirming.

This is the playlist:
<ASX version = "3.0">
        <Title>Radio 2</Title>
                <ENTRYREF href = "http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44&a=.asx" />
                <ENTRYREF href = "http://www.omroep.nl/live/radio2-breed.asx" />
</ASX>

which triggers these warnings:
Entity: line 3: parser error : EntityRef: expecting ';'
<ENTRYREF href = "http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44&a
                                                                               ^
Entity: line 3: parser error : EntityRef: expecting ';'
<ENTRYREF href = "http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44&a
                                                                               ^
added URI 'http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44=.asx' with title 'Radio 2' genre '(null)'
added URI 'http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44=.asx' with title 'Radio 2' genre '(null)'

We really shouldn't try parsing invalid 'xml' with libxml...
Comment 2 Bastien Nocera 2006-12-19 00:00:54 UTC
Yeah, ASX is a crappy 'XML' format.
Comment 3 Sebastien Bacher 2006-12-26 13:38:48 UTC
Ubuntu bug about that: https://launchpad.net/products/totem/+bug/70706
Comment 4 Patrick Ammann 2007-01-07 20:26:21 UTC
Created attachment 79676 [details]
first attempt to replace libxml

- replace libxml with xml parsing functions taken from xine-lib
- there is still one TODO
- i did only run a limited test for .asx others are not tested
  like siml, quicktime, etc..
Comment 5 Bastien Nocera 2007-01-07 22:46:26 UTC
Good work, but I'd really rather it used GMarkup for the XML parsing. This would avoid code duplication. Moreover, xine-lib is GPL, whereas the totem playlist parser is LGPL.
Comment 6 Christian Persch 2007-01-15 23:08:48 UTC
Another thing the new parser should tolerate is tags case mismatch, like on the playlists on http://www.pixoff.net/ :

Entity: line 8: parser error : Opening and ending tag mismatch: ASX line 1 and Asx
</Asx>
      ^
Comment 7 Christian Persch 2007-01-16 17:10:45 UTC
I don't think you can use GMarkup either. The docs say:

GMarkup is not guaranteed to signal an error on all invalid XML; the parser may accept documents that an XML parser would not. However, XML documents which are not well-formed[4] are not considered valid GMarkup documents.

and also that "Only UTF-8 encoding is allowed.".
Comment 8 Bastien Nocera 2007-01-17 00:00:02 UTC
Patrick, if you can do the leg work to get this xine-lib code relicensed under the LPGL, I'd be ready to include it in totem.
Comment 9 Patrick Ammann 2007-01-17 09:06:24 UTC
I will politely ask the xine folk.
Comment 10 Bastien Nocera 2007-01-17 11:39:43 UTC
Created attachment 80493 [details] [review]
totem-use-xine-xml-parser.patch

- Use straight copies of the xine-lib code (patch below against xine-lib CVS HEAD)
- Only make ASX use the xine-lib code (the other types are much more well behaved)
- Made it case-unsensitive
Comment 11 Bastien Nocera 2007-01-17 11:40:56 UTC
Created attachment 80494 [details] [review]
xine-lib-xml-parser-copy.patch

Patch to xine-lib to avoid errors when compiling the parser out-of-tree
Comment 12 Bastien Nocera 2007-01-17 11:42:16 UTC
Created attachment 80495 [details]
test.asx

A test file with:
- HTML comments
- Case-mismatched tags
- broken escaping of URLs
Comment 13 Bastien Nocera 2007-01-17 12:30:55 UTC
Created attachment 80497 [details] [review]
totem-use-xine-xml-parser-2.patch

Updated patch, xml_parser_get_property returns data directly from the tree, so don't free it. Mark more strings as const.
Comment 14 Bastien Nocera 2007-01-17 12:32:29 UTC
Created attachment 80498 [details] [review]
xine-lib-xml-parser-copy-2.patch

Mark the return of xml_parser_get_property as const, so people don't try to free it.
Comment 15 Patrick Ammann 2007-01-17 19:32:00 UTC
- i would rename all exported functions, as otherwise you may
  link to a function defined in xine-lib, in case you link
  with the xine-lib.

- in case we would use those xml parsing functions for all
  playlists we could drop the libxml dependency for totem.
Comment 16 Bastien Nocera 2007-01-17 19:36:04 UTC
(In reply to comment #15)
> - i would rename all exported functions, as otherwise you may
>   link to a function defined in xine-lib, in case you link
>   with the xine-lib.

We should actually only export the functions that are public in libtotem-plparser.so. That's easy enough to fix.

> - in case we would use those xml parsing functions for all
>   playlists we could drop the libxml dependency for totem.

gnome-vfs relies on libxml, so does libglade, so it wouldn't be gone yet. I also trust libxml2 more than xine-lib's xml parser for proper XML files.
Comment 17 Bastien Nocera 2007-01-18 00:02:09 UTC
Created attachment 80558 [details] [review]
totem-use-xine-xml-parser-3.patch

- Restrict the exported symbols to the plparser public API
Comment 18 Bastien Nocera 2007-03-04 16:25:47 UTC
2007-03-04  Bastien Nocera  <hadess@hadess.net>

        * src/plparse/test-parser.c: (main):
        Call g_thread_init() very early to avoid the GSlice warning

        * src/plparse/xmllexer.c:
        * src/plparse/xmllexer.h:
        * src/plparse/xmlparser.c:
        * src/plparse/xmlparser.h: Add a copy of the xine-lib XML parser
        under the LGPL

        * src/plparse/Makefile.am:
        * src/plparse/totem-pl-parser-wm.c: (parse_asx_entry),
        (parse_asx_entries), (totem_pl_parser_add_asx):
        * src/plparse/totem-pl-parser.c:
        Use the xine-lib XML parser copy, to allow:
        - case mismatches
        - HTML comments
        - broken escaping in URLs

        (Closes: #386558)
Comment 19 Philip Withnall 2008-06-03 17:40:01 UTC
Mass-move from totem to totem-pl-parser. You can remove all messages by searching for this comment.