GNOME Bugzilla – Bug 386558
ASX parser shouldn't use libxml
Last modified: 2008-06-03 17:40:01 UTC
Go to http://www.radio2.nl/page/live and select either the 'smalband' or 'breedband' link. Totem will try to play the stream but fails with a "the connection with the server is refused" error. This seems to be caused by the fact that the xml file is invalid; the URL in the playlist file is http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44&a=.asx while the plparser thinks it is http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44=.asx
Confirming. This is the playlist: <ASX version = "3.0"> <Title>Radio 2</Title> <ENTRYREF href = "http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44&a=.asx" /> <ENTRYREF href = "http://www.omroep.nl/live/radio2-breed.asx" /> </ASX> which triggers these warnings: Entity: line 3: parser error : EntityRef: expecting ';' <ENTRYREF href = "http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44&a ^ Entity: line 3: parser error : EntityRef: expecting ';' <ENTRYREF href = "http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44&a ^ added URI 'http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44=.asx' with title 'Radio 2' genre '(null)' added URI 'http://switch.streamgate.nl/cgi-bin/streamswitch?streamid=44=.asx' with title 'Radio 2' genre '(null)' We really shouldn't try parsing invalid 'xml' with libxml...
Yeah, ASX is a crappy 'XML' format.
Ubuntu bug about that: https://launchpad.net/products/totem/+bug/70706
Created attachment 79676 [details] first attempt to replace libxml - replace libxml with xml parsing functions taken from xine-lib - there is still one TODO - i did only run a limited test for .asx others are not tested like siml, quicktime, etc..
Good work, but I'd really rather it used GMarkup for the XML parsing. This would avoid code duplication. Moreover, xine-lib is GPL, whereas the totem playlist parser is LGPL.
Another thing the new parser should tolerate is tags case mismatch, like on the playlists on http://www.pixoff.net/ : Entity: line 8: parser error : Opening and ending tag mismatch: ASX line 1 and Asx </Asx> ^
I don't think you can use GMarkup either. The docs say: GMarkup is not guaranteed to signal an error on all invalid XML; the parser may accept documents that an XML parser would not. However, XML documents which are not well-formed[4] are not considered valid GMarkup documents. and also that "Only UTF-8 encoding is allowed.".
Patrick, if you can do the leg work to get this xine-lib code relicensed under the LPGL, I'd be ready to include it in totem.
I will politely ask the xine folk.
Created attachment 80493 [details] [review] totem-use-xine-xml-parser.patch - Use straight copies of the xine-lib code (patch below against xine-lib CVS HEAD) - Only make ASX use the xine-lib code (the other types are much more well behaved) - Made it case-unsensitive
Created attachment 80494 [details] [review] xine-lib-xml-parser-copy.patch Patch to xine-lib to avoid errors when compiling the parser out-of-tree
Created attachment 80495 [details] test.asx A test file with: - HTML comments - Case-mismatched tags - broken escaping of URLs
Created attachment 80497 [details] [review] totem-use-xine-xml-parser-2.patch Updated patch, xml_parser_get_property returns data directly from the tree, so don't free it. Mark more strings as const.
Created attachment 80498 [details] [review] xine-lib-xml-parser-copy-2.patch Mark the return of xml_parser_get_property as const, so people don't try to free it.
- i would rename all exported functions, as otherwise you may link to a function defined in xine-lib, in case you link with the xine-lib. - in case we would use those xml parsing functions for all playlists we could drop the libxml dependency for totem.
(In reply to comment #15) > - i would rename all exported functions, as otherwise you may > link to a function defined in xine-lib, in case you link > with the xine-lib. We should actually only export the functions that are public in libtotem-plparser.so. That's easy enough to fix. > - in case we would use those xml parsing functions for all > playlists we could drop the libxml dependency for totem. gnome-vfs relies on libxml, so does libglade, so it wouldn't be gone yet. I also trust libxml2 more than xine-lib's xml parser for proper XML files.
Created attachment 80558 [details] [review] totem-use-xine-xml-parser-3.patch - Restrict the exported symbols to the plparser public API
2007-03-04 Bastien Nocera <hadess@hadess.net> * src/plparse/test-parser.c: (main): Call g_thread_init() very early to avoid the GSlice warning * src/plparse/xmllexer.c: * src/plparse/xmllexer.h: * src/plparse/xmlparser.c: * src/plparse/xmlparser.h: Add a copy of the xine-lib XML parser under the LGPL * src/plparse/Makefile.am: * src/plparse/totem-pl-parser-wm.c: (parse_asx_entry), (parse_asx_entries), (totem_pl_parser_add_asx): * src/plparse/totem-pl-parser.c: Use the xine-lib XML parser copy, to allow: - case mismatches - HTML comments - broken escaping in URLs (Closes: #386558)
Mass-move from totem to totem-pl-parser. You can remove all messages by searching for this comment.