GNOME Bugzilla – Bug 141765
unexpected behaviour with gsf_xml_parser_context() and uncompressed files
Last modified: 2004-12-22 21:47:04 UTC
A developer reports unexpected behaviour when using libgsf with uncompressed XML files; see http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=111468

---forwarded bugzilla report begins---

From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.7 (X11; Linux ppc; U;) Gecko/20030130

Description of problem:
Libgsf's gsf_xml_parser_context() function seems broken when used to open an uncompressed XML file. I am working on an application that does the following:

1. initializes the library
2. creates a new gsf input entity using foo=gsf_input_gnomevfs_new()
3. calls gsf_xml_parser_context(foo)
4. does some stuff

The problem is that between steps 3 and 4, the gsf input entity foo's cur_offset is set to 10. This happens even though no data is directly read by the application. Since cur_offset is used to calculate the amount of unread data, gsf_input_remaining returns an incorrect value and reading gets mixed up.

gsf-libxml.c:gsf_xml_parser_context looks like this:

    gsf_xml_parser_context (GsfInput *input)
    {
        GsfInputGZip *gzip;

        g_return_val_if_fail (GSF_IS_INPUT (input), NULL);

        gzip = gsf_input_gzip_new (input, NULL);
        if (gzip != NULL)
            input = GSF_INPUT (gzip);
        else {
            gsf_input_seek(input, 0, G_SEEK_SET);
            g_object_ref (G_OBJECT (input));
        }

        return xmlCreateIOParserCtxt (
            NULL, NULL,
            (xmlInputReadCallback) gsf_libxml_read,
            (xmlInputCloseCallback) gsf_libxml_close,
            input, XML_CHAR_ENCODING_NONE);
    }

It appears that "gzip = gsf_input_gzip_new (input, NULL);" is what causes the mixed-up cur_offset value. This makes sense because gsf_input_gzip_new would have to read a few bytes from input in order to determine whether the file is valid gzip'ed data or not. So...

    gsf_xml_parser_context (GsfInput *input)
    {
        GsfInputGZip *gzip;

        g_return_val_if_fail (GSF_IS_INPUT (input), NULL);

        return xmlCreateIOParserCtxt (
            NULL, NULL,
            (xmlInputReadCallback) gsf_libxml_read,
            (xmlInputCloseCallback) gsf_libxml_close,
            input, XML_CHAR_ENCODING_NONE);
    }

...works fine (but only with uncompressed files, of course). Either gsf_input_seek() should be used if gsf_input_gzip_new() fails, or gsf_input_gzip_new() should reset the file cursor before failing.

Version-Release number of selected component (if applicable):
1.8.2

How reproducible:
Always

Steps to Reproduce:
1. Load an application that uses gsf_xml_parser_context(foo) into your favorite debugger.
2. Notice that foo->cur_offset is set to 10d after gsf_xml_parser_context is executed on an uncompressed file.

Additional info:
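
The cursor behaviour described in the report can be illustrated with a small standalone sketch. It does not use libgsf: MockInput, mock_read, mock_remaining and the two probe functions are hypothetical stand-ins invented for this example. The 10-byte read is an assumption based on the fixed part of a gzip header being 10 bytes, which would match the reported cur_offset; whether libgsf reads exactly that much is not confirmed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a seekable input; NOT libgsf's GsfInput. */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t cur_offset;
} MockInput;

/* Assumption: the probe reads the fixed 10-byte part of a gzip header. */
#define GZIP_HEADER_LEN 10

/* Bytes left to read, derived from cur_offset as in the report. */
size_t mock_remaining(const MockInput *in)
{
    assert(in != NULL);
    return in->size - in->cur_offset;
}

/* Read n bytes and advance the cursor; NULL if fewer than n remain. */
const unsigned char *mock_read(MockInput *in, size_t n)
{
    const unsigned char *p;

    assert(in != NULL);
    if (n > mock_remaining(in))
        return NULL;
    p = in->data + in->cur_offset;
    in->cur_offset += n;
    return p;
}

/* Probe that does not restore the cursor: on non-gzip data the
 * caller is left positioned after the bytes that were inspected. */
int probe_gzip_leaky(MockInput *in)
{
    const unsigned char *h = mock_read(in, GZIP_HEADER_LEN);
    return h != NULL && h[0] == 0x1f && h[1] == 0x8b;
}

/* Probe that puts the cursor back where it was when the data
 * turns out not to be gzip, as the report suggests. */
int probe_gzip_restoring(MockInput *in)
{
    size_t start = in->cur_offset;
    const unsigned char *h = mock_read(in, GZIP_HEADER_LEN);

    if (h != NULL && h[0] == 0x1f && h[1] == 0x8b)
        return 1;
    in->cur_offset = start;
    return 0;
}
```

Running both probes on a plain XML buffer leaves the cursor at 10 in the first case and at its starting position in the second, so only the second reports the full length as remaining.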
As far as I can see, this was fixed a while ago in input-gzip: init_zip seeks back to the start on failure. I suspect the actual problem was different. The gnomevfs backend was broken until recently: it had reversed the sense of the return value for seek, and pretended to fail when it succeeded. I'm going to do a 1.9.0 release later this week with a patch.
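
To make the "reversed sense" failure mode concrete, here is a minimal standalone sketch. It is not the gnomevfs backend code: MockStream, seek_ok, seek_inverted and rewind_stream are hypothetical names, and the convention used (0 means success) is an assumption chosen for the example rather than libgsf's documented one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stream with just a size and a cursor. */
typedef struct {
    size_t size;
    size_t cur_offset;
} MockStream;

/* Assumed convention for this sketch: 0 = success, nonzero = failure. */
int seek_ok(MockStream *s, size_t pos)
{
    assert(s != NULL);
    if (pos > s->size)
        return 1;
    s->cur_offset = pos;
    return 0;
}

/* Same seek with the result inverted: the cursor still moves,
 * but the caller is told the opposite of what happened. */
int seek_inverted(MockStream *s, size_t pos)
{
    return !seek_ok(s, pos);
}

typedef int (*SeekFn)(MockStream *, size_t);

/* A caller that trusts the return value when rewinding.
 * Returns 1 if it believes the rewind worked, 0 otherwise. */
int rewind_stream(MockStream *s, SeekFn seek)
{
    return seek(s, 0) == 0;
}
```

With seek_inverted, rewind_stream reports failure even though the cursor is back at 0, so a caller that checks the result would take its error path on a perfectly good stream.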