GNOME Bugzilla – Bug 392054
Crash on import of jworkbook file
Last modified: 2007-01-13 02:00:58 UTC
The attached file, made with jworkbook (a Java library that writes .gnumeric files), crashes Gnumeric. The file and a gdb backtrace are attached.
1) Start a new Gnumeric
2) Open the file => crash
--adrian
Created attachment 79216 [details] A simple .gnumeric file made with jworkbook from java
Created attachment 79217 [details] gdb session log and backtrace
Crash fixed. The file is quite bogus and we still get these:
** (gnumeric:21565): WARNING **: Converted xml document with no explicit encoding from transliterated UTF-8 to UTF-8.
** (gnumeric:21565): WARNING **: too lazy to support nested unshared content for now. We'll add it for 2.0
** (gnumeric:21565): CRITICAL **: xml_sax_cell_content: assertion `col >= 0' failed
** (gnumeric:21565): WARNING **: too lazy to support nested unshared content for now. We'll add it for 2.0
** (gnumeric:21565): CRITICAL **: xml_sax_cell_content: assertion `col >= 0' failed
** (gnumeric:21565): WARNING **: too lazy to support nested unshared content for now. We'll add it for 2.0
** (gnumeric:21565): CRITICAL **: xml_sax_cell_content: assertion `col >= 0' failed
Thanks for the fix. What does 'quite bogus' mean? Is the file invalid v7 output? What is 'transliterated UTF-8'? Does it have something to do with a difference between Java and Unix in the endian ordering of output? Any idea what the minimum effort would be to get these files to work again? Thanks, --adrian
The problem is that there is no explicit encoding in the file. Newer Gnumerics write the encoding ("UTF-8") out. Older Gnumerics, and whatever wrote this file, did not. When they wrote a non-ASCII character they wrote (if memory serves) essentially the byte sequence of whatever locale was in effect, as character references like &#byte1;&#byte2;... However, now that I look closer, I see that there aren't any such sequences in this particular file. The errors above come from libgsf, which doesn't seem to be up to the task of importing this class of files. I'm not sure why.
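For anyone generating these files, a quick way to check for the missing declaration is to look at the XML prolog. The following is only a sketch, not anything from Gnumeric itself: the function name declared_encoding is made up for illustration, and it assumes the file is either plain XML or gzip-compressed XML.

```python
import gzip
import re

GZIP_MAGIC = b"\x1f\x8b"

# Captures the encoding pseudo-attribute of a leading XML declaration.
# [^>]* cannot run past the "?>" that closes the declaration.
ENCODING_RE = re.compile(
    rb"""^<\?xml[^>]*\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None if absent.

    Hypothetical helper: accepts the raw bytes of a .gnumeric file,
    assumed to be either plain XML or gzip-compressed XML.
    """
    if data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    if data.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        data = data[3:]
    match = ENCODING_RE.match(data)
    return match.group(1).decode("ascii") if match else None
```

A result of None means the file carries no explicit encoding, which is the situation described above.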
One thing is wrong with the file, though: it is missing a sheet name index like this:

  <gnm:SheetNameIndex>
    <gnm:SheetName>Run Info</gnm:SheetName>
  </gnm:SheetNameIndex>

v7 was at Gnumeric 0.66, which stated:

  /* The sheet name index is required for the xml_sax
   * importer to work correctly. We don't use it for
   * the dom loader! These must be written BEFORE
   * the named expressions.
   */
Apart from the error resulting from the missing gnm:SheetNameIndex (causing an empty sheet to appear) and the barfing from libgsf, this is now fixed. Fixed in the development version. The fix will be available in the next major release. Thank you for your bug report.
My lazy attempt to copy a <gmr:SheetNameIndex /> structure into the file does not solve the problem. In one of my files (a small one with a single sheet) the file opens but the page is blank. In the larger file, gnumeric now pops up a dialog saying that the index is inconsistent---not sure why since it looks good to me.
The <gmr:SheetNameIndex> section needs to list all sheets in the file. If not, you get the "inconsistent" error.

The page isn't blank, btw. It just looks that way:
1. The file turns off the grid explicitly.
2. We end up seeing an extra newline before the strings. (That's what the libgsf complaint is about.)
Go to the cells and see the text.
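The "index must list all sheets" rule can be checked on the generator side before handing a file to Gnumeric. This is a rough sketch, not Gnumeric's own check. It assumes each sheet is a Sheet element whose direct Name child holds the sheet name, and it matches on local element names so it does not care whether the prefix is gmr or gnm. The function name sheet_index_problems is invented for this example.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def sheet_index_problems(xml_bytes):
    """Return (missing, extra) sheet names for an uncompressed workbook.

    missing: sheets present in the file but not listed in SheetNameIndex.
    extra:   names listed in SheetNameIndex that have no matching sheet.
    Assumed layout: SheetNameIndex/SheetName entries, and Sheet elements
    with a direct Name child.
    """
    root = ET.fromstring(xml_bytes)
    indexed, actual = [], []
    for elem in root.iter():
        name = local(elem.tag)
        if name == "SheetNameIndex":
            indexed = [c.text or "" for c in elem if local(c.tag) == "SheetName"]
        elif name == "Sheet":
            actual += [c.text or "" for c in elem if local(c.tag) == "Name"]
    missing = [n for n in actual if n not in indexed]
    extra = [n for n in indexed if n not in actual]
    return missing, extra
```

Both lists being empty suggests the index and the sheets agree. Anything in missing is a likely cause of the "inconsistent" dialog.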
libgsf part fixed.
Created attachment 80162 [details] A revised gnumeric file
A gnumeric file to assess the validity of the output from the updated jworkbook.
So the output now contains an encoding definition and a SheetNameIndex. Gnumeric Head seems to open it without complaints. It also opens the other files correctly. Is there a DTD to see if the files as produced are correct? I'd like to be sure that these files will continue to work with future versions of Gnumeric. --adrian