GNOME Bugzilla – Bug 124930
Make xml probe more lenient, and display parse errors
Last modified: 2004-12-22 21:47:04 UTC
File-Open and select a just-saved .gnumeric file with the file type "Automatically detected". The workbook is drawn with a single worksheet (named after the file). The first cell shows .PLD+-. None of the twelve worksheets are found! The file opens OK if "Gnumeric XML file format" is chosen as the file type.
What version is this? And what libgsf version?
Please give us more information here. Without it there is no way to debug this.
Created attachment 20869: Gnumeric XML file that fails to open unless the type is deliberately set to "Gnumeric XML"
Sorry for the delay, I've had to pull the "old" environment out of the backup disk. The file was probably created around Jan '02 and variously modified. The last modification would have been with gnumeric-1.0.5-3 on an RH7.3 system before being transferred and accessed with gnumeric2 1.1.20-36 after the SuSE upgrade. File is attached, hope it helps. Gnumeric about box reports 1.1.20, installed from the DVD RPMs:- gnumeric2-1.1.20-36 libgsf-1.8.1-93
Hmm, I can see why it's acting strangely. The file contains some corrupt strings. Somewhere along the line the format strings were saved with bogus random content. libxml2 is refusing to load the file because it is invalid. I suspect that what you're really doing when you force the type is using the SAX-based importer, which lets you get slightly further. A couple of points:
1) You can fix the file manually by gunzipping it and editing the XML directly. Look for any string with [Red] in it. Those all seem to be followed by random crap. Delete that and you should have the file back.
2) Gnumeric should be a lot more lenient about probing, and check only the header, not the XML validity of the entire file.
3) Gnumeric should warn about why it is failing to read the XML.
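For illustration only, point 2 (probe on the header, not on the validity of the whole file) could look roughly like the Python sketch below. This is not Gnumeric's actual probe code. The function name `looks_like_gnumeric`, the 1024-byte limit, and the `gmr:Workbook` marker are all assumptions made for the example.

```python
import gzip
import io
import zlib

GZIP_MAGIC = b"\x1f\x8b"
HEADER_BYTES = 1024          # assumed: how much of the file the probe inspects
ROOT_MARKER = b"gmr:Workbook"  # assumed marker for the root element


def looks_like_gnumeric(data: bytes, limit: int = HEADER_BYTES) -> bool:
    """Guess the file type from the first `limit` bytes only.

    Nothing past the header is parsed, so corrupt strings deeper in
    the document cannot make the probe reject the file.
    """
    if data[:2] == GZIP_MAGIC:
        try:
            # Decompress just the prefix we need.
            with gzip.GzipFile(fileobj=io.BytesIO(data)) as fh:
                head = fh.read(limit)
        except (OSError, EOFError, zlib.error):
            return False
    else:
        head = data[:limit]
    return b"<?xml" in head and ROOT_MARKER in head
```

With a probe of this shape, a file whose header is intact is still recognized even when a strict XML parser would refuse the body; reporting what is wrong with the body then becomes the importer's job.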
*** Bug 124932 has been marked as a duplicate of this bug. ***
This is handled a lot more smoothly now. The probe routine is faster and smarter, so the file is recognized as Gnumeric. The parse routines have at least started displaying XML parse problems. It needs work but should be basically functional.
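As a rough sketch of what "displaying XML parse problems" can mean, the Python below turns a parser failure into a message with a line and column. It uses Python's bundled expat parser through `xml.etree.ElementTree`, not libxml2 or Gnumeric's own parse routines, and the helper name `describe_parse_failure` is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


def describe_parse_failure(xml_bytes: bytes):
    """Return a human-readable message if parsing fails, else None."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        line, column = err.position        # where the parser gave up
        reason = expat.ErrorString(err.code)  # parser's own description
        return f"XML parse error at line {line}, column {column}: {reason}"
    return None
```

A caller could show the returned string in an error dialog, so the user learns where the file is damaged instead of seeing a silently misdetected workbook.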