GNOME Bugzilla – Bug 339192
saving imported xl5 as gnumeric then opening loses sheets
Last modified: 2011-06-01 00:52:43 UTC
Please describe the problem:
I have a spreadsheet in xl5 format (exported as such by applix), which I can open in Gnumeric with no problems. I can see all of the data and things look fairly good. So I save the file in gnumeric format (e.g. foo.gnumeric). However, opening foo.gnumeric tells me: "Unknown sheet 'XXX'", where XXX is the seventh of seven sheets. The data in XXX is no longer accessible. I have looked at the XML in an uncompressed foo.gnumeric and it looks like XXX is actually there, but it may be corrupted in some subtle manner and abandoned during load.

Does this happen every time?
Yes

Other information:
Unfortunately the spreadsheet contains proprietary information, but I hope to be able to assist with debugging.
Could you show the output of

  zcat < foo.gnumeric | grep SheetName

as well as

  zcat < foo.gnumeric | grep XXX

[And if you are redacting the output, please describe how, something like "all letters replaced by X".]
"ES [H]" is the real XXX...

zcat < foo.gnumeric | grep SheetName :
  41: <gnm:SheetNameIndex>
  42: <gnm:SheetName>M [A]</gnm:SheetName>
  43: <gnm:SheetName>SM [B]</gnm:SheetName>
  44: <gnm:SheetName>SB [C]</gnm:SheetName>
  45: <gnm:SheetName>DM [D]</gnm:SheetName>
  46: <gnm:SheetName>DB [E]</gnm:SheetName>
  47: <gnm:SheetName>NM [F]</gnm:SheetName>
  48: <gnm:SheetName>NB [G]</gnm:SheetName>
  49: <gnm:SheetName>ES [H]</gnm:SheetName>
  50: <gnm:SheetName>S [I]</gnm:SheetName>
  51: </gnm:SheetNameIndex>

zcat < foo.gnumeric | grep XXX :
  <gnm:SheetName>ES [H]</gnm:SheetName>
  <gnm:value>'ES [H]'!$E$6:$E$9</gnm:value>
  ... more similar value statements removed ...
  <gnm:Cell Col="52" Row="3" ValueType="60">ES [H]:</gnm:Cell>
  <gnm:Name>ES [H]</gnm:Name>
Hmm.. I'm unable to trigger anything like that. If you took a copy of this file and... 1. Removed all lines with <gnm:Cell ... </gnm:Cell> 2. Removed all <gnm:Styles> .. </gnm:Styles> groups ...does it still trigger the error? If so, can you show me the resulting file? I am guessing that the problem is related to a defined name.
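The reduction described above can be scripted rather than done by hand. The following is only a rough sketch: it assumes each cell sits on a single line (as in the grep output earlier in this report) and works on the raw text with regular expressions, since an XML parser may refuse the file. The function names `strip_cells_and_styles` and `reduce_file` are made up for illustration.

```python
import gzip
import re

# A cell that opens and closes on one line. The \b keeps <gnm:Cells> intact.
CELL_LINE = re.compile(r"^.*<gnm:Cell\b.*</gnm:Cell>.*\n?", re.MULTILINE)
# Style groups span many lines, so match non-greedily across newlines.
STYLES_GROUP = re.compile(r"<gnm:Styles>.*?</gnm:Styles>", re.DOTALL)


def strip_cells_and_styles(xml_text: str) -> str:
    """Drop every single-line cell and every <gnm:Styles> group."""
    without_cells = CELL_LINE.sub("", xml_text)
    return STYLES_GROUP.sub("", without_cells)


def reduce_file(src: str, dst: str) -> None:
    """Read a gzip-compressed .gnumeric file and write the reduced copy."""
    with gzip.open(src, "rt", encoding="utf-8", errors="replace") as f:
        text = f.read()
    with gzip.open(dst, "wt", encoding="utf-8") as f:
        f.write(strip_cells_and_styles(text))
```

Calling `reduce_file("foo.gnumeric", "foo-reduced.gnumeric")` would produce a copy that keeps the sheet names and defined names but none of the proprietary cell contents.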
It worked fine w/ all of the styles and cells removed, so I tried to chase down the culprit... Turns out that the "NM [F]" sheet contains the cell: <gnm:Cell Col="7" Row="15" ValueType="60">QQQ Fut FV</gnm:Cell> which gives Gnumeric major indigestion. In case it does not pass through this entry form, there are two ^H characters between QQQ and Fut... Which look suspicious to me... There are also numerous other cells that cause warning messages, which I can also relay if desired, but this one seems to terminate parsing... Tammo
To add a bit more info: These characters seem to be in the xl5 file (although not in the orig applix sheet that I can see, hence the applix export created them). While saving in .gnumeric format they were preserved -- but the .gnumeric loader is not robust enough to tolerate them.
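For anyone hunting similar cells in their own file, a scan for stray control characters might look like the sketch below. It is not part of Gnumeric; the helper names `find_control_chars` and `scan_gnumeric` are hypothetical, and it assumes the .gnumeric file is gzip-compressed UTF-8 text, as the zcat commands above suggest.

```python
import gzip
import re

# Control characters below 0x20 other than tab (0x09), LF (0x0A) and CR (0x0D).
# ^H is 0x08 and falls in the first range.
FORBIDDEN = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_control_chars(lines):
    """Return (line number, column, code point) for each forbidden character."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        for match in FORBIDDEN.finditer(line):
            hits.append((lineno, match.start(), ord(match.group())))
    return hits


def scan_gnumeric(path):
    """Scan a gzip-compressed .gnumeric file line by line."""
    # newline="\n" splits on LF only, so CR characters are left untouched.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace",
                   newline="\n") as f:
        return find_control_chars(f)
```

Each hit reports a code point of 8 for ^H, which makes it straightforward to locate the offending `<gnm:Cell>` line in the uncompressed XML.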
"... hence the applix _import_ created them ..." I assume. I am greeted with "internal error" if I feed Gnumeric a file with two ^Hs inside a string. Not good.
Well, I started the whole game with an applix sheet, wherein I used applix to "export" into xl5 format. This xl5 file is the first place where these nasty ^H-isms occur. So I did kind of mean _export_ :)
The "internal error" comes from libxml2, see bug 339311.
Note that libgsf has been patched to drop ^H. That is the best we can do without going to XML 1.1. ^H cannot occur in an XML 1.0 file without adding an encoding layer, see bug 339335.
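To illustrate what dropping such characters amounts to, here is a small sketch that removes everything outside the XML 1.0 `Char` production (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). It is not the libgsf patch, which is C code not shown in this report; the function name `drop_invalid_xml_chars` is made up for the example.

```python
import re

# Anything NOT matched by the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
INVALID_XML10 = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def drop_invalid_xml_chars(text: str) -> str:
    """Remove characters that cannot appear in an XML 1.0 document."""
    return INVALID_XML10.sub("", text)
```

Applied to a string with two ^H (0x08) characters in it, this silently removes them and leaves tabs, newlines and ordinary non-ASCII text alone. The information is lost, which is the trade-off mentioned above: XML 1.0 offers no way to carry these characters through.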
It looks to me that this is now fixed as far as Gnumeric is concerned.
Please reopen this report if I misread Comment #9 and this is not effectively fixed (by dropping the ^H).