GNOME Bugzilla – Bug 336858
Incorrectly reads beyond EOF
Last modified: 2006-04-08 03:15:30 UTC
Please describe the problem:
Gnumeric does not stop reading a BIFF8 file after reaching the last BOF. It reports multiple 'invalid DIMENSIONS record length 0' errors (evidently from reading the zero padding) and does not open the workbook.
Steps to reproduce:
1. Can submit a sample XLS file.
Actual results:
Gnumeric does not open the workbook.
Expected results:
The workbook opens normally.
Does this happen every time?
Yes.
Other information:
The file was not created in Excel, though Excel opens it normally. After that, the file can be opened in Gnumeric because it gets reformatted.
A sample would be nice.
You can find the sample at http://solutionsinhand.com/8975/_1.zip
Confirmed. When loading the file, I get (from BIFF_DEBUG) for the WRITEACCESS field:
    Opcode 0x5c length 112 malloced? 0
    Data:
     0 | 03 00 00 4d 43 4d 20 20 20 20 20 20 20 20 20 20 | ...MCM..........
    10 | 20 20 20 20 20 20 20 20 20 ce 6e 00 00 00 00 00 | ..........n.....
    20 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
    30 | 00 2a 6b 00 00 00 00 00 00 00 00 00 00 00 00 00 | .*k.............
    40 | 00 00 00 00 00 00 00 00 80 00 00 00 00 00 00 00 | ................
    50 | 20 00 00 00 00 00 00 00 60 2a 6b 00 00 00 00 00 | ........`*k.....
    60 | 69 6f 6e 00 c3 2a 00 00 a0 2b 6b 00 00 00 00 00 | ion..*...+k.....
We should have spaces everywhere past MCM. Next, the framework finds a 71 field with 0 length, but there is no such octet in the file. Looks like a gsf bug?
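For anyone wanting to repeat this check outside Gnumeric, here is a minimal sketch of the test implied above. It assumes the usual BIFF record framing (2-byte opcode, 2-byte length, both little-endian) and that the WRITEACCESS payload is a 2-byte character count, a 1-byte flags field, the 8-bit user name, then space padding, which is what the dump suggests. The function names are hypothetical and are not part of Gnumeric or libgsf.

```python
import struct

WRITEACCESS = 0x5C  # opcode shown in the BIFF_DEBUG dump


def iter_biff_records(stream: bytes):
    """Yield (opcode, payload) pairs from a raw BIFF stream.

    Assumes each record is a little-endian 2-byte opcode followed by a
    2-byte payload length.
    """
    pos = 0
    while pos + 4 <= len(stream):
        opcode, length = struct.unpack_from("<HH", stream, pos)
        payload = stream[pos + 4 : pos + 4 + length]
        if len(payload) < length:
            raise ValueError("record at offset %d is truncated" % pos)
        yield opcode, payload
        pos += 4 + length


def writeaccess_padding_ok(payload: bytes) -> bool:
    """Return True if everything after the user name is space padding.

    Assumes an 8-bit (compressed) name, as in the dump: 2-byte character
    count, 1-byte flags, then the name itself.
    """
    (nchars,) = struct.unpack_from("<H", payload, 0)
    name_end = 3 + nchars
    return all(byte == 0x20 for byte in payload[name_end:])
```

Run against the payload in the dump, the padding check fails at offset 0x19 (the `ce 6e` bytes), which is where the data stops looking like what the writer intended.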
I could track the problem to gsf: only the first 64 bytes are read. Using g_new0 instead of g_new when allocating the buffer gives 00 as the opcode past the first four. Setting dirent->use_sb to FALSE in ole_dirent_new makes things work, but is of course not a good fix. Reassigning to libgsf.
Created attachment 62590: a similar file which is correctly loaded.
I could not reproduce this with any file generated with Gnumeric, not even the attached file, which has the same size as _1.xls. I could not find a significant difference between the two files; at least the headers (first 512 bytes) are identical, yet this one is loaded correctly. Very strange. I can't understand why one is read and not the other.
Hmm, found the problem. In _1.xls, the short-sector allocation table begins with -2 instead of 1. So we only read the first 64 bytes. How was this file generated? It seems the bug comes from there...
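To illustrate why a leading -2 truncates the read, here is a minimal sketch of following an allocation-table chain. It assumes the usual OLE2 conventions: -2 (0xFFFFFFFE) marks the end of a chain and short sectors are 64 bytes. `follow_chain` is a hypothetical helper, not libgsf code.

```python
END_OF_CHAIN = 0xFFFFFFFE  # -2 when read as a signed 32-bit value
SHORT_SECTOR_SIZE = 64     # assumed short-sector size in bytes


def follow_chain(table, start):
    """Return the list of sector indices reachable from `start`.

    table[i] holds the index of the sector that follows sector i, or
    END_OF_CHAIN if sector i is the last one in the stream.
    """
    chain = []
    seen = set()
    block = start
    while block != END_OF_CHAIN:
        if block >= len(table) or block in seen:
            raise ValueError("corrupt allocation chain at entry %d" % block)
        seen.add(block)
        chain.append(block)
        block = table[block]
    return chain


# A table that starts with 1 links sector 0 to sector 1 and onwards.
expected = follow_chain([1, 2, 3, END_OF_CHAIN], 0)
# A table that starts with -2 ends the chain right after sector 0.
broken = follow_chain([END_OF_CHAIN, 2, 3, END_OF_CHAIN], 0)
print(len(expected) * SHORT_SECTOR_SIZE, len(broken) * SHORT_SECTOR_SIZE)
```

With the second table the reader sees a single short sector, so only 64 bytes of the stream are available, whatever size the directory entry claims.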
As long as the problem is still open, you may want to look at another example: http://solutionsinhand.com/8975/g.zip It shows the same problem, but the error is different:
    (gnumeric.exe:1636): libgsf:msole-CRITICAL **: ole_get_block: assertion `block < ole->info->max_block' failed
In the latter case it looks like the bat is padded with 0xffffffff entries.
It's certainly odd to have unused blocks in a metabat. We can work around it by just assuming that an unused metabat is just a reference to a bunch of unused blocks, but there is still some strangeness in the generated xls.
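A rough sketch of that workaround, for illustration only: when assembling the BAT, an unused metabat entry is expanded into a block's worth of unused entries instead of being passed to the block reader, where it would trip the `block < max_block` check. The `build_bat` function, its parameters and the `read_block` callback are hypothetical; this is not the actual libgsf patch.

```python
UNUSED = 0xFFFFFFFF        # free/unused marker (-1 as signed 32-bit)
END_OF_CHAIN = 0xFFFFFFFE  # end-of-chain marker (-2 as signed 32-bit)


def build_bat(metabat, read_block, entries_per_block, max_block):
    """Concatenate the BAT blocks listed in `metabat` into one table.

    read_block(n) is assumed to return the list of 32-bit entries stored
    in block n. Unused metabat entries contribute only unused BAT entries.
    """
    bat = []
    for block in metabat:
        if block == UNUSED:
            # Treat an unused metabat slot as a run of unused blocks.
            bat.extend([UNUSED] * entries_per_block)
        elif block >= max_block:
            raise ValueError("metabat refers to block %d beyond the file" % block)
        else:
            bat.extend(read_block(block))
    return bat
```

The padding entries never appear in a valid chain, so the extra unused slots are harmless as long as chain-following code rejects them.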
Hmm, looks minor. The EXTERNNAME records that were generated lacked definitions. If we correct for the truncated record, things work OK. With the patch from #9 we can read _1.xls too, although that seems mostly pointless: there's no content in there other than the initial BIFF records and lots of 0 padding. What generated these files?