GNOME Bugzilla – Bug 314875
Implementing 123 import for modern versions
Last modified: 2005-10-15 22:18:01 UTC
Open any .123 file that contains multiple worksheets. Filenames on all worksheets other than the first are different than they were in 1-2-3.
With the supplied test file I see one sheet only. It has the name "A" and is empty.

# strings ~/test.123
"Arial Dick Harry foobar C:B1..C:B2 Dick Harry foobar Harlan Grove/ZI/USA/Zurich Harlan Grove/ZI/USA/Zurich 123 Property Doc Info Author Doc Info Comments Doc Info Editing Time Doc Info Last Revisor Doc Info Object Doc Info Revisions Count
cvs HEAD is quite broken, but that is easily fixable. It will not help with this file, though. The only specs we have, and the only ones that Google will admit to knowing, predate the version of 123 that made this file. I would guess that is what the OO guys had when they did their importer.

Two things are wrong:

1. There are lots of record types that we know nothing of. In fact, that covers most of the records in this file.
2. There is stuff behind the "EOF" marker that appears to be a different file format. If we are lucky it is only document metadata.

We could probably reverse engineer this given enough effort and a suitable corpus.
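For anyone who wants to see how much of a file is unknown, here is a rough sketch that tallies record types and returns whatever follows the EOF record. It assumes the classic Lotus worksheet framing (a 16-bit little-endian type, a 16-bit little-endian length, then the body) and an EOF type code of 0x0001. Neither is stated in this bug, so treat both as assumptions to check against real files. The names scan_records and EOF_TYPE are made up for illustration.

```python
import struct
from collections import Counter

# Assumption: EOF record type code; not confirmed anywhere in this bug.
EOF_TYPE = 0x0001


def scan_records(blob: bytes):
    """Tally record types and return (counts, bytes after the EOF record).

    Assumes each record is <type:u16 LE> <length:u16 LE> <body>.
    """
    counts = Counter()
    pos = 0
    while pos + 4 <= len(blob):
        rtype, rlen = struct.unpack_from("<HH", blob, pos)
        if pos + 4 + rlen > len(blob):
            # Truncated record: stop and leave it in the trailing data.
            break
        counts[rtype] += 1
        pos += 4 + rlen
        if rtype == EOF_TYPE:
            break
    return counts, blob[pos:]
```

Running this over a corpus and sorting the counts would show which unknown record types are common enough to be worth decoding first; the returned tail is the data behind the "EOF" marker.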
I have implemented some basics:

* Multiple sheets, with proper names.
* Strings. I even seem to have the right character set.
* Integers.

Notably absent at this point are formulas and floating point values. I am going to need a larger corpus for that. My current wishlist:

1a. A file with two entirely empty sheets.
1b. A file with three entirely empty sheets.
    [I would like to know of some header record that tells me the number of sheets before the data starts pouring in]
2. A file with the numbers -1; -2; -3; 0.5; -0.5; 1073741824; 0.33; 1e20 in cells B1 through B8 and corresponding texts in A1 through A8 for reference.

Harlan: if you don't have time, just ignore this and I will ask a suitable newsgroup after a while.

Interestingly, Google appears to understand more of the format than I do. Searching for "filetype:123 if" gives a few samples.
I believe I have the packed number record decoded. It is...

    s * m * 10^(se * e)

...where...

    s  is the sign bit stored as the (u & 0x20) bit.
    m  is the mantissa stored as (u >> 6)
    se is the exponent sign bit stored as the (u & 0x10) bit.
    e  is the base-10 exponent stored as (u & 0xf)

and u is the 32-bit little-endian data area in the record.

(And zero always seems to be stored as 0x00000014 -- go figure.)
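A minimal sketch of that decoding, for reference. It assumes that a set sign bit means "negative" for both the mantissa and the exponent, which is the usual reading of "sign bit" but is not spelled out above. The function name unpack_packed_number is made up for illustration and is not the name used in the importer.

```python
import struct


def unpack_packed_number(data: bytes) -> float:
    """Decode a packed number from its 4-byte little-endian data area.

    Assumption: a set bit in 0x20 / 0x10 means a negative
    mantissa / exponent respectively.
    """
    (u,) = struct.unpack("<I", data[:4])
    value = float(u >> 6)        # m: mantissa
    exponent = u & 0xF           # e: base-10 exponent
    if u & 0x10:                 # se: exponent sign bit
        value /= 10.0 ** exponent
    else:
        value *= 10.0 ** exponent
    if u & 0x20:                 # s: sign bit
        value = -value
    return value


# The odd-looking zero: mantissa 0, exponent sign set, exponent 4.
print(unpack_packed_number(struct.pack("<I", 0x00000014)))  # 0.0
# Mantissa 5, exponent sign set, exponent 1.
print(unpack_packed_number(struct.pack("<I", (5 << 6) | 0x10 | 1)))  # 0.5
```

Under this reading 0x00000014 is just 0 * 10^-4, so the zero encoding is consistent with the formula even if the choice of exponent is arbitrary.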
We now have formulas and the non-packed numbers, as well as (untested) various error constants. The test sheet appears to load perfectly. An indication of this is that there are no changes after hitting "recalc". We still do not have formats. It is a largish project although the format is understood. And we definitely do not have things like graphs yet.
We have...

1. Sheets with names.
2. Constants, formulas.
3. Number formats, cell colouring, font sizes.
4. Column width, row height.
5. Cell comments. [It was easy, :-]

We do not import graphs, and we do not import the wk4 format. I'm sure none of these are perfect, but it's a fair start. Some of this is in 1.6.0 and the rest will be in 1.6.1.