GNOME Bugzilla – Bug 649406
MS Works import of formulas
Last modified: 2011-05-07 09:41:57 UTC
When importing the file from bug #645940, then B1 contains ├ݯý¹ ß¹Ù¹§ ÔÕýÕÝ It probably should be Гномы былых времен or something like that.
Created attachment 187232 [details] sample ms works file
Created attachment 187233 [details] sample xls file THis should be the same file as the ms works file in xls format for comparison
Comparing the sample xls and wks files after import into Gnumeric shows the problem with the non-ascii content for example in cells C6 and D6.
Also the formula import in J6 to J17 is messed up.
Here's the bits: String at C6: 0 | 02 00 05 00 00 00 97 97 00 XX XX XX XX XX XX XX | .........******* String at D6: 0 | 03 00 05 00 03 00 97 97 00 XX XX XX XX XX XX XX | .........******* String at E6: 0 | 04 00 05 00 06 00 42 61 6c 61 6e 63 65 20 46 6f | ......Balance.Fo 10 | 72 77 61 72 64 00 XX XX XX XX XX XX XX XX XX XX | rward.********** Here's our guess at the first 6 bytes' meaning: int col = GSF_LE_GET_GUINT16 (r->data + 0); int row = GSF_LE_GET_GUINT16 (r->data + 2); int fmt = GSF_LE_GET_GUINT16 (r->data + 4);
The styles referred to ("fmt") are... Style 0: 0 | 06 08 11 11 04 00 00 00 bb bb XX XX XX XX XX XX | ..........****** Style 3: 0 | 05 08 00 00 04 00 00 00 00 00 XX XX XX XX XX XX | ..........****** Style 6: 0 | 05 02 11 11 04 00 00 00 bb bb XX XX XX XX XX XX | ..........******
All refer to the font: Font 4: 0 | 00 00 54 72 65 62 75 63 68 65 74 20 4d 53 00 09 | ..Trebuchet.MS.. 10 | 00 00 04 00 01 00 00 05 00 01 00 ff 01 54 0c 00 | .............T.. 20 | 00 00 20 00 14 00 XX XX XX XX XX XX XX XX XX XX | ......**********
In the string was interpreted as codepage 1252, things would be right. That won't work over in bug #645940, so we need to find a codepage somewhere.
We have codepage support, but currently a constant 1252.
I fear we somehow need to know that the font "Arial CYR" uses codepage 1251.
Charset issues fixed. That still leaves the messed up formulae.
Formula at J6: 0 | 09 00 05 00 27 00 ff ff ff ff ff ff f0 7f 23 00 | ....'.........#. 10 | 01 fe bf 01 80 06 00 0e 01 fe bf 02 80 06 00 0e | ................ 20 | 01 fe bf 03 80 06 00 0e 8d 03 01 fe bf 00 80 06 | ................ 30 | 00 3b 03 XX XX XX XX XX XX XX XX XX XX XX XX XX | .;.************* This parses as post-fix sequence... <some var> "" = <some var> "" = <some var> "" = 3-arg function 0x8d (should be AND; we have WEEKDAY) <some var> "" 3-arg function 0x3b (IF) What this means is that we need a new ordinal-to-function mapping for works. I am unable to wring one out of google.
My Lotus123 Release 2 manual from 1985 does not show any WEEKDAY or AND functions. My Lotus123 Release 5 manual from 1994 does not list any individual functions so they (or any one of them) may have been added later. I do note that our Lotus translation is not quite complete and should be updated. For example Gnumeric does have a SHEET function now. One way of fixing the functions issue would be to duplicate the existing translation and then modify it for MS Works as we encounter differences. I would suspect that MS Works used the function codes of some early release of Lotus123 and afterwards both products added functions for whom the code differed.
My Release 2 manual seems to list all the functions up to 0x79 except "APP". Perhaps we can assume that these functions match between LOTUS and WORKS and the functions beyond 0x79 don't.
If we had something that can write Works files, translating samples/excel/*.xls into wks files would make it possible to, mostly, create the table. Lotus surely had "AND", but note that the names we have are those names that _we_ use, not necessarily what Works use.
I have the original release 2 manual and at that time LOTUS clearly had no AND (which is never needed anyways since multiplication does the same thing)
I should have said that they clearly had no AND function. They did have an _operator_ "#AND#" (and #OR# and #NOT#) with the #AND# and #OR# at precedence 1 and #NOT# at precedence 2.
Created attachment 187337 [details] proposed patch THis patch allows the sample.wks file to be read correctly. It assumes that MS works uses the old lotus123 format with additional functions. Since we don't know the functions we are just using the original Lotus123 functions (from early releases) and OR and AND. I think it is better to raise an error than to switch function anmes.
Note, that LOTUS_VERSION_ORIG_123 == WORKS_VERSION_3. The make_function check therefore doesn't work in the way you expect.
We clearly should not have LOTUS_VERSION_ORIG_123 == WORKS_VERSION_3.
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.