After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 649406 - MS Works import of formulas
MS Works import of formulas
Status: RESOLVED FIXED
Product: Gnumeric
Classification: Applications
Component: import/export other
git master
Other All
: Normal normal
: ---
Assigned To: Morten Welinder
Jody Goldberg
Depends on:
Blocks:
 
 
Reported: 2011-05-04 19:34 UTC by Andreas J. Guelzow
Modified: 2011-05-07 09:41 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
sample ms works file (563.33 KB, application/vnd.lotus-1-2-3)
2011-05-04 19:43 UTC, Andreas J. Guelzow
Details
sample xls file (326.00 KB, application/vnd.ms-excel)
2011-05-04 19:45 UTC, Andreas J. Guelzow
Details
proposed patch (10.11 KB, text/x-c)
2011-05-06 04:23 UTC, Andreas J. Guelzow
Details

Description Andreas J. Guelzow 2011-05-04 19:34:25 UTC
When importing the file from bug #645940, then B1 contains
├ݯý¹ ß¹Ù¹§ Ô­ÕýÕÝ
It probably should be
Гномы былых времен
or something like that.
Comment 1 Andreas J. Guelzow 2011-05-04 19:43:55 UTC
Created attachment 187232 [details]
sample ms works file
Comment 2 Andreas J. Guelzow 2011-05-04 19:45:05 UTC
Created attachment 187233 [details]
sample xls file

THis should be the same file as the ms works file in xls format for comparison
Comment 3 Andreas J. Guelzow 2011-05-04 19:46:27 UTC
Comparing the sample xls and wks files after import into Gnumeric shows the problem with the non-ascii content for example in cells C6 and D6.
Comment 4 Andreas J. Guelzow 2011-05-04 19:50:28 UTC
Also the formula import in J6 to J17 is messed up.
Comment 5 Morten Welinder 2011-05-04 20:02:04 UTC
Here's the bits:

String at C6:
       0 | 02 00 05 00 00 00 97 97 00 XX XX XX XX XX XX XX | .........*******
String at D6:
       0 | 03 00 05 00 03 00 97 97 00 XX XX XX XX XX XX XX | .........*******
String at E6:
       0 | 04 00 05 00 06 00 42 61 6c 61 6e 63 65 20 46 6f | ......Balance.Fo
      10 | 72 77 61 72 64 00 XX XX XX XX XX XX XX XX XX XX | rward.**********

Here's our guess at the first 6 bytes' meaning:

			int col = GSF_LE_GET_GUINT16 (r->data + 0);
			int row = GSF_LE_GET_GUINT16 (r->data + 2);
			int fmt = GSF_LE_GET_GUINT16 (r->data + 4);
Comment 6 Morten Welinder 2011-05-04 20:06:40 UTC
The styles referred to ("fmt") are...

Style 0:
       0 | 06 08 11 11 04 00 00 00 bb bb XX XX XX XX XX XX | ..........******
Style 3:
       0 | 05 08 00 00 04 00 00 00 00 00 XX XX XX XX XX XX | ..........******
Style 6:
       0 | 05 02 11 11 04 00 00 00 bb bb XX XX XX XX XX XX | ..........******
Comment 7 Morten Welinder 2011-05-04 20:15:58 UTC
All refer to the font:

Font 4:
       0 | 00 00 54 72 65 62 75 63 68 65 74 20 4d 53 00 09 | ..Trebuchet.MS..
      10 | 00 00 04 00 01 00 00 05 00 01 00 ff 01 54 0c 00 | .............T..
      20 | 00 00 20 00 14 00 XX XX XX XX XX XX XX XX XX XX | ......**********
Comment 8 Morten Welinder 2011-05-04 20:23:47 UTC
In the string was interpreted as codepage 1252, things would be right.
That won't work over in bug #645940, so we need to find a codepage
somewhere.
Comment 9 Morten Welinder 2011-05-04 20:40:20 UTC
We have codepage support, but currently a constant 1252.
Comment 10 Morten Welinder 2011-05-04 20:50:34 UTC
I fear we somehow need to know that the font "Arial CYR" uses codepage 1251.
Comment 11 Morten Welinder 2011-05-05 13:32:10 UTC
Charset issues fixed.  That still leaves the messed up formulae.
Comment 12 Morten Welinder 2011-05-05 15:41:05 UTC
Formula at J6:
       0 | 09 00 05 00 27 00 ff ff ff ff ff ff f0 7f 23 00 | ....'.........#.
      10 | 01 fe bf 01 80 06 00 0e 01 fe bf 02 80 06 00 0e | ................
      20 | 01 fe bf 03 80 06 00 0e 8d 03 01 fe bf 00 80 06 | ................
      30 | 00 3b 03 XX XX XX XX XX XX XX XX XX XX XX XX XX | .;.*************

This parses as post-fix sequence...

<some var>
""
=
<some var>
""
=
<some var>
""
=
3-arg function 0x8d (should be AND; we have WEEKDAY)
<some var>
""
3-arg function 0x3b (IF)


What this means is that we need a new ordinal-to-function mapping for works.
I am unable to wring one out of google.
Comment 13 Andreas J. Guelzow 2011-05-05 17:03:31 UTC
My Lotus123 Release 2 manual from 1985 does not show any WEEKDAY or AND functions. My Lotus123 Release 5 manual from 1994 does not list any individual functions so they (or any one of them) may have been added later.

I do note that our Lotus translation is not quite complete and should be updated. For example Gnumeric does have a SHEET function now.

One way of fixing the functions issue would be to duplicate the existing translation and then modify it for MS Works as we encounter differences. I would suspect that MS Works used the function codes of some early release of Lotus123 and afterwards both products added functions for whom the code differed.
Comment 14 Andreas J. Guelzow 2011-05-05 17:18:34 UTC
My Release 2 manual seems to list all the functions up to 0x79 except "APP". Perhaps we can assume that these functions match between LOTUS and WORKS and the functions beyond 0x79 don't.
Comment 15 Morten Welinder 2011-05-05 17:39:53 UTC
If we had something that can write Works files, translating
samples/excel/*.xls into wks files would make it possible to, mostly,
create the table.

Lotus surely had "AND", but note that the names we have are those
names that _we_ use, not necessarily what Works use.
Comment 16 Andreas J. Guelzow 2011-05-05 17:44:55 UTC
I have the original release 2 manual and at that time LOTUS clearly had no AND (which is never needed anyways since multiplication does the same thing)
Comment 17 Andreas J. Guelzow 2011-05-05 17:49:45 UTC
I should have said that they clearly had no AND function. They did have an _operator_ "#AND#" (and #OR# and #NOT#) with the #AND# and #OR# at precedence 1 and #NOT# at precedence 2.
Comment 18 Andreas J. Guelzow 2011-05-06 04:23:25 UTC
Created attachment 187337 [details]
proposed patch

THis patch allows the sample.wks file to be read correctly. It assumes that MS works uses the old lotus123 format with additional functions.

Since we don't know the functions we are just using the original Lotus123 functions (from early releases) and OR and AND. 

I think it is better to raise an error than to switch function anmes.
Comment 19 Morten Welinder 2011-05-06 15:19:58 UTC
Note, that LOTUS_VERSION_ORIG_123 == WORKS_VERSION_3.
The make_function check therefore doesn't work in the way you expect.
Comment 20 Andreas J. Guelzow 2011-05-06 16:17:54 UTC
We clearly should not have LOTUS_VERSION_ORIG_123 == WORKS_VERSION_3.
Comment 21 Andreas J. Guelzow 2011-05-07 09:41:57 UTC
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.