Bug 739679 – Extended float field in lotus wk4 spreadsheets

Summary:	Extended float field in lotus wk4 spreadsheets


Status:	RESOLVED FIXED

Product:	Gnumeric
Classification:	Applications
Component:	import/export other
Version:	git master
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Morten Welinder
QA Contact:	Jody Goldberg

Reported:	2014-11-05 17:49 UTC by Thomas Kluyver
Modified:	2014-11-06 19:05 UTC

Attachments
WIP patch for extended float data (2.47 KB, patch) 2014-11-05 19:20 UTC, Thomas Kluyver
Revised patch with Morten's changes (2.15 KB, patch) 2014-11-06 01:39 UTC, Thomas Kluyver

Description Thomas Kluyver 2014-11-05 17:49:01 UTC

I needed to get some data out of an old wk4 spreadsheet saved by Lotus 1-2-3. On examination, most of the numbers were stored with opcode 0x17, which Gnumeric doesn't currently handle. This turned out to store an extended-precision 80-bit floating point number.

The attached patch is the current state of my work on this. It loads most of the numbers in my spreadsheet correctly, but I'm hoping someone can help me with two, probably related, problems:

- By trial and error, I found that I needed a bitshift right 32 to get the correct numbers. The Python code I wrote while working this out works correctly without that, but when I tried to translate it to C, it went wrong. I'm not very good at C, so I may well be missing something obvious.
- Small numbers, <2.0, are wrong - numbers 1.0 <= n < 2.0 are negated, and numbers <1.0 appear to run into some kind of overflow and become huge. Again, my Python code doesn't have this issue.

For reference, here's my Python code:

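        # r.data: raw bytes of the record; the last 2 bytes hold sign + exponent
        a = int.from_bytes(r.data[-2:], 'little')
        sign = -1 if (a & (1<<15)) else 1
        # strip the sign bit, then remove the exponent bias
        e = a - (a & (1 << 15)) - 16383
        # 64-bit mantissa, after the first 4 bytes of the record
        m = int.from_bytes(r.data[4:-2], 'little')
        val = sign * m / (1<<(63-e))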

Comment 1 Morten Welinder 2014-11-05 18:09:19 UTC

Interesting.  I would guess that what you need is the x86 format
described here:

    http://en.wikipedia.org/wiki/Extended_precision

Can you get me the bit patterns for a few sample numbers?
pi is always a good choice if you can control it.

Comment 2 Morten Welinder 2014-11-05 18:10:02 UTC

And a sample file would be good too.

Comment 3 Morten Welinder 2014-11-05 18:14:03 UTC

See also...

http://blogs.perl.org/users/rurban/2012/09/reading-binary-floating-point-numbers-numbers-part2.html

Look for function cvt_num10_num8.

Comment 4 Thomas Kluyver 2014-11-05 19:20:40 UTC

Created attachment 290047
WIP patch for extended float data

Apparently the attachment didn't work - here's the patch.

I can't share the file, unfortunately (it contains private data), and I don't know how to create new ones without the relevant version of Lotus. But here are a couple of sample patterns:

b'\x00\x00\x00\x00\x00\x00\xb0\x8a\n@' == 2219.0
b'3333333\x8b\x04@' == 34.8

(Byte patterns in Python repr format)
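
To check the two sample patterns by hand, here is a minimal Python sketch. It assumes the 10 bytes are laid out as in the x86 extended format (8-byte little-endian mantissa followed by 2 bytes of sign and biased exponent); the function names are made up for illustration and are not part of any patch. It uses the same division as the Python snippet in the description, so it only works while the unbiased exponent is at most 63.

```python
def split_extfloat(raw):
    """Split 10 little-endian bytes into (sign, unbiased exponent, mantissa)."""
    if len(raw) != 10:
        raise ValueError("expected exactly 10 bytes")
    mant = int.from_bytes(raw[:8], 'little')     # 64-bit mantissa with explicit leading bit
    signexp = int.from_bytes(raw[8:], 'little')  # sign bit plus 15-bit biased exponent
    sign = -1 if signexp & 0x8000 else 1
    exp = (signexp & 0x7FFF) - 16383             # remove the bias
    return sign, exp, mant


def extfloat_value(raw):
    """Convert the fields to a Python float (assumes exp <= 63)."""
    sign, exp, mant = split_extfloat(raw)
    return sign * mant / (1 << (63 - exp))


if __name__ == "__main__":
    for raw in (b'\x00\x00\x00\x00\x00\x00\xb0\x8a\n@', b'3333333\x8b\x04@'):
        sign, exp, mant = split_extfloat(raw)
        print(sign, exp, hex(mant), extfloat_value(raw))
```

For the first pattern the fields come out as sign +1, exponent 11 and mantissa 0x8ab0000000000000, which gives 2219.0; the second has exponent 5.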

Comment 5 Morten Welinder 2014-11-05 23:14:30 UTC

I think your problem is

   1 << (63 - exp)

This is only well defined when the shift count is between 0 and 31 inclusive.

What you need here is ldexp, except that I would like you to use the
gnm_ldexp form.
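
As a rough illustration of why the shift misbehaves and ldexp does not, here is a Python sketch. The first function is only a hypothetical model: it assumes a 32-bit int and hardware that keeps just the low 5 bits of the shift count (as x86 does for 32-bit shifts). C itself leaves the result undefined, and the actual patch code is not shown in this report, so this is one plausible explanation rather than a diagnosis. Python's math.ldexp stands in for gnm_ldexp.

```python
import math


def shift32_model(count):
    """Hypothetical model of C's `1 << count` on a 32-bit int, with the
    shift count reduced modulo 32. Real C behaviour here is undefined."""
    value = (1 << (count & 31)) & 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as a signed integer.
    return value - (1 << 32) if value & 0x80000000 else value


def scale_with_ldexp(mant, exp):
    """Scale the 64-bit mantissa by 2**(exp - 63) without any integer shift."""
    return math.ldexp(mant, exp - 63)


if __name__ == "__main__":
    one = 1 << 63  # mantissa of 1.0 and of 0.5 (only the exponent differs)
    print(shift32_model(63 - 0))       # exponent 0: divisor becomes negative
    print(shift32_model(63 - (-1)))    # exponent -1: divisor collapses to 1
    print(scale_with_ldexp(one, 0), scale_with_ldexp(one, -1))
```

Under this model the divisor is -2147483648 for values in [1.0, 2.0) and 1 for values just below 1.0, which would match the negated and huge results described above, while the ldexp form handles any exponent the same way.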

Comment 6 Morten Welinder 2014-11-06 00:40:45 UTC

GnmValue *
lotus_extfloat (guint64 mant, guint16 signexp)
{
	int exp = (signexp & 0x7fff) - 16383;
	int sign = (signexp & 0x8000) ? -1 : 1;
	/* FIXME: Special values may indicate NaN, +/- inf */
	return lotus_value (sign * gnm_ldexp (mant, exp - 63));
}
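
The FIXME about special values could be handled along these lines. This is a Python sketch that assumes the x86 extended-precision conventions (exponent field all ones means infinity when the fraction bits are zero, NaN otherwise). Whether Lotus actually writes such patterns, or gives them another meaning, is not established in this report, and decode_extfloat is a made-up name.

```python
import math
import struct


def decode_extfloat(raw):
    """Decode 10 little-endian bytes as an x86-style 80-bit extended float."""
    mant, signexp = struct.unpack('<QH', raw)  # 8-byte mantissa, 2-byte sign/exponent
    sign = -1.0 if signexp & 0x8000 else 1.0
    biased = signexp & 0x7FFF
    if biased == 0x7FFF:
        # Exponent all ones: infinity if the 63 fraction bits are zero, else NaN.
        if mant & 0x7FFFFFFFFFFFFFFF == 0:
            return sign * math.inf
        return math.nan
    try:
        return sign * math.ldexp(mant, biased - 16383 - 63)
    except OverflowError:
        # Finite as an 80-bit value, but too large for a double.
        return sign * math.inf


if __name__ == "__main__":
    print(decode_extfloat(bytes(7) + b'\x80\xff\x7f'))  # +inf pattern
    print(decode_extfloat(bytes(7) + b'\xc0\xff\x7f'))  # a NaN pattern
    print(decode_extfloat(bytes(7) + b'\x80\xff\x3f'))  # 1.0
```

Zero and denormal inputs need no special case here: a zero mantissa yields 0.0, and anything with a zero exponent field is far below the range of a double and underflows to zero.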

Comment 7 Thomas Kluyver 2014-11-06 01:39:52 UTC

Created attachment 290064
Revised patch with Morten's changes

Thanks! With that change, all the values look right. I've incorporated it into this revised patch.

Comment 8 Morten Welinder 2014-11-06 15:19:08 UTC

This problem has been fixed in our software repository. The fix will go into the next software release. Thank you for your bug report.

Comment 9 Thomas Kluyver 2014-11-06 17:20:43 UTC

Thanks, Morten! :-)