After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 615777 - Gnumeric mangles dates from Excel .xlsx files
Gnumeric mangles dates from Excel .xlsx files
Status: RESOLVED FIXED
Product: Gnumeric
Classification: Applications
Component: import/export MS Excel (tm)
1.8.x
Other Linux
: Normal normal
: ---
Assigned To: Jody Goldberg
Jody Goldberg
Depends on:
Blocks:
 
 
Reported: 2010-04-14 19:33 UTC by Penelope Fudd
Modified: 2010-04-15 04:42 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Spreadsheet showing the problem (27.53 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2010-04-14 19:33 UTC, Penelope Fudd
Details

Description Penelope Fudd 2010-04-14 19:33:03 UTC
Created attachment 158750 [details]
Spreadsheet showing the problem

I've been given an excel file in .xlsx format, and some dates are off by about four years.

See the attached file: on worksheet 1 (Soil sample), cells D3:D28; the dates should be in 2007, but instead appear as 2003.

Oddly enough, the dates on worksheet 2 (DW-Extraction-Quantification), cells G2:I29 have the correct year.

Opening the file with Openoffice 3.1.1 gives the correct dates.

This is on a Fedora 11 system, which is kept up-to-date with yum nightly.  The rpm is gnumeric-1.8.4-5.fc11.i586.rpm.
Comment 1 Andreas J. Guelzow 2010-04-15 01:48:59 UTC
Are you sure that Openoffice gives the correct dates? 

With OOo 3.1.1 I see:

on "Soil Samples" D3: 22/10/2007
on "...Extraction..." G2: 11-Aug-11

while current Gnumeric yields:

on "Soil Samples" D3: 10-21-03
on "...Extraction..." G2: 21-Aug-07

In both sstems the serial numbers of those cells are 37915 and 39315 respectively. So the difference is the null date (ie. the date with serial number 0.)

Considering that the file really stores the serial numbers and I suspect that these dates are supposed to be in th e2007, Excel appears to use different null dates on different sheets (perhaps these sheets originated on different machines.)

OOo uses a null date for these files of 1994/1/1 while gnumeric uses a null date of 1899/12/31 with a skip of 1 after 1900/2/28.

Note that by default OOo uses a null date of 1899/12/30 but adjusts it for these files. Gnumeric's adjustment is not triggered.
Comment 2 Andreas J. Guelzow 2010-04-15 01:54:53 UTC
Note that rather than 11-Aug-11 it should have been 22-Aug-11.

If I open this file in Excel 2007 I get the same dates as OOo. Are those 2011 dates correct?
Comment 3 Andreas J. Guelzow 2010-04-15 03:56:55 UTC
To cite from the "Open Office XML" standard:

"  3.16.9.1 Date Representation
(...)
   There are two different bases for serial values:
          In the 1900 date base system, the lower limit is January 1, 1900, which has serial value 1. The upper-limit is December 31, 9999, which has serial value 2,958,465.
          In the 1904 date base system, the lower limit is January 1, 1904, which has serial value 0. The upper-limit is December 31, 9999, which has serial value 2,957,003.

   As to which date base system an implementation uses by default or whether it allows its users to switch between date base systems, is unspecified.
"

So unless there is some specification in the file of which date system is used (which I haven't found yet), this should be considered an "Open Office XML" bug.
Comment 4 Andreas J. Guelzow 2010-04-15 04:18:14 UTC
Okay: Part 4/3.2.28 of "Open Office XML" gives a workbook property:

date1904 (Date 1904) which defaults to FALSE. If it is set to TRUE we should interpret serial numbers relative to 1904/1/1.

We apparently are ignoring this setting for this document.
Comment 5 Andreas J. Guelzow 2010-04-15 04:42:38 UTC
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.