After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 128936 - Text Export should permit selection of number, cell, etc formats
Text Export should permit selection of number, cell, etc formats
Status: RESOLVED FIXED
Product: Gnumeric
Classification: Applications
Component: import/export Text
git master
Other All
: Normal enhancement
: ---
Assigned To: Jody Goldberg
Jody Goldberg
: 129398 133991 152203 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2003-12-09 23:13 UTC by Stephen Waters
Modified: 2005-04-05 01:35 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gnumeric format original version (1.58 KB, application/x-gnumeric)
2003-12-09 23:15 UTC, Stephen Waters
Details
CSV result of date problem (17 bytes, text/plain)
2003-12-09 23:18 UTC, Stephen Waters
Details

Description Stephen Waters 2003-12-09 23:13:17 UTC
1. Enter a valid date in a cell
2. Change the format to Date (mine is mm/dd/yyyy)
3. Save As CSV (Comma Separated Values)
4. Open file in a text editor
5. Voila, you get an "answer" instead of a Date

Gnumeric should export Date format cells as strings, not mathematical
equations.

For "General" cells, perhaps the current behavior is acceptable... I don't
know what Excel does.

Thanks for all the hard work guys,
-s
Comment 1 Stephen Waters 2003-12-09 23:15:28 UTC
Created attachment 22278 [details]
gnumeric format original version
Comment 2 Stephen Waters 2003-12-09 23:18:18 UTC
Created attachment 22279 [details]
CSV result of date problem
Comment 3 Andreas J. Guelzow 2003-12-10 01:11:42 UTC
If your date is 12/12/2003 your `answer' is 37967 which is not the
result of calculating that expression but simply the normal way to
store that day (number of days since January 1, 1900).
Comment 4 Andreas J. Guelzow 2003-12-10 02:29:34 UTC
I don't think this is a bug, since what are the options? We could
export  the files using the displayed (formatted) strings, but hten we
run into the problem that we may loose information (eg. a date
formatted as mm/yyyy would lose the day, numbers may loose precision).

Perhaps this is really an enhancement request that users should be
able to choose to export the formatted strings rather than the raw
numbers.
Comment 5 Stephen Waters 2003-12-10 15:20:29 UTC
Well, the main problems with the current behavior are:

  1. Invalid dates, such as "08/00/2004" are, obviously, left as
strings. Yes, I have bad data like that I have to deal with. It *SUCKS*.

  2. It breaks reimportation of CSV data. "08/04/2003" is more likely
to be understood by a parser as a date without user intervention.

That's my impression, anyway. Thanks for replying so promptly!
Comment 6 Stephen Waters 2003-12-10 15:56:26 UTC
I just noticed that it does the conversion, even if you're using .txt
export! Ouch! Even if that's the default for CSV, please allow the
configurable Text Export to export dates as string.

Thank you!
Comment 7 Stephen Waters 2003-12-10 16:22:00 UTC
Note that OOo's behavior is to export dates as strings. Just tried it,
so I have a temporary solution -- but I still like Gnumeric more than
OOo. ;)
Comment 8 Morten Welinder 2003-12-10 19:13:43 UTC
I imagine we could export as 10-Dec-2003 for any date format.  I wonder
what OO does, precisely.

I don't see exporting as 12/10/2003 as an option.  That is bound to be
misinterpreted.  =DATE(2003,12,10) might work too.

For non-integers (or large values) for which such a date format would
not produce exact results, we might need to use the current number
format.
Comment 9 Stephen Waters 2003-12-10 19:19:34 UTC
I just meant that Gnumeric should export the date "as entered" unless
told to do otherwise. Perhaps in the text export screen, a dropdown
with "date export format" would be helpful.
Comment 10 Andreas J. Guelzow 2003-12-11 18:34:20 UTC
"As entered" can cause lots of problems:

Imagine I am running gnumeric in locale en_CA and enter 22/1/1959.
I'll save the file and later open it in locale C to enter more data,
let's say 1/22/1959.

When I then export, sould the exported file really contain both
22/1/1959 and 1/22/1959?  (Of course practically that doesn't work
anyways since we don't remember what was typed in since the parsing
depends on things like locale.)

Personally I would like the possibility of selecting the output format
as we do when we import.
Comment 11 Jody Goldberg 2003-12-15 06:55:30 UTC
Andreas : agreed.
Selecting export formats would be a nice enhancement.  In the interim
I don't see alot of choices other than what we're doing now.
Comment 12 Morten Welinder 2003-12-15 20:19:29 UTC
*** Bug 129398 has been marked as a duplicate of this bug. ***
Comment 13 David Mansfield 2003-12-15 20:49:18 UTC
i just posted (accidentally) a duplicate (oops, bugzilla doesn't like
plurals 'date' != 'dates') and I have one thing to add.

Excel exports dates as 'mm/dd/yyyy'.  I think Gnumeric should do this
too.  Pehaps the locale could be consulted instead.  But the current
behaviour is USELESS.  Oracle, for example, doesn't have a date
conversion for this 'days since 1899' crap AFAIK, which makes using
the data very difficult for real uses.
Comment 14 Andreas J. Guelzow 2003-12-21 07:19:16 UTC
I have added a checkbutton tot eh export dialog that when selected
causes the export to use the rendered strings as shown on screen. This
gives the user the option to export in the format of their choice. It
would still be nice to be able to select the formats directly in the
export druid so I am keeping this open as an enhancement.
Comment 15 Morten Welinder 2004-02-10 15:53:21 UTC
*** Bug 133991 has been marked as a duplicate of this bug. ***
Comment 16 Morten Welinder 2004-09-09 01:11:24 UTC
*** Bug 152203 has been marked as a duplicate of this bug. ***
Comment 17 Luke Hutchison 2004-09-09 06:12:47 UTC
I reported the latest dup of this bug.

I think if any format were to be chosen as a "default" for text export, it
should be the ISO standard of yyyy/mm/dd.  The reason for this is that it is
impossible to confuse the month and day in this format, because *nobody* uses
yyyy/dd/mm.  We can thank America [ :o) ] for the idiosyncratic mm/dd/yyyy
(which doesn't make any lexicographic sense at all -- only dd/mm/yyyy and
yyyy/mm/dd really are logical).  Plus the ISO standard keeps the month and day
in the same order as the US ordering of mm/dd, meaning that America can continue
to be blissfully unaware of why mm/dd/yyyy is illogical, as long as they can get
used to seeing the year written first!  The ISO date system also specifies that
year will always use 4 digits, preventing any Y3k problems.  The final, and most
important reason why ISO dates make sense is that they sort into chronological
order using a standard lexicographic (dictionary) sort ordering.  That means
that if your first field is a date, you can use emacs or commandline 'sort' or
some other tool to sort the lines of your CSV file, and they will actually end
up in the correct chronological order.  No other date system works like that.

However, having said that, Andreas mentioned in my dup that you can just check
"preserve formats" in Text Export, and dates are written correctly.  I think
that that may be a useful option to be able to *disable*, but I don't think it
makes sense to have this disabled by default -- especially for CSV, which has a
set of sensible and fairly-standard defaults already so that you don't have to
set any options in the Text Export dialog.

It would be nice if Save As CSV, and perhaps also Text Export, preserved formats
by default -- just because text-only formats are typically used to store values
and not formulae, so they are more presentation-oriented.
Comment 18 Andreas J. Guelzow 2004-09-09 12:50:49 UTC
There is a reason for defaulting to not preserving formats: Preserving the
formats used on the sheet may cause you to loose information. For example if the
display format is mm/dd then you will loose the year info.

Comment 19 Stephen Waters 2004-09-09 14:04:54 UTC
Not to start a flame war, but the American style of numeric dates evolved
directly from the textual "December Tenth, Nineteen Forty-one" -> "Dec 10, 1941"
-> "12/10/1941". So it is perfectly logical in a linguistic context.

Furthermore, many databases, such as Informix, default to MM/DD/YYYY. This is
why I need data to work in that format. No, I can't go around changing 20 years
of proprietary code to "fix" the "problem", much as I'd like to. (Actually, I'd
like to dump the system altogether, but that's even more off-topic...)
Comment 20 Morten Welinder 2004-09-09 14:09:20 UTC
There is a reason why the rest of the world says "the 10th [day] of December, 1941".

Anyway, a default format of mm/dd/yyyy is plain and simple not an option.  That
means immediate data loss for the dd/mm/yyyy people.  And vice versa.  If you
need either, specify it.
Comment 21 Stephen Waters 2004-09-09 14:16:47 UTC
Right, and the problem is that as of today, you can't specify that. :)
Comment 22 Luke Hutchison 2004-09-09 14:29:58 UTC
Hmm, question -- what does Informix do if it gets data in yyyy/mm/dd format?  I
just tested Gnumeric, and it handles it fine (i.e. automatically parses it
correctly as a date).  I would guess that Excel does too.  If Informix has had a
release in the last 5 years that increased standardization of things like dates,
it probably handles it too.  As I pointed out, the ISO standard is always
unambiguous, and it is very easy to detect that you have data in ISO standard
date format.
Comment 23 Stephen Waters 2004-09-09 15:42:07 UTC
Our ancient version of Informix has to be told via environment variable DBDATE
what date format is being used. This gets incredibly hairy when doing Informix
-> CSV -> Spreadsheet -> CSV -> Informix when you have bad data as well as
conversion issues. This is why I would prefer a nice text export for mm/dd/yyyy
instead of days since 1900.
Comment 24 Andreas J. Guelzow 2004-09-09 18:13:19 UTC
Stephen,

What do you mean with "as of today, you can't specify that.".

Format the date on the sheet in the format of your choice, then export using the
configurable export and check `preserve formats', and you get what you want.
Comment 25 Stephen Waters 2004-09-09 18:34:21 UTC
I just confirmed that this works. Ticket can be closed. Thanks everyone. :)
Comment 26 Andreas J. Guelzow 2004-09-09 18:49:10 UTC
Nope, we should keep it open: ideally you should be able to select the used cell
format in the export dialog and the formats should be classified as lossy or
not, so if a column uses a non-lossy format it can become the default. 
Comment 27 Morten Welinder 2005-04-01 21:14:38 UTC
Right now we have two options:

1. Raw numbers; dates as 34567.  <--- default
2. Formatted.

I am tempted to change this to:

1. Raw numbers.
2. Automatic.                    <--- default
3. Formatted.

"Auto" would re-format date/time-formattted cells in "yyyy/mm/dd" or
"yyyy/mm/dd hh:mm:ss.s" format as appropritate.
Comment 28 Luke Hutchison 2005-04-02 04:35:05 UTC
So "Automatic" would use the presentation format if it is non-lossy, and a
default (standards-compliant) format if it is lossy?  Looks like a good solution
to me, although it only really makes sense if "Automatic" is the default.  That
way, both people that explicitly want data in the presentation format and those
that want the raw numbers have to explicitly choose a non-default option, while
those that just want their data out in a sensible format that they can parse (I
would guess, most people in most situations) can get it by default.

By the way Excel doesn't have a way of specifying formatting options when saving
to a CSV, and it has caused a great deal of havoc where I work (dates seemingly
randomly get converted to integers, but it doesn't even seem to happen
consistently; regardless, every time it does happen, it renders the date useless
for our purposes).
Comment 29 Morten Welinder 2005-04-05 01:35:06 UTC
Fixed in cvs.