Bug 594789 - Problem parsing html file from abiword
Status: RESOLVED FIXED
Product: Gnumeric
Classification: Applications
Component: import/export HTML
Version: 1.9.x
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Andreas J. Guelzow
QA Contact: Jody Goldberg
Duplicates: 615433
Depends on:
Blocks:
 
 
Reported: 2009-09-10 19:15 UTC by shirish agarwal
Modified: 2010-06-28 19:39 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
giving view of the xls sheet in gnumeric as it's seen now. (7.17 KB, image/png)
2009-09-10 21:12 UTC, shirish agarwal
the file in question (102.23 KB, text/html)
2009-09-11 06:40 UTC, Andreas J. Guelzow

Description shirish agarwal 2009-09-10 19:15:41 UTC
Hi all, 
 I downloaded an .xls file which shows my bandwidth usage. I tried the 1.8 stable version, which didn't work. I then tried 1.9.12, the latest development version; it opens the file, but the formatting is all wrong.

I have replaced the sensitive material with xxxx's.
Comment 1 shirish agarwal 2009-09-10 19:31:12 UTC
Tried to attach the .xls file, but it didn't work. Filed bug 594790 for that.
Comment 2 Andreas J. Guelzow 2009-09-10 19:53:37 UTC
Please try once more to attach the file to this bug report.

Please also describe which part of the formatting is incorrect.
Comment 3 Andreas J. Guelzow 2009-09-10 19:54:31 UTC
If it still doesn't work to attach the file, please send it to aguelzow at pyrshep.ca

Thanks
Comment 4 shirish agarwal 2009-09-10 21:12:45 UTC
Created attachment 142944
giving view of the xls sheet in gnumeric as it's seen now.
Comment 5 shirish agarwal 2009-09-10 21:14:00 UTC
As can be seen from the screenshot, it doesn't format the data into columns as I suppose it should.
Comment 6 Andreas J. Guelzow 2009-09-10 21:45:32 UTC
Okay, the file in question is not really an xls file, so the xls extension is just misleading. It is really an html file:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
 <head>
  <!-- ======================================================= -->
  <!-- Created by AbiWord, a free, Open Source wordprocessor.  -->
  <!-- For more information visit http://www.abisource.com.    -->
  <!-- ======================================================= -->
  <meta http-equiv="content-type" content="text/html;charset=UTF-8" />
  <title>UsageReport.jsp</title>
...

Do you know how that file was created? If you change the extension to html you can open it in any web browser and see the information.

Of course gnumeric should do a much better job of parsing the content of the file.
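
As an illustration of the point about the misleading extension, here is a minimal Python sketch that guesses a file's real format from its first bytes instead of its name. It is not taken from Gnumeric; the function name `sniff_format` and its return labels are hypothetical. The only format fact it relies on is that binary .xls files are OLE2 compound documents, which start with the signature D0 CF 11 E0 A1 B1 1A E1.

```python
# Signature at the start of every OLE2 compound document (the container of binary .xls).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_format(head: bytes) -> str:
    """Guess a file's format from its leading bytes, ignoring the extension.

    Hypothetical helper for illustration; returns "ole2", "html" or "unknown".
    """
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    # Skip an optional UTF-8 byte order mark and leading whitespace.
    text = head.lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    if text.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"
```

Passing the first few hundred bytes of the attached file (which begins with `<!DOCTYPE html PUBLIC ...`) would yield "html" under this sketch, regardless of the .xls extension.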
Comment 7 Andreas J. Guelzow 2009-09-11 02:36:08 UTC
The code in the html importer seems ancient. For example it assumes that we can't have mixed font styles in the same cell.

The problem with reading this file is that the file contains nested tables. We handle the main table fine but are dumping the content of each nested table into a single cell.
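
To make the nested-table problem concrete, here is a small Python sketch using the standard library's `html.parser`. It is not the Gnumeric importer (which is not shown in this report); the class `NestedTableParser` and the helper `flatten` are hypothetical names. The parser keeps a stack of open tables so that a table found inside a cell is stored as a sub-grid, while `flatten` mimics the behaviour described above, where everything in a nested table ends up in one cell.

```python
from html.parser import HTMLParser


class NestedTableParser(HTMLParser):
    """Collect tables as lists of rows; a row is a list of cells; a cell is a
    list of items, each either a text string or a nested table."""

    def __init__(self):
        super().__init__()
        self.stack = []      # tables currently open, outermost first
        self.top_level = []  # finished outermost tables

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.stack.append([])
        elif tag == "tr" and self.stack:
            self.stack[-1].append([])
        elif tag in ("td", "th") and self.stack:
            if not self.stack[-1]:
                self.stack[-1].append([])  # tolerate a cell with no enclosing <tr>
            self.stack[-1][-1].append([])

    def handle_endtag(self, tag):
        if tag != "table" or not self.stack:
            return
        table = self.stack.pop()
        if self.stack and self.stack[-1] and self.stack[-1][-1]:
            # Attach the finished table to the cell of the enclosing table.
            self.stack[-1][-1][-1].append(table)
        else:
            self.top_level.append(table)

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack and self.stack[-1] and self.stack[-1][-1]:
            self.stack[-1][-1][-1].append(text)


def flatten(cell):
    """Join all text in a cell, nested tables included, into a single string."""
    parts = []
    for item in cell:
        if isinstance(item, str):
            parts.append(item)
        else:
            parts.extend(flatten(c) for row in item for c in row)
    return " ".join(parts)
```

With the nested structure preserved, an importer could choose to expand a sub-grid into additional spreadsheet rows and columns instead of calling something like `flatten`; which layout is appropriate is a design decision this sketch leaves open.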
Comment 8 shirish agarwal 2009-09-11 04:05:04 UTC
Hi Andreas, 
 The file was an .xls file which I got from my ISP. It said it's a generic .xls file. I changed some information in it using abiword (hence you are getting the abiword stuff), as at that time I didn't have gnumeric installed on my system. I will be blogging about the ISP using a closed format, but even what you guys were able to deduce is good.
Comment 9 Jean Bréfort 2009-09-11 04:31:07 UTC
abiword opens .xls files? This does not work even with the most recent abiword. It was probably not an .xls file from the start.
Comment 10 Andreas J. Guelzow 2009-09-11 06:40:51 UTC
Created attachment 142956
the file in question
Comment 11 shirish agarwal 2009-09-11 06:52:49 UTC
Andreas, thank you for the upload. It would be nice to know how you were able to upload it when I couldn't.

As for Jean's question, I think you may be right. I just uploaded a blog post which tells the history of the file: http://flossexperiences.wordpress.com/2009/09/11/bsnl-2/ . I hope that also explains the abiword scenario. Abiword is able to open .xls files, but it does a poor job of it at version 2.6.8. I will also try the development version and see if it works out any better.
Comment 12 Jean Bréfort 2009-09-11 07:12:16 UTC
Abiword never opened a .xls file. Probably the original file was exported as html (possibly by excel) with a misleading .xls extension.
Comment 13 shirish agarwal 2009-09-11 09:26:09 UTC
Hi guys, 
 As far as abiword is concerned, I'm sure you are right. It's an html import issue there as well (I guess), so I have filed a bug at abiword's bugzilla too: http://bugzilla.abisource.com/show_bug.cgi?id=12361
Comment 14 Andreas J. Guelzow 2009-09-11 13:07:17 UTC
shirish: I had no problem uploading the file (but I am not using Windows either), so I have no idea why you had difficulties.
Comment 15 Andreas J. Guelzow 2010-04-11 17:11:52 UTC
*** Bug 615433 has been marked as a duplicate of this bug. ***
Comment 16 Andreas J. Guelzow 2010-06-28 19:39:26 UTC
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.