GNOME Bugzilla – Bug 594789
Problem parsing html file from abiword
Last modified: 2010-06-28 19:39:26 UTC
Hi all, I downloaded an .xls file which shows my bandwidth usage. I tried the 1.8 stable version, but it didn't work. I then tried 1.9.12, the latest development version; it opens the file, but the formatting is all wrong. I have replaced the sensitive material with xxxx's.
I tried to attach the .xls file but it didn't work out. Filed bug 594790 for the same.
Please try once more to attach the file to this bug report. Please also describe which part of the formatting is incorrect.
If it still doesn't work to attach the file, please send it to aguelzow at pyrshep.ca Thanks
Created attachment 142944 [details] View of the xls sheet as it currently appears in gnumeric.
As can be seen from the screenshot, the content is not laid out in columns, as I suppose it should be.
Okay, the file in question is not really an xls file, so the xls extension is just misleading. It is really an html file:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<!-- ======================================================= -->
<!-- Created by AbiWord, a free, Open Source wordprocessor. -->
<!-- For more information visit http://www.abisource.com. -->
<!-- ======================================================= -->
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
<title>UsageReport.jsp</title>
...

Do you know how that file was created? If you change the extension to html you can open it in any web browser and see the information. Of course gnumeric should do a much better job of parsing the content of the file.
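For anyone who wants to check a suspicious .xls file the same way, here is a minimal Python sketch that looks at the leading bytes instead of trusting the extension. It is only an illustration, not what gnumeric does internally; the function name sniff_spreadsheet_bytes and the returned labels are made up for this example. The one external fact it relies on is that binary .xls files are OLE2 compound documents, which start with the bytes D0 CF 11 E0 A1 B1 1A E1.

```python
# Magic bytes of an OLE2 compound document, the container of binary .xls files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_spreadsheet_bytes(head: bytes) -> str:
    """Classify the first bytes of a file as 'ole2', 'html' or 'unknown'.

    Hypothetical helper for illustration only; the labels are arbitrary.
    """
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    # Skip a UTF-8 byte order mark and leading whitespace, then compare
    # case-insensitively against common HTML openers.
    text = head.lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    if text.startswith((b"<!doctype html", b"<html")) or b"<html" in text[:1024]:
        return "html"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 2 KiB of a file and classify it."""
    with open(path, "rb") as handle:
        return sniff_spreadsheet_bytes(handle.read(2048))
```

Calling sniff_file on the attached file should report html rather than ole2, which matches the DOCTYPE seen above.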
The code in the html importer seems ancient. For example it assumes that we can't have mixed font styles in the same cell. The problem with reading this file is that the file contains nested tables. We handle the main table fine but are dumping the content of each nested table into a single cell.
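To illustrate the nested-table issue outside of gnumeric, here is a small Python sketch using the standard html.parser module. It is not the importer's code and not necessarily how the fix will be done; NestedTableParser and parse_tables are hypothetical names. The idea is simply to keep a stack of open tables so that a nested table gets its own grid of cells instead of having all of its text dumped into the enclosing cell.

```python
from html.parser import HTMLParser


class NestedTableParser(HTMLParser):
    """Collect each table, nested or not, as its own list of rows."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables, in the order they were closed
        self._stack = []   # tables that are currently open, innermost last

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # Depth 0 is the outermost table, 1 is a table nested in it, etc.
            self._stack.append({"depth": len(self._stack), "rows": []})
        elif tag == "tr" and self._stack:
            self._stack[-1]["rows"].append([])
        elif tag in ("td", "th") and self._stack:
            rows = self._stack[-1]["rows"]
            if not rows:           # tolerate a cell without an enclosing <tr>
                rows.append([])
            rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "table" and self._stack:
            self.tables.append(self._stack.pop())

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            rows = self._stack[-1]["rows"]
            if rows and rows[-1]:
                # Text goes to the current cell of the innermost open table.
                rows[-1][-1] = (rows[-1][-1] + " " + text).strip()


def parse_tables(html: str):
    parser = NestedTableParser()
    parser.feed(html)
    parser.close()
    return parser.tables
```

With this approach the inner table's cells stay separate, and a caller could then decide where to place each nested grid relative to the cell that contained it. How that placement should work in a spreadsheet is a design question this sketch does not answer.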
Hi Andreas, The file was an .xls file which I got from my ISP. It said it's a generic .xls file. I changed some information in it using abiword (hence the abiword stuff you are seeing), as at that time I didn't have gnumeric installed on my system. I will be blogging about the ISP using a closed format, but even what you guys were able to deduce is good.
abiword opens .xls files? This does not work even with the most recent abiword. It was probably not an .xls file from the start.
Created attachment 142956 [details] the file in question
Andreas, thank you for the upload. It would be nice to know how you were able to upload it when I couldn't. As for Jean's question, I think you may be right. I just uploaded a blog post which tells the history of the file (http://flossexperiences.wordpress.com/2009/09/11/bsnl-2/). I hope that also explains the abiword scenario. Abiword is able to open .xls files but it does a poor job of it at version 2.6.8. I will also try the development version and see if it works out any better.
Abiword never opened a .xls file. Probably the original file was exported as html (possibly by excel) with a misleading .xls extension.
Hi guys, As far as abiword is concerned, I'm sure you are right. It's an html import issue there as well (I guess), so I have filed a bug at abiword's bugzilla too: http://bugzilla.abisource.com/show_bug.cgi?id=12361
shirish: I had no problem uploading the file (but I am not using Windows either), so I have no idea why you had difficulties.
*** Bug 615433 has been marked as a duplicate of this bug. ***
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.