Bug 627509 - ODF import: no text:tab, text:s handling
Status: RESOLVED FIXED
Product: Gnumeric
Classification: Applications
Component: import/export OOo / OASIS
Version: git master
Hardware: Other All
Importance: Normal trivial
Target Milestone: ---
Assigned To: Andreas J. Guelzow
QA Contact: Jody Goldberg
Depends on:
Blocks:
 
 
Reported: 2010-08-20 15:49 UTC by Morten Welinder
Modified: 2011-11-01 01:40 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Rich text sample (7.63 KB, application/vnd.oasis.opendocument.spreadsheet)
2011-10-29 19:04 UTC, Morten Welinder
debug patch (4.96 KB, patch)
2011-10-29 19:05 UTC, Morten Welinder

Description Morten Welinder 2010-08-20 15:49:53 UTC
Unexpected element 'text:tab' in state : 
        document-content -> body -> spreadsheet -> table -> table-row -> table-cell -> annotation -> p -> span

https://bugzilla.gnome.org/attachment.cgi?id=168413
Comment 1 Andreas J. Guelzow 2010-08-20 16:39:26 UTC
This exposes the fact that we don't handle whitespace correctly on reading. Currently we cannot have mixed elements and content. I am pretty sure that there are some existing bugs related to that, but I can't find them.

Essentially, gsf_xml_in_doc_parse needs to be extended to handle things like:
<span> a <text:tab /> b </span>
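
To illustrate what handling mixed content could look like, here is a minimal sketch that accumulates character data and expands text:tab and text:s as they arrive between text runs. TextBuf and the on_* handlers are hypothetical names made up for this example; they are not part of libgsf or Gnumeric, and the fixed-size buffer is only there to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical accumulator for the text of one element; not a libgsf type. */
typedef struct {
	char   data[256];
	size_t len;
} TextBuf;

static void
text_buf_init (TextBuf *buf)
{
	buf->len = 0;
	buf->data[0] = '\0';
}

/* Append n bytes, truncating if the fixed buffer would overflow. */
static void
text_buf_append (TextBuf *buf, char const *s, size_t n)
{
	assert (buf != NULL && s != NULL);
	if (n > sizeof buf->data - 1 - buf->len)
		n = sizeof buf->data - 1 - buf->len;
	memcpy (buf->data + buf->len, s, n);
	buf->len += n;
	buf->data[buf->len] = '\0';
}

/* Character data found between child elements. */
static void
on_characters (TextBuf *buf, char const *s)
{
	text_buf_append (buf, s, strlen (s));
}

/* <text:tab/> stands for a single tab character. */
static void
on_text_tab (TextBuf *buf)
{
	text_buf_append (buf, "\t", 1);
}

/* <text:s text:c="N"/> stands for N spaces; pass 1 when text:c is absent. */
static void
on_text_s (TextBuf *buf, unsigned count)
{
	while (count-- > 0)
		text_buf_append (buf, " ", 1);
}
```

For the example above, the parser would need to deliver on_characters (" a "), on_text_tab (), on_characters (" b ") in that order, so that the content seen before the child element is not lost when the child starts.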
Comment 2 Andreas J. Guelzow 2011-10-28 17:32:37 UTC
Other things we can't handle at this time include:

<text:p>
<text:span text:style-name="T1">xxx</text:span>
yyy
<text:span text:style-name="T2">zzz</text:span>
</text:p>
Comment 3 Morten Welinder 2011-10-29 19:04:35 UTC
Created attachment 200259
Rich text sample

A1 contains "text is bold and italic and underlined" where the three
adjectives are styled as they say.
Comment 4 Morten Welinder 2011-10-29 19:05:25 UTC
Created attachment 200260
debug patch

Patch for debugging
Comment 5 Morten Welinder 2011-10-29 19:09:05 UTC
I get...

Start of span: [text is ]
End of span: [text is bold]
Start of span: [text is bold]
End of span: [text is bold and ]
Start of span: [text is bold and ]
End of span: [text is bold and italic]
Start of span: [text is bold and italic]
End of span: [text is bold and italic and ]
Start of span: [text is bold and italic and ]
End of span: [text is bold and italic and underlined]

From the first two lines (and style info not printed) we deduce that characters
8-11 should be bold.

I don't think any information is missing, i.e., we should be able to read this.
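
The arithmetic behind "characters 8-11" can be sketched as follows, assuming the logged strings are the complete accumulated text at the start and end of the span and that the text is plain ASCII (strlen counts bytes). CharRange and span_range are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical inclusive, 0-based character range. */
typedef struct {
	size_t first;
	size_t last;
} CharRange;

/* Derive the styled range from the text logged at span start and span end.
 * Empty spans are not handled in this sketch. */
static CharRange
span_range (char const *text_at_start, char const *text_at_end)
{
	CharRange r;
	size_t start = strlen (text_at_start);
	size_t end = strlen (text_at_end);

	assert (end > start);
	r.first = start;	/* first character added inside the span */
	r.last = end - 1;	/* last character added inside the span */
	return r;
}
```

With the first two log lines, span_range ("text is ", "text is bold") gives first = 8 and last = 11.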
Comment 6 Morten Welinder 2011-10-29 19:19:58 UTC
Relatedly, this "vile hack" from gsf_xml_in_doc_add_nodes seems to be
what enables it to work by having a bug.  The code and the comment
certainly do not agree and the code effectively does nothing.

			node = g_new0 (GsfXMLInNodeInternal, 1);
			node->pub = *e_node;
			/* WARNING VILE HACK :
			 * The api in 1.8.2 passed has_content as a boolean.
			 * Too many things still use that to change yet.  We
			 * edit the bool here to be GSF_CONTENT_NONE or
			 * GSF_XML_CONTENT and try to ignore SHARED_CONTENT */
			if (node->pub.has_content != 0 &&
			    node->pub.has_content != GSF_XML_SHARED_CONTENT)
				node->pub.has_content = GSF_XML_CONTENT;
Comment 7 Andreas J. Guelzow 2011-10-31 23:33:41 UTC
text:tab and text:s are now correctly handled on import. (So is text:line-break.)

text:span is bug #663135. text:a is bug #603533.

This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.
Comment 8 Andreas J. Guelzow 2011-11-01 01:40:50 UTC
I guess that

Start of span: [text is ]
End of span: [text is bold]

(plus style info) does not let us deduce that characters
8-11 should be bold.

The content does not show other elements such as text:s that may create additional characters.

But nevertheless we can build a stack indicating where we need to start...
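
A minimal sketch of such a stack, assuming the offsets passed in are lengths of the fully expanded text (including any characters generated by elements such as text:s), so that they stay valid regardless of what the logged content shows. SpanStack, span_start and span_end are hypothetical names, not the actual Gnumeric code, and the fixed limits are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH  16
#define MAX_RANGES 16

/* Half-open range [start, end) in the accumulated text. */
typedef struct {
	size_t start;
	size_t end;
} SpanRange;

/* Hypothetical bookkeeping: open span starts plus finished ranges. */
typedef struct {
	size_t    open[MAX_DEPTH];
	size_t    depth;
	SpanRange done[MAX_RANGES];
	size_t    n_done;
} SpanStack;

static void
span_stack_init (SpanStack *st)
{
	st->depth = 0;
	st->n_done = 0;
}

/* On <text:span>: remember how long the accumulated text is right now. */
static void
span_start (SpanStack *st, size_t text_len)
{
	assert (st->depth < MAX_DEPTH);
	st->open[st->depth++] = text_len;
}

/* On </text:span>: pop the matching start and record the finished range. */
static void
span_end (SpanStack *st, size_t text_len)
{
	assert (st->depth > 0 && st->n_done < MAX_RANGES);
	st->done[st->n_done].start = st->open[--st->depth];
	st->done[st->n_done].end = text_len;
	st->n_done++;
}
```

Because each end pops the most recent start, nested spans close in the right order, and the style attached to each span can be applied to its recorded range once the whole paragraph has been read.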