GNOME Bugzilla – Bug 627509
ODF import: no text:tab, text:s handling
Last modified: 2011-11-01 01:40:50 UTC
Unexpected element 'text:tab' in state : document-content -> body -> spreadsheet -> table -> table-row -> table-cell -> annotation -> p -> span https://bugzilla.gnome.org/attachment.cgi?id=168413
This exposes the fact that we don't handle white space correctly on reading. Currently we cannot have mixed elements and content. I am pretty sure that there are some bugs around related to that but I can't find them. Essentially gsf_xml_in_doc_parse needs to be extended to handle things like: <span> a <text:tab /> b </span>.
Other things we can't handle at this time include: <text:p> <text:span text:style-name="T1">xxx</text:span> yyy <text:span text:style-name="T2">zzz</text:span> </text:p>
Created attachment 200259 [details] Rich text sample
A1 contains "text is bold and italic and underlined" where the three adjectives are styled as they say.
Created attachment 200260 [details] [review] debug patch
Patch for debugging
I get...
    Start of span: [text is ]
    End of span: [text is bold]
    Start of span: [text is bold]
    End of span: [text is bold and ]
    Start of span: [text is bold and ]
    End of span: [text is bold and italic]
    Start of span: [text is bold and italic]
    End of span: [text is bold and italic and ]
    Start of span: [text is bold and italic and ]
    End of span: [text is bold and italic and underlined]
From the first two lines (and style info not printed) we deduce that characters 8-11 should be bold. I don't think any information is missing, i.e., we should be able to read this.
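The deduction from the debug output can be sketched in a few lines of C. This is only an illustration: SpanRange and span_range_from_content are hypothetical names, not part of libgsf or Gnumeric, and the sketch assumes that the accumulated content really contains every character of the cell text.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type, not part of libgsf or Gnumeric. */
typedef struct {
    size_t start; /* 0-based index of the first styled character */
    size_t end;   /* one past the last styled character */
} SpanRange;

/*
 * Derive the styled range from the content accumulated when the span
 * opens and when it closes.  Assumption: both buffers hold every
 * character of the cell text.  strlen counts bytes, so for non-ASCII
 * text these are byte offsets rather than character offsets.
 */
SpanRange
span_range_from_content (const char *at_start, const char *at_end)
{
    SpanRange r;

    r.start = strlen (at_start);
    r.end = strlen (at_end);
    assert (r.start <= r.end);
    return r;
}
```

For the first pair of debug lines, "text is " and "text is bold", this gives start 8 and end 12, i.e. characters 8-11.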
Relatedly, this "vile hack" from gsf_xml_in_doc_add_nodes seems to be what enables it to work by having a bug. The code and the comment certainly do not agree and the code effectively does nothing.

    node = g_new0 (GsfXMLInNodeInternal, 1);
    node->pub = *e_node;
    /* WARNING VILE HACK :
     * The api in 1.8.2 passed has_content as a boolean.
     * Too many things still use that to change yet.  We
     * edit the bool here to be GSF_CONTENT_NONE or
     * GSF_XML_CONTENT and try to ignore SHARED_CONTENT */
    if (node->pub.has_content != 0 &&
        node->pub.has_content != GSF_XML_SHARED_CONTENT)
        node->pub.has_content = GSF_XML_CONTENT;
text:tab and text:s are now correctly handled on import. (So is text:line-break.) text:span is bug #663135. text:a is bug #603533. This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.
I guess that
    Start of span: [text is ]
    End of span: [text is bold]
(plus style info) does not necessarily mean that characters 8-11 should be bold. The content does not show other elements, such as text:s, that may create additional characters. Nevertheless, we can build a stack indicating where we need to start...
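A minimal sketch of such a stack, in plain C, might look like the following. Everything here (CellText, the cell_text_* functions, the fixed buffer sizes) is hypothetical and only for illustration; it is not the code that went into Gnumeric. The idea is that offsets are taken from the text actually built so far, so characters produced by text:s (a run of spaces, count given by text:c, default 1) and text:tab are counted as well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TEXT   256
#define MAX_DEPTH  16
#define MAX_RANGES 16

/* All names below are hypothetical; none come from libgsf or Gnumeric. */
typedef struct {
    size_t start; /* offset of the first styled byte */
    size_t end;   /* one past the last styled byte */
} StyledRange;

typedef struct {
    char        text[MAX_TEXT];     /* text built so far, NUL-terminated */
    size_t      len;
    size_t      open[MAX_DEPTH];    /* stack of start offsets of open spans */
    size_t      depth;
    StyledRange ranges[MAX_RANGES]; /* closed spans, in closing order */
    size_t      n_ranges;
} CellText;

void
cell_text_init (CellText *ct)
{
    memset (ct, 0, sizeof *ct);
}

/* Character data found between elements. */
void
cell_text_append (CellText *ct, const char *s)
{
    size_t n = strlen (s);

    assert (ct->len + n < MAX_TEXT);
    memcpy (ct->text + ct->len, s, n + 1);
    ct->len += n;
}

/* <text:s text:c="count"/>: a run of spaces. */
void
cell_text_spaces (CellText *ct, size_t count)
{
    assert (ct->len + count < MAX_TEXT);
    memset (ct->text + ct->len, ' ', count);
    ct->len += count;
    ct->text[ct->len] = '\0';
}

/* <text:tab/>: a single tab character. */
void
cell_text_tab (CellText *ct)
{
    cell_text_append (ct, "\t");
}

/* <text:span>: remember where the styled text starts. */
void
cell_text_span_start (CellText *ct)
{
    assert (ct->depth < MAX_DEPTH);
    ct->open[ct->depth++] = ct->len;
}

/* </text:span>: pop the start offset and record the finished range. */
void
cell_text_span_end (CellText *ct)
{
    assert (ct->depth > 0 && ct->n_ranges < MAX_RANGES);
    ct->ranges[ct->n_ranges].start = ct->open[--ct->depth];
    ct->ranges[ct->n_ranges].end = ct->len;
    ct->n_ranges++;
}
```

A reader would call cell_text_append for character data, cell_text_spaces and cell_text_tab for the empty elements, and the span functions on element start and end. Because the offsets come from the buffer length rather than from the visible content alone, extra characters from text:s no longer shift the ranges.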