GNOME Bugzilla – Bug 157100
OO.o filter improperly ignores some parts of documents
Last modified: 2005-02-25 14:57:20 UTC
1. Put the word "SearchString" in the header of the OO.o Writer document "sample.sxw".
2. Search for "SearchString" in Best.
3. The document sample.sxw is not listed in Best.
Additional Information: The string may be in any format (Normal, Bold, Italic, Underlined, etc.).
Why are these six different bugs and not one, given that they are all basically identical?
Though the bug-reproduction scenario is identical, the failure occurs under different conditions, i.e. on different parts of the OO.o document and in different formats. For example, Beagle fails if the "SearchString" is in the "Header" of the doc, no matter what format it is in. If "SearchString" is in the page body, it fails only for certain formats. Hence, six different bugs.
I think this should be one bug. We are (maybe) ignoring text in: Headers, Footers, Footnotes, Endnotes, Indexes and sometimes in the page body for text w/ weird combinations of attributes (i.e. half the letters in a word bold, half not bold). Dave: Did this get fixed when you re-wrote the OO.o filter recently?
*** Bug 157101 has been marked as a duplicate of this bug. ***
*** Bug 157103 has been marked as a duplicate of this bug. ***
*** Bug 157104 has been marked as a duplicate of this bug. ***
*** Bug 157105 has been marked as a duplicate of this bug. ***
*** Bug 157106 has been marked as a duplicate of this bug. ***
The recent changes to the OO.o filter by Dave didn't fix these bugs. I am working on them. Adding myself to CC.
Fix checked in, though this doesn't fix "partially-formatted-texts" bug. Working on it.
Fix checked in. This fixes the "partially-formatted-texts" bug.
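For illustration only, here is a minimal Python sketch of how a "partially-formatted-texts" failure can arise. It is not Beagle's actual filter code (which is not shown in this report). It assumes the word is stored as a paragraph whose formatted half sits in a child `text:span` element, uses the OpenDocument text namespace, and the names `SAMPLE`, `naive_tokens` and `paragraph_text` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: OpenDocument text namespace; older .sxw files use a different URI.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hypothetical paragraph in which only the second half of the word is formatted.
SAMPLE = (
    f'<text:p xmlns:text="{TEXT_NS}">'
    'Search<text:span text:style-name="T1">String</text:span> in body'
    '</text:p>'
)


def naive_tokens(paragraph):
    """Tokenize each text node separately, so a word split by a span breaks in two."""
    tokens = []
    for chunk in paragraph.itertext():
        tokens.extend(chunk.split())
    return tokens


def paragraph_text(paragraph):
    """Join all text nodes of the paragraph first, without adding separators."""
    return "".join(paragraph.itertext())


para = ET.fromstring(SAMPLE)
print(naive_tokens(para))            # the word is split at the span boundary
print(paragraph_text(para).split())  # the word stays whole
```

Joining the text nodes of a paragraph before tokenizing keeps "SearchString" as a single searchable term regardless of how the formatting divides it.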
URL references are not indexed. Attaching sample document.
Created attachment 35418: sample test document
Enhancement request: in attachment 35418, text inside forms should also be indexed.
Fixed handling of various combinations of text formatting. I think this bug can be closed, as the enhancement is not trivial. Jon?
Now that OO.o has moved to ODT (OpenDocument format), the filter extracts the contents but fails to mark *HOT* text as hot. Working on it!
Multi-line headers/footers are not properly indexed. (Both sxw and odt)
Fixed in CVS.
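For reference, a rough Python sketch of the header/footer case, not the fix that went into CVS. In OpenDocument packages, page headers and footers are stored in `styles.xml` (under the master page) rather than in `content.xml`, so an extractor that reads only `content.xml` never sees them. The helper names and the in-memory sample document below are hypothetical, and the sketch simplifies by collecting only `text:p` paragraphs (headings and paragraphs nested inside frames would need extra handling).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces (older .sxw files use different URIs).
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
NS_DECL = (
    f'xmlns:office="{OFFICE_NS}" xmlns:style="{STYLE_NS}" xmlns:text="{TEXT_NS}"'
)


def extract_paragraphs(xml_bytes):
    """Return the text of every text:p element, one string per paragraph."""
    root = ET.fromstring(xml_bytes)
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


def extract_document_text(odt_file):
    """Collect paragraphs from the body (content.xml) and headers/footers (styles.xml)."""
    lines = []
    with zipfile.ZipFile(odt_file) as archive:
        for member in ("content.xml", "styles.xml"):
            if member in archive.namelist():
                lines.extend(extract_paragraphs(archive.read(member)))
    return lines


def make_sample_odt():
    """Build a tiny, hypothetical package with a two-line header (not a full ODT)."""
    content = (
        f"<office:document-content {NS_DECL}><office:body><office:text>"
        "<text:p>Body paragraph</text:p>"
        "</office:text></office:body></office:document-content>"
    )
    styles = (
        f"<office:document-styles {NS_DECL}><office:master-styles>"
        '<style:master-page style:name="Standard"><style:header>'
        "<text:p>First header line</text:p>"
        "<text:p>SearchString on line two</text:p>"
        "</style:header></style:master-page>"
        "</office:master-styles></office:document-styles>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", content)
        archive.writestr("styles.xml", styles)
    buffer.seek(0)
    return buffer


print(extract_document_text(make_sample_odt()))
```

Iterating over every paragraph inside the header, instead of reading only the first one, is what keeps the second header line searchable in this sketch.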