GNOME Bugzilla – Bug 371152
External filter documents never have snippets
Last modified: 2018-07-03 09:55:46 UTC
"beagle-query --verbose querystring" always returns "(null)" for MS Word documents. Other document types sometimes have snippets, and sometimes don't. But MS Word documents never have any. wv version is 1.2.4-1.fc6
Because MS Word documents are not plain text, they _should_ be storing snippets in the text cache for retrieval. Could you please run beagle-extract-content /path/of/a/doc/file and tell us what happens?
Filename: file:///u/samba/public/nautilus/f/EmployeeTimeRecord.doc
Debug: Loaded 45 filters from /usr/lib/beagle/Filters/Filters.dll
Debug: No filter for file:///u/samba/public/nautilus/f/EmployeeTimeRecord.doc (/u/samba/public/nautilus/f/EmployeeTimeRecord.doc) [application/msword]
No filter for application/msword
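The "No filter for application/msword" message in the output above is the signal that no filter is registered for the file's MIME type. As a minimal sketch, assuming beagle-extract-content keeps printing its diagnostics in the form shown in this report, the check could be scripted as follows. The helper names `missing_filter_mime_types` and `run_extract_content` are hypothetical and not part of Beagle.

```python
import re
import subprocess

# Matches diagnostics such as "No filter for application/msword".
# Assumption: the message format is the one quoted in this report.
# The MIME-type character class keeps "No filter for file:///..." from matching.
_NO_FILTER = re.compile(r"No filter for ([\w.+-]+/[\w.+-]+)(?!\S)")


def missing_filter_mime_types(output: str) -> list[str]:
    """Return the MIME types that the captured output reports as having no filter."""
    return _NO_FILTER.findall(output)


def run_extract_content(path: str) -> str:
    """Run beagle-extract-content on one file and return stdout plus stderr."""
    result = subprocess.run(
        ["beagle-extract-content", path],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout + result.stderr
```

Calling `missing_filter_mime_types(run_extract_content("/path/of/a/doc/file"))` returns an empty list when a filter was found, which makes it easy to test a batch of files after rebuilding or repackaging.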
OK, did you compile beagle on your own, or did you get it from your distro? If you got it from your distro, you have to ask them why they didn't package it. If you compiled it, check the output of configure and see if it's actually building wv1 support.
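One way to do the configure check is to save configure's output to a file and pull out the lines that mention wv1. This is only a sketch: the exact wording of configure's summary is not quoted in this thread, so the hypothetical helper `wv1_lines` just filters lines and leaves the interpretation to the reader.

```python
def wv1_lines(configure_output: str) -> list[str]:
    """Return the lines of captured ./configure output that mention wv1."""
    return [
        line.strip()
        for line in configure_output.splitlines()
        if "wv1" in line.lower()  # case-insensitive, since the summary wording is unknown
    ]
```

For example, after `./configure --enable-wv1 2>&1 | tee configure.log`, pass the contents of `configure.log` to `wv1_lines` and read what it returns.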
This is the stock Fedora Core 6 beagle. I'll report the bug downstream. In the meantime, an external filter seems to do the trick.
Maybe check this out? http://www.antezeta.com/beagle-fedora.html Let me know what the response is upstream, and we'll adjust this bug accordingly.
Well, I said the external filter seems to do the trick, based upon the fact that beagle-extract-content now works. But I did a beagle-shutdown, deleted everything in ~/.beagle/ except for 'config', and restarted beagled. Still no snippets for msword files. (I always get the string "(null)" for the snippet.) Excel spreadsheets have snippets as expected. Is there another problem? Or am I missing something?
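For anyone repeating the re-index procedure described above, here is a rough sketch of the same three steps: beagle-shutdown, remove everything under ~/.beagle/ except 'config', then start beagled again. The function names are hypothetical, and the script permanently deletes index data, so treat it as an illustration of the steps and not as a supported tool.

```python
import shutil
import subprocess
from pathlib import Path


def clear_beagle_dir(beagle_dir: Path, keep: str = "config") -> list[str]:
    """Delete every entry in beagle_dir except `keep`; return the removed names."""
    removed = []
    for entry in sorted(beagle_dir.iterdir()):
        if entry.name == keep:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # real directory: remove it recursively
        else:
            entry.unlink()  # regular file or symlink
        removed.append(entry.name)
    return removed


def reset_index(beagle_dir: Path = Path.home() / ".beagle") -> None:
    """Stop the daemon, wipe the index data, and start the daemon again."""
    subprocess.run(["beagle-shutdown"], check=False)
    clear_beagle_dir(beagle_dir)
    # Popen so this call returns even if beagled stays in the foreground.
    subprocess.Popen(["beagled"])
```

Keeping the deletion in its own function makes it possible to try it on a scratch directory before pointing it at a real ~/.beagle/.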
Modifying the Fedora SRPM to include --enable-wv1 fixes this. So it is a Fedora issue. The only possible remaining issue I see here is the fact that while my external filter got beagle-extract-content working, it did *not* get me snippets. With the internal filter working, I *do* get snippets.
Thanks for the info, retargeting this slightly.
Interestingly enough, I have now removed the Fedora packages and compiled 0.2.12 from the beagle-project.org source, with --enable-wv1. Configure reports at the end that wv1 is enabled. I have wv1 and wv1-devel installed. But beagle-extract-content still says there is no filter for application/msword, and Word documents don't get snippets or get indexed by content, only by path name. My own external filter uses wvText and works fine, except for no snippets. The Word docs do get indexed properly and beagle-extract-content works. --enable-wv1 was the only option I gave to ./configure.
The Word support is broken in 0.2.12. I just committed a fix for this and I'll be releasing 0.2.13 soon.
Beagle is not under active development anymore and had its last code changes in early 2011. Its codebase has been archived (see bug 796735): https://gitlab.gnome.org/Archive/beagle/commits/master "tracker" is an available alternative. Closing this report as WONTFIX as part of Bugzilla Housekeeping to reflect reality. Please feel free to reopen this ticket (or rather transfer the project to GNOME Gitlab, as GNOME Bugzilla is deprecated) if anyone takes the responsibility for active development again.