GNOME Bugzilla – Bug 503629
Show results from documentation index in beagle-search
Last modified: 2008-02-03 03:42:55 UTC
Beagle-search does not show results from the documentation index. Showing documentation results did not work well with best, so it was disabled. It has stayed that way ever since, but beagle-search should have no problem displaying them.
It is very easy to add support for showing documentation to beagle-search, but what do we use to open man files for example?
Just making a random guess here... can we call yelp to open a man file? http://www.gnome.org/learn/users-guide/latest/yelp.html#yelp-advanced-cmdline
Yeah, Yelp can open man files. I would think that it would be like any other file type.
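A minimal sketch of what handing a man page hit to yelp could look like, assuming the installed yelp accepts man: URIs on its command line as the page linked above describes. ManPageOpener and ManUriFromPath are hypothetical names for illustration, not existing beagle-search code, and the ".gz"/".bz2" suffix list is an assumption.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class ManPageOpener
{
    // Hypothetical helper: "/usr/share/man/man1/du.1.gz" -> "man:du(1)".
    // Returns null if the filename does not look like "<page>.<section>".
    public static string ManUriFromPath (string path)
    {
        string name = Path.GetFileName (path);

        // Drop a compression suffix if there is one (assumed list).
        foreach (string suffix in new string[] { ".gz", ".bz2" }) {
            if (name.EndsWith (suffix, StringComparison.Ordinal)) {
                name = name.Substring (0, name.Length - suffix.Length);
                break;
            }
        }

        // What remains should be "<page>.<section>", e.g. "du.1".
        int dot = name.LastIndexOf ('.');
        if (dot <= 0 || dot == name.Length - 1)
            return null;

        return "man:" + name.Substring (0, dot) + "(" + name.Substring (dot + 1) + ")";
    }

    // Launches yelp on the page. Process.Start throws if yelp is not installed.
    public static bool Open (string path)
    {
        string uri = ManUriFromPath (path);
        if (uri == null)
            return false;

        ProcessStartInfo info = new ProcessStartInfo ("yelp", "\"" + uri + "\"");
        info.UseShellExecute = false;
        Process.Start (info);
        return true;
    }
}
```

Keeping the URI construction separate from the process launch means the mapping can be checked without yelp being present.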
Awesome. I will implement this on the frontend side. Meanwhile, someone could fix the indexing of Monodoc files, because they currently don't work. From what I've been able to trace, FilterMonodoc correctly generates the child indexables from the monodoc zip files and correctly sets the text/html mimetype on them. But in FilterFactory, when we try to open the file, it fails because the path of the child item is set to file:///usr/lib/monodoc/sources/Mono.zip#T:WhateverType

    if (path != null)
        successful_open = candidate_filter.Open (path);
    else if ((text_reader = indexable.GetTextReader ()) != null)
        successful_open = candidate_filter.Open (text_reader);
    else if ((binary_stream = indexable.GetBinaryStream ()) != null)
        successful_open = candidate_filter.Open (binary_stream);

Of course that is a file inside the archive, so opening it by path fails. The correct content is available from indexable.GetTextReader ().
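To make the failure mode concrete, here is a small self-contained model of the selection order in the snippet above. It is only an illustration: ContentSource and OpenOrderSketch are hypothetical names, and the extra File.Exists guard is an assumption showing one way a non-openable path could fall through to the text reader, not necessarily how Beagle resolves it.

```csharp
using System;
using System.IO;

public enum ContentSource { None, Path, TextReader, BinaryStream }

public static class OpenOrderSketch
{
    // Same ordering as the FilterFactory snippet, plus one guard (an
    // assumption): a path is only chosen if a file really exists there.
    // Without the guard, any non-null path wins, even one that names an
    // archive member such as "Mono.zip#T:WhateverType".
    public static ContentSource Choose (string path, bool has_text_reader, bool has_binary_stream)
    {
        if (path != null && File.Exists (path))
            return ContentSource.Path;
        if (has_text_reader)
            return ContentSource.TextReader;
        if (has_binary_stream)
            return ContentSource.BinaryStream;
        return ContentSource.None;
    }
}
```

With the guard in place, a child whose path points inside an archive would be read through its text reader instead of failing on the path.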
Are other archived documentation files working correctly?
Calling beagle-extract-content on a zipped man page:

lipec@frappr:~/svn/beagle/beagle$ beagle-extract-content --show-generated /usr/share/man/man1/du.1.gz
Filename: file:///usr/share/man/man1/du.1.gz
Debug: Done reading conf from /home/lipec/.beagle/config/Daemon.xml
Debug: Done reading conf from /home/lipec/build/etc/beagle/config-files/Daemon.xml
Debug: Loaded 58 filters from /home/lipec/build/lib/beagle/Filters/Filters.dll
Debug: Verifying filter_cache at /home/lipec/.beagle/filterver.dat ... cache is dirty ? False
Filter: Beagle.Filters.FilterArchive (determined in .38s)
MimeType: application/x-gzip
Filter-generated indexables:
  file:///usr/share/man/man1/du.1.gz#du.1
Properties:
  Timestamp = 2007-12-13 10:56:50 (Utc)
  beagle:FileType = archive
Content:
du.1
Text extracted in .01s
-----------------------------------------
Filename: file:///usr/share/man/man1/du.1.gz#du.1
Parent: file:///usr/share/man/man1/du.1.gz
Filter: Beagle.Filters.FilterMan (determined in .00s)
MimeType: text/troff
Properties:
  Timestamp = 2007-12-13 10:56:50 (Utc)
  beagle:ExactFilename = du.1
  beagle:Filename = du
  beagle:FilenameExtension = .1
  beagle:FileType = documentation
  beagle:NoPunctFilename = du
  beagle:SplitFilename = du
  fixme:inside_archive = true
  fixme:relativeuri = du.1
  parent:beagle:FileType = archive
Content:
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.35.
...
<text truncated>

Calling beagle-extract-content on a monodoc zip file:

lipec@frappr:~/svn/beagle/beagle$ beagle-extract-content --show-generated /usr/lib/monodoc/sources/Novell.zip
Filename: file:///usr/lib/monodoc/sources/Novell.zip
Debug: Done reading conf from /home/lipec/.beagle/config/Daemon.xml
Debug: Done reading conf from /home/lipec/build/etc/beagle/config-files/Daemon.xml
Debug: Loaded 58 filters from /home/lipec/build/lib/beagle/Filters/Filters.dll
Debug: Verifying filter_cache at /home/lipec/.beagle/filterver.dat ... cache is dirty ? False
Filter: Beagle.Filters.FilterMonodoc (determined in 1.20s)
MimeType: application/zip
Filter-generated indexables:
  file:///usr/lib/monodoc/sources/Novell.zip%23T:Novell.Directory.Ldap.Connection+ReaderThread
  file:///usr/lib/monodoc/sources/Novell.zip%23C:Novell.Directory.Ldap.Connection+ReaderThread(Novell.Directory.Ldap.Connection)
  file:///usr/lib/monodoc/sources/Novell.zip%23M:Novell.Directory.Ldap.Connection+ReaderThread.Run()
  <snip>
Properties:
  Timestamp = 2007-11-19 22:37:39 (Utc)
  beagle:FileType = documentation
(no content)
Text extracted in .00s
-----------------------------------------
Filename: file:///usr/lib/monodoc/sources/Novell.zip%23T:Novell.Directory.Ldap.Connection+ReaderThread
Parent: file:///usr/lib/monodoc/sources/Novell.zip
Warn: Error in filtering /usr/lib/monodoc/sources/Novell.zip#T:Novell.Directory.Ldap.Connection+ReaderThread with Beagle.Filters.FilterHtml, falling back
No filter for text/html
-----------------------------------------
<snip>
Ahh... you are right. FilterMonodoc is missing the child.StoreStream() part. I don't know whether the requirement was added after the monodoc filter was written, but adding StoreStream and CloseStream fixes the actual problem (svn r4292). I also noticed a huge blowup in memory when trying to index the huge Mono.zip docfile, so I took the liberty of replacing the existing child-indexable generation approach with a generator-based one. r4292 contains that change too.
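For readers unfamiliar with the difference, the following sketch contrasts building every child up front with producing children on demand through a C# iterator. It is not the r4292 code: ChildGeneratorSketch, BuildAll, Generate and the Built counter are hypothetical, and plain strings stand in for the real child indexables.

```csharp
using System;
using System.Collections.Generic;

public static class ChildGeneratorSketch
{
    // Counts how many children have actually been constructed so far.
    public static int Built = 0;

    // Eager: every child exists in memory before the caller sees the first one.
    public static List<string> BuildAll (string archive_uri, IEnumerable<string> entries)
    {
        List<string> children = new List<string> ();
        foreach (string entry in entries) {
            Built++;
            children.Add (archive_uri + "#" + entry);
        }
        return children;
    }

    // Lazy: the body runs only as far as the consumer iterates, so at most
    // one child needs to be alive at a time.
    public static IEnumerable<string> Generate (string archive_uri, IEnumerable<string> entries)
    {
        foreach (string entry in entries) {
            Built++;
            yield return archive_uri + "#" + entry;
        }
    }
}
```

With the lazy version, a consumer that indexes one child and then asks for the next never holds the full set, which is the kind of saving that matters for an archive with a very large number of entries.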
*** Bug 370949 has been marked as a duplicate of this bug. ***
0.3.3 contains the final fixes for this. Documentation results require domain-specific knowledge to be handled correctly, and that is best left to yelp or other DocBook apps. In the meantime, beagle-search --search-docs will do a simple search in the documentation index.