GNOME Bugzilla – Bug 712142
Local content does not show up
Last modified: 2014-10-14 14:01:49 UTC
On a freshly installed Fedora 19 or Fedora 20 VM, I downloaded a PDF [1] and put it in ~/Documents. The PDF does not show up in the application. Running tracker-info on the file reveals that the following rdf:types are missing:

  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Document'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#TextDocument'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PaginatedTextDocument'

The PDF extractor is present in the VM:

  /usr/lib64/tracker-0.16/extract-modules/libextract-pdf.so

However, the same file does show up and has the expected rdf:types on my work laptop, which is also running Fedora 20 but is definitely not a "clean" install.

Remote content mined by gnome-online-miners shows up as expected.

[1] http://www.lostca.se/~rishi/International-mobile-subscriber-identity.pdf
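For anyone who wants to repeat the rdf:type check on their own machine, here is a rough sketch. It assumes tracker-info prints properties in the 'rdf:type' = '...' form quoted above; the helper names (missing_rdf_types, check_file) are made up for illustration and are not part of Tracker.

```python
import re
import subprocess

NFO = "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"

# The three types reported missing in this bug.
EXPECTED_TYPES = [
    NFO + "Document",
    NFO + "TextDocument",
    NFO + "PaginatedTextDocument",
]

# Assumption: each type is printed as  'rdf:type' = '<uri>'
RDF_TYPE_RE = re.compile(r"'rdf:type'\s*=\s*'([^']+)'")


def missing_rdf_types(info_output, expected=EXPECTED_TYPES):
    """Return the expected rdf:types that do not appear in the given output."""
    present = set(RDF_TYPE_RE.findall(info_output))
    return [t for t in expected if t not in present]


def check_file(path):
    """Run tracker-info on a file and list the expected types it lacks."""
    result = subprocess.run(
        ["tracker-info", path], capture_output=True, text=True, check=True
    )
    return missing_rdf_types(result.stdout)
```

An empty list from check_file("/path/to/pdf") would mean the extractor filled in all three document types; a non-empty list matches the symptom described here.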
Looks like this is crashing:

  $ /usr/libexec/tracker-extract -f /path/to/pdf

Backtrace:

  (gdb) bt
  Trace 232745
Created attachment 259662
my tracker-extract -f output

Here is what I am seeing. It still crashes for me in the end, but that's probably bad error handling from this broken situation.

I checked, and the pdf extractor module *is* on my disk, in the right place.
Works in GNOME Continuous updated today.
(In reply to comment #2)
> Created an attachment (id=259662) [details]
> my tracker-extract -f output
>
> Here is what I am seeing - it still crashes for me in the end, but thats
> probably bad error handling from this broken situation.
>
> I checked, and the pdf extractor module *is* on my disk, in the right place.

I get this when I run it with 'valgrind --tool=memcheck'.
Turns out that Tracker's self-imposed memory limits are too low on a VM with 1G RAM for this to work. On such a system the limit is set to 512M. So, we should either teach Tracker to be smarter about setting the limits or fix poppler to be more efficient so that we don't need more than 512M to index a 64K PDF. Reassigning to tracker.
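To make the arithmetic concrete, here is a minimal sketch of a "half of physical RAM" address-space cap, which is what the 1G -> 512M figures above suggest. This is not Tracker's actual code: the function names are hypothetical, and the 256M floor is an assumption added only for illustration.

```python
import os
import resource

MIB = 1024 * 1024


def half_of_ram_limit(total_bytes, floor_bytes=256 * MIB):
    """Hypothetical policy: cap at half of RAM, never below an assumed floor."""
    return max(total_bytes // 2, floor_bytes)


def physical_ram_bytes():
    """Total physical memory as reported by sysconf (Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def apply_address_space_limit(limit_bytes):
    """Lower the soft RLIMIT_AS of the current process to limit_bytes."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process cannot raise the soft limit above the hard one.
        limit_bytes = min(limit_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))
    return limit_bytes
```

With this policy, half_of_ram_limit(1024 * MIB) gives 512 * MIB, so any allocation that pushes the extractor's address space past 512M fails on a 1G VM, while the same file indexes fine on a machine with more RAM.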
*** This bug has been marked as a duplicate of bug 737663 ***