Bug 712142 - Local content does not show up
Status: RESOLVED DUPLICATE of bug 737663
Product: tracker
Classification: Core
Component: Extractor
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: tracker-extractor
Depends on:
Blocks:
 
 
Reported: 2013-11-12 10:35 UTC by Debarshi Ray
Modified: 2014-10-14 14:01 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
my tracker-extract -f output (1.97 KB, text/plain)
2013-11-12 13:51 UTC, Matthias Clasen

Description Debarshi Ray 2013-11-12 10:35:24 UTC
On a freshly installed Fedora 19 or Fedora 20 VM, I downloaded a PDF [1] and put it in ~/Documents. The PDF does not show up in the application. Running tracker-info on the file reveals that the following rdf:types are missing:
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Document'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#TextDocument'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PaginatedTextDocument'
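
The check described above can be scripted. This is a minimal sketch, assuming the tracker-info output has been captured as text and that each type appears on a line in the `'rdf:type' = '...'` form quoted above. The helper name `missing_rdf_types` and the sample output are hypothetical and not part of Tracker.

```python
import re

# The three types this report expects to see for an indexed PDF.
EXPECTED_TYPES = {
    "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Document",
    "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#TextDocument",
    "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#PaginatedTextDocument",
}

# Matches lines of the form:  'rdf:type' = 'http://...'
_TYPE_LINE = re.compile(r"'rdf:type'\s*=\s*'([^']+)'")


def missing_rdf_types(tracker_info_output, expected=EXPECTED_TYPES):
    """Return the expected rdf:types that do not occur in the given text."""
    found = set(_TYPE_LINE.findall(tracker_info_output))
    return set(expected) - found


if __name__ == "__main__":
    # Hypothetical captured output: only one of the three types is present.
    sample = (
        "  'rdf:type' = "
        "'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Document'\n"
    )
    for uri in sorted(missing_rdf_types(sample)):
        print("missing:", uri)
```

An empty result means all three types were found; anything else lists the types the extractor failed to add.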

The PDF extractor is present in the VM: /usr/lib64/tracker-0.16/extract-modules/libextract-pdf.so

However, the same file does show up and has the expected rdf:types on my work laptop, which is also running Fedora 20 but is definitely not a "clean" install.

Remote content mined by gnome-online-miners shows up as expected.

[1] http://www.lostca.se/~rishi/International-mobile-subscriber-identity.pdf
Comment 1 Debarshi Ray 2013-11-12 11:25:40 UTC
Looks like this is crashing:
$ /usr/libexec/tracker-extract -f /path/to/pdf

Backtrace:

(gdb) bt
#0  _xdg_mime_magic_read_to_newline at xdgmimemagic.c:184
#1  _xdg_mime_magic_parse_header at xdgmimemagic.c:273
#2  _xdg_mime_magic_read_magic_file at xdgmimemagic.c:772
#3  __gio_xdg_magic_read_from_file at xdgmimemagic.c:814
#4  xdg_mime_init_from_directory at xdgmime.c:191
#5  xdg_run_command_on_dirs at xdgmime.c:288
#6  xdg_mime_init at xdgmime.c:456
#7  _gio_xdg_get_mime_types_from_file_name at xdgmime.c:587
#8  g_content_type_guess at gcontenttype.c:657
#9  get_content_type at glocalfileinfo.c:1242
#10 _g_local_file_info_get at glocalfileinfo.c:1829
#11 g_local_file_query_info at glocalfile.c:1232
#12 extract_task_new at tracker-extract.c:452
#13 tracker_extract_get_metadata_by_cmdline at tracker-extract.c:851
#14 run_standalone at tracker-main.c:304
#15 main at tracker-main.c:383
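
When scripting this reproduction, a crash can be told apart from an ordinary failure by looking at the exit status of the child process. The following is only a sketch: `describe_exit` and `run_extractor` are hypothetical helper names, and the binary path and `-f` option are the ones used in the command above. On POSIX systems, Python's `subprocess` reports a negative return code when the child was terminated by a signal.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable summary."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # A negative value -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


def run_extractor(pdf_path, binary="/usr/libexec/tracker-extract"):
    """Run the extractor on one file, as in the report, and summarize the exit."""
    proc = subprocess.run([binary, "-f", pdf_path])
    return describe_exit(proc.returncode)
```

For example, `run_extractor("/path/to/pdf")` returns a string starting with "killed by" if the extractor was terminated by a signal.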

Comment 2 Matthias Clasen 2013-11-12 13:51:07 UTC
Created attachment 259662
my tracker-extract -f output

Here is what I am seeing - it still crashes for me in the end, but that's probably bad error handling from this broken situation.

I checked, and the pdf extractor module *is* on my disk, in the right place.
Comment 3 Andreas Nilsson 2013-11-12 14:18:17 UTC
Works in GNOME Continuous updated today.
Comment 4 Debarshi Ray 2013-11-12 14:26:30 UTC
(In reply to comment #2)
> Created an attachment (id=259662)
> my tracker-extract -f output
> 
> Here is what I am seeing - it still crashes for me in the end, but that's
> probably bad error handling from this broken situation.
> 
> I checked, and the pdf extractor module *is* on my disk, in the right place.

I get this when I run it with 'valgrind --tool=memcheck'.
Comment 5 Debarshi Ray 2013-11-12 16:15:36 UTC
It turns out that Tracker's self-imposed memory limits are too low for this to work on a VM with 1 GB of RAM. On such a system the limit is set to 512 MB.

So, we should either teach Tracker to be smarter about setting the limits, or fix poppler to be more efficient so that we don't need more than 512 MB to index a 64 KB PDF.
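
To see the effect of such a limit outside of Tracker, a command can be run under an artificial cap. This is a rough sketch under stated assumptions, not Tracker's actual code: it assumes the limit is half of physical RAM (which matches the 1 GB and 512 MB figures above) and that it is enforced as an address-space limit (`RLIMIT_AS`). Both function names are hypothetical.

```python
import resource
import subprocess


def assumed_memory_limit(total_ram_bytes):
    """Assumption: the limit is half of physical RAM (1 GB -> 512 MB)."""
    return total_ram_bytes // 2


def run_with_address_space_limit(cmd, limit_bytes):
    """Run cmd with its address space capped at limit_bytes (POSIX only)."""

    def set_limit():
        # Runs in the child just before exec, so only the child is limited.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(cmd, preexec_fn=set_limit).returncode


if __name__ == "__main__":
    limit = assumed_memory_limit(1024 ** 3)
    print("limit for a 1 GB machine: %d MB" % (limit // (1024 ** 2)))
```

Running the extractor command from comment 1 through `run_with_address_space_limit` with different limits would show at which point allocations start to fail.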

Reassigning to tracker.
Comment 6 Debarshi Ray 2014-10-14 14:01:49 UTC

*** This bug has been marked as a duplicate of bug 737663 ***