GNOME Bugzilla – Bug 738704
Huge spike in CPU and memory usage by tracker extractor due to rogue file
Last modified: 2014-10-19 11:45:53 UTC
The original bug report was opened against tracker 1.0.x here:
https://bugzilla.opensuse.org/show_bug.cgi?id=898323

However, the issue is also confirmed with tracker 1.2.2.

Essentially, the presence of a rogue file (most likely one of the two attached files) causes a huge memory spike in tracker-extractor (run /usr/lib/tracker-extractor to test) until it finally complains "Out of Memory" when run from the terminal. The same happens whenever tracker-extractor is used, e.g., when searching in the shell overview.

In addition, tracker-control -F reports:

---------------------------
Store:
17 Oct 2014, 10:34:56:  ✓  Store  - Idle

Miners:
17 Oct 2014, 10:34:56:  ✓  File System  - Idle
17 Oct 2014, 10:34:56:  ✓  Applications  - Idle
17 Oct 2014, 10:34:56:  ✓  Userguides  - Idle
17 Oct 2014, 10:34:56:  ✗  Extractor  - Not running or is a disabled plugin
---------------------------

Remove these two files from the indexed directories and run tracker-control -s, then run tracker-control -F again and it reports:

---------------------------
Store:
17 Oct 2014, 10:29:08:  ✓  Store  - Idle

Miners:
17 Oct 2014, 10:29:08:  ✓  File System  - Idle
17 Oct 2014, 10:29:08:  ✓  Applications  - Idle
17 Oct 2014, 10:29:08:  ✓  Userguides  - Idle
17 Oct 2014, 10:29:08:  ✓  Extractor  - Idle

Press Ctrl+C to stop
---------------------------

and the memory leak issue also stops completely. Restore the files and the issue returns.
Created attachment 288776
The other suspected file (tarballed because of size constraints) causing tracker to go rogue
The first suspected file causing tracker issues (cannot attach 2 MiB pdf, so posting a link to file instead) https://web-dc1.spideroak.com/share/MJQWI43IMFUDIMBQL5ZWQYLSMUYQ/Bugs/home/badshah/SpiderOak%20Hive/Misc/Bug%20Reports/patt_su3_40.pdf
(In reply to comment #2)
> The first suspected file causing tracker issues (cannot attach 2 MiB pdf, so
> posting a link to file instead)
> https://web-dc1.spideroak.com/share/MJQWI43IMFUDIMBQL5ZWQYLSMUYQ/Bugs/home/badshah/SpiderOak%20Hive/Misc/Bug%20Reports/patt_su3_40.pdf

Please get the offending PDF file from here instead:
https://drive.google.com/file/d/0B-1aTg5_gkVTSG1FMUhKdlRsWjQ/view?usp=sharing

I wish to remove it completely from my system. On opening Documents, evince-thumbnailer now starts trying to thumbnail this file and all hell breaks loose: it eats memory until my system freezes completely (it clocked a whopping 1.8 GiB before I killed it).

I don't know what is wrong with this file. It is admittedly a very dense plot, but acroread opens it fine, whereas opening it in evince/Documents triggers this behaviour. Tracker apparently runs into the same problem when its extractor processes this file.
I can confirm this bug, but it's not a Tracker bug as far as I can see.

We call:

  text = poppler_page_get_text (page);

and we run out of memory, and it does take an age to come back from that API call.

Reassigning to poppler.
https://bugs.freedesktop.org/show_bug.cgi?id=85196
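For anyone who wants to reproduce this without freezing their machine, here is a minimal sketch, assuming a POSIX system, of one way to contain a call that may exhaust memory, such as the poppler_page_get_text() call above: run it in a forked child whose address space and CPU time are capped. This is not how Tracker or Poppler handle the problem. The names run_capped, cap_limit and alloc_and_touch are hypothetical, and alloc_and_touch is only a stand-in for the expensive call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type: returns 0 on success, non-zero on failure. */
typedef int (*risky_fn)(void *user_data);

/* Lower the soft limit of `resource` to `value`, clamped to the hard limit. */
static int cap_limit(int resource, rlim_t value)
{
    struct rlimit lim;
    if (getrlimit(resource, &lim) != 0)
        return -1;
    if (lim.rlim_max != RLIM_INFINITY && value > lim.rlim_max)
        value = lim.rlim_max;
    lim.rlim_cur = value;
    return setrlimit(resource, &lim);
}

/*
 * Run fn(user_data) in a forked child with capped address space and CPU time.
 * Returns 0 if the child exited with status 0, and -1 otherwise (callback
 * failure, death by signal such as SIGXCPU or SIGSEGV, or fork/wait error).
 */
int run_capped(risky_fn fn, void *user_data, rlim_t max_bytes, rlim_t max_cpu_seconds)
{
    assert(fn != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: apply the caps, then run the risky call. */
        if (cap_limit(RLIMIT_AS, max_bytes) != 0 ||
            cap_limit(RLIMIT_CPU, max_cpu_seconds) != 0)
            _exit(2);
        _exit(fn(user_data) == 0 ? 0 : 1);
    }

    /* Parent: wait for the child and report whether it finished cleanly. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Stand-in for the expensive call: allocate and touch *user_data bytes. */
int alloc_and_touch(void *user_data)
{
    size_t n = *(size_t *)user_data;
    if (n == 0)
        return 0;
    char *buf = malloc(n);
    if (buf == NULL)
        return 1;                      /* allocation refused under the cap */
    memset(buf, 0x5A, n);
    volatile char *probe = buf;        /* volatile read keeps the buffer live */
    int ok = (probe[n - 1] == 0x5A);
    free(buf);
    return ok ? 0 : 1;
}
```

With a cap of, say, 256 MiB, a small allocation in the child should go through while a 1 GiB one should be refused, so the parent sees -1 instead of the whole system running out of memory. A real caller would replace alloc_and_touch with the extraction call and would need a pipe or similar channel to get the extracted text back from the child, which this sketch leaves out.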
Thanks for taking the time to report this bug. However, this application does not track its bugs in the GNOME Bugzilla.