GNOME Bugzilla – Bug 635847
Tracker journal is constantly growing and its replay blocks tracker-store, which blocks other programs that depend on tracker
Last modified: 2012-09-26 00:04:05 UTC
After start, tracker-store occupies 100% of the CPU, slowing down the entire system for a long time (at least 10 minutes). As startup time tends to be considered more and more important (as shown by distributions' efforts to optimize startup scripts down to seconds), it seems to be a serious problem when the desktop is not usable during the first 15 minutes after start.
I'm not sure if it is related, but the miner seems to start right after. Tracker probably should:
1. Run with low CPU (nice) and IO priority [miner-fs is started with nice 19]
2. Limit its CPU usage if the user is active (screen is not locked)
3. Be started with the BATCH scheduler (some time ago this was rejected as it offers no improvement for tracker, but IMHO it is much more important to have an interactive desktop than to have quick indexing)
4. Not block other applications that use tracker, such as evolution (it may be only my perception, but evolution starts with a long delay after installing tracker)
I understand that the developers work on tracker in their free time, but this issue seems to outweigh any tracker benefit (time spent on starting tracker >>> time I would spend looking for files without tracker).
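For illustration, suggestions 1 and 3 above roughly correspond to the following at the process level. This is a minimal Python sketch under stated assumptions, not Tracker's actual code: the function name `lower_own_priority` is hypothetical, the calls are the standard-library wrappers around `nice(2)` and `sched_setscheduler(2)`, and `SCHED_BATCH` is assumed to be available only on Linux. IO priority is left out because the standard library has no wrapper for it.

```python
import os


def lower_own_priority():
    """Make the calling process a background task (hypothetical helper)."""
    # Raise niceness as far as the OS allows; os.nice returns the new value.
    niceness = os.nice(19)

    batch = False
    # SCHED_BATCH is only exposed on Linux, so skip it quietly elsewhere.
    if hasattr(os, "SCHED_BATCH"):
        try:
            # pid 0 means "this process"; SCHED_BATCH requires priority 0.
            os.sched_setscheduler(0, os.SCHED_BATCH, os.sched_param(0))
            batch = True
        except OSError:
            # For example, forbidden by a sandbox; keep the default policy.
            pass
    return {"nice": niceness, "batch": batch}
```

A daemon would call something like this once at startup, before doing any indexing work. Whether this helps interactivity in practice is exactly what the comment above is asking about; the sketch only shows what the request means mechanically.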
Sorry for the triple posting. To give exact numbers: for evolution, it took 9 minutes from the start of the process to showing the window. With 33 minutes of uptime, tracker is still replaying the journal (OK, it was a start after a hard reset, but still). It was at 7% of journal 30/33 (I have had tracker enabled for a week).
Something looks wrong in the journal replaying... 30 minutes of replay is way too much, and that is probably a bug. How big is the journal (in .local/share/tracker/data/)? Tracker consists of two processes: the store (keeping the data) and the miner-fs (providing data from the filesystem). All the suggestions in Comment #1 are mostly valid for the miner-fs. The store still needs high priority. Luckily it will mostly solve queries, not disturbing the overall system.
(In reply to comment #3)
> Something looks wrong in the journal replaying... 30 minutes of replay is way
> too much, and that is probably a bug. How big is the journal (in
> .local/share/tracker/data/ ).
>

% du -sh ~/.local/share/tracker/data/
386M /home/mpiechotka/.local/share/tracker/data/

And it seems it is still growing (yesterday there were only 24 journals).
(In reply to comment #4)
> (In reply to comment #3)
> > Something looks wrong in the journal replaying... 30 minutes of replay is way
> > too much, and that is probably a bug. How big is the journal (in
> > .local/share/tracker/data/ ).
> >
>
> % du -sh ~/.local/share/tracker/data/
> 386M /home/mpiechotka/.local/share/tracker/data/
>
> And it seems it is still growing (yesterday there was only 24 journals).

48 journals:

% du -sh .local/share/tracker/data
462M .local/share/tracker/data

I cannot start evolution, and tracker-status often displays errors while querying the store.

PS. Currently, after 2 hours, it is doing 33/48.
> I cannot start evolution and tracker-status often displays errors while quering
> store.
>

What I meant is that evolution takes a long time to start, and today it hung after start (gdb shows many threads waiting in send_sparql_update).
I believe this is more up-to-date information about the problem. Currently there is over 0.5 GiB of journal data:

% du -sh .local/share/tracker/data/
544M .local/share/tracker/data

If there is any information I can provide, please ask (I'm afraid I know too little about tracker internals to be of any other assistance).

In the log there are entries like:

23 Dec 2010, 03:11:32: Tracker-Warning **: Journal replay error: 'Unable to insert multiple values for subject `(null)' and single valued property `nfo:belongsToContainer' (old_value: '103045', new value: '103211')

and sometimes:

23 Dec 2010, 03:11:32: Tracker-Warning **: Journal replay error: 'Unable to insert multiple values for subject `(null)' and single valued property `nie:url' (old_value: 'file:///some/path', new value: 'file:///some/other/path')'

and from time to time:

23 Dec 2010, 03:16:57: Tracker-Warning **: Journal replay error: 'constraint failed'
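To see which replay errors dominate, the warnings quoted above can be tallied per property. The following is a minimal Python sketch that assumes the log lines look exactly like the three samples in this comment; the function name `tally_replay_errors` and the bucket label "other" are hypothetical, and the regular expressions are derived only from those samples, not from Tracker's source.

```python
import re
from collections import Counter

# Any line carrying a journal replay warning, in the format quoted above.
REPLAY_ERROR = re.compile(r"Journal replay error: '(?P<msg>.*)")
# The property name, when the message mentions one (assumed format).
PROPERTY = re.compile(r"single valued property `(?P<prop>[^']+)'")


def tally_replay_errors(lines):
    """Count replay errors per property; messages without one go to 'other'."""
    counts = Counter()
    for line in lines:
        match = REPLAY_ERROR.search(line)
        if not match:
            continue  # not a replay error line
        prop = PROPERTY.search(match.group("msg"))
        counts[prop.group("prop") if prop else "other"] += 1
    return counts
```

Feeding it the lines of the tracker-store log (for example `tally_replay_errors(open(path))`, with `path` pointing at wherever the log is written on the system in question) gives a per-property count that could be attached to the bug.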
Seems to be fixed in 0.9.33. However, I'm still left with 0.8 GiB of data:

% du -sh .local/share/tracker/data/
813M .local/share/tracker/data/
(In reply to comment #8)
> Seems to be fixed in 0.9.33. However I'm still left with 0.8 GiB of data:
>
>
> % du -sh .local/share/tracker/data/
> 813M .local/share/tracker/data/

Does that mean you are happy for us to mark this as fixed, or what action are you expecting?
(In reply to comment #9)
> (In reply to comment #8)
> > Seems to be fixed in 0.9.33. However I'm still left with 0.8 GiB of data:
> >
> >
> > % du -sh .local/share/tracker/data/
> > 813M .local/share/tracker/data/
>
> By that are you happy for us to mark this as fixed or what action are you
> expecting?

It seems not to be fixed - replaying is much faster, but the tracker journal is still growing:

% du -sh .local/share/tracker/data/
2.1G .local/share/tracker/data/

So in the end the bug is still present.
(In reply to comment #10)
> (In reply to comment #9)
> > (In reply to comment #8)
> > > Seems to be fixed in 0.9.33. However I'm still left with 0.8 GiB of data:
> > >
> > >
> > > % du -sh .local/share/tracker/data/
> > > 813M .local/share/tracker/data/
> >
> > By that are you happy for us to mark this as fixed or what action are you
> > expecting?
>
> It seems to not be fixed - replaying is much faster but the tracker journal is
> still growing:
>
> % du -sh .local/share/tracker/data/
> 2.1G .local/share/tracker/data/
>
> So at the end bug is still present.

Sorry for double commenting - the journal is larger than the database itself:

% du -sh .cache/tracker/
1.1G .cache/tracker

I'm attempting to rebuild the database from scratch (and will give the exact time it takes).
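The journal-versus-database comparison above can be repeated over time with a small script. This is a minimal Python sketch, assuming the journal lives in ~/.local/share/tracker/data and the database in ~/.cache/tracker as in the du output in this thread; the helper names `dir_size_bytes` and `journal_to_db_ratio` are hypothetical. It sums apparent file sizes, so the numbers can differ slightly from `du -sh`, which reports disk usage.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def journal_to_db_ratio(journal_dir, db_dir):
    """Return journal size divided by database size (inf if the db is empty)."""
    db = dir_size_bytes(db_dir)
    return dir_size_bytes(journal_dir) / db if db else float("inf")
```

For example, `journal_to_db_ratio(os.path.expanduser("~/.local/share/tracker/data"), os.path.expanduser("~/.cache/tracker"))` returning a value above 1 would match the situation reported here, where the journal is larger than the database.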
The journal is currently used for all statements which may include text parts extracted from files. If you're indexing many text documents, the journal can grow quite big, although it should always be smaller than the SQLite database after first-time indexing. I'm working on a branch that won't store extracted text parts (and possibly some more extracted information) in the journal and thus should reduce the journal size a lot in certain environments. Maybe it would make sense if you could retest when that branch is merged.
(In reply to comment #12)
> The journal is currently used for all statements which may include text parts
> extracted from files. If you're indexing many text documents, the journal can
> grow quite big, although it should always be smaller than the SQLite database
> after first-time indexing.
>
> I'm working on a branch that won't store extracted text parts (and possibly
> some more extracted information) in the journal and thus should reduce the
> journal size a lot in certain environments. Maybe it would make sense if you
> could retest when that branch is merged.

I will test after the merge (for now I will keep the bug open to remember to do it). However, it grew by 1.2 GB (slightly more than the database) over 2 weeks, so I don't think it qualifies as "always be smaller than the SQLite database after first-time indexing".
I've just merged it to master. Let me know if this helps.
(In reply to comment #14)
> I've just merged it to master. Let me know if this helps.

So it is in .37 & .38? I have been checking it for some time, but please give me a few more days.
Reopening. My tracker now has nearly 800 MiB and the replay has taken 9h and counting.
(In reply to comment #16)
> repopening. I have now tracker that have nearly 800 MiB and the replay now
> takes 9h and counting.

It took about 18h. The problem seems to be correlated with some types of hard restart, for example after an unsuccessful resume.
It seems to work in 0.14.