Bug 635847 – Tracker journal is constantly growing and its replaying block tracker-store which blocks other programs depending on tracker




Status:	RESOLVED FIXED

Product:	tracker
Classification:	Core
Component:	Store
Version:	0.10.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	tracker-general
QA Contact:	Jamie McCracken

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2010-11-26 13:09 UTC by Maciej (Matthew) Piechotka
Modified:	2012-09-26 00:04 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Maciej (Matthew) Piechotka 2010-11-26 13:09:55 UTC

After it starts, tracker-store occupies 100% of the CPU, slowing down the entire system for a long time (at least 10 minutes). Since start-up time tends to be considered more and more important (as shown by distributions' efforts to optimize startup scripts down to seconds), it seems to be a serious problem when the desktop is not usable during the first 15 minutes after start.

Comment 1 Maciej (Matthew) Piechotka 2010-11-26 13:23:51 UTC

I'm not sure if it is related, but the miner seems to start right after. Tracker should probably:

1. Run with low CPU (nice) and IO priority [miner-fs is started with nice 19]
2. Limit its CPU usage while the user is active (screen is not locked)
3. Be started with the BATCH scheduler (some time ago this was rejected as offering no improvement for tracker, but IMHO it is much more important to have an interactive desktop than to have quick indexing)
4. Not block other applications that use tracker, such as evolution (it may be only my perception, but evolution starts with a long delay after installing tracker)
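
For illustration only, points 1 and 3 above could be sketched in Python as follows. This is not how tracker launches its processes; the helper names `lower_priority` and `run_low_priority` are hypothetical, the SCHED_BATCH call is Linux-only and treated as best effort, and IO priority is left out because the Python standard library has no wrapper for it.

```python
import os
import subprocess
import sys


def lower_priority():
    """Runs in the child process just before exec."""
    # Raise niceness by 19; the kernel caps the result at 19 (lowest CPU priority).
    os.nice(19)
    # SCHED_BATCH is Linux-only, so treat it as best effort.
    if hasattr(os, "sched_setscheduler") and hasattr(os, "SCHED_BATCH"):
        try:
            os.sched_setscheduler(0, os.SCHED_BATCH, os.sched_param(0))
        except OSError:
            pass


def run_low_priority(cmd):
    """Run cmd (a list of strings) with reduced CPU priority and return the result."""
    return subprocess.run(cmd, preexec_fn=lower_priority,
                          capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Hypothetical usage: print the niceness the child actually runs with.
    result = run_low_priority(
        [sys.executable, "-c",
         "import os; print(os.getpriority(os.PRIO_PROCESS, 0))"])
    print(result.stdout.strip())
```

The priority change is applied in the child only, so the launching process keeps its own scheduling settings.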

I understand that the developers work on tracker in their free time, but this issue seems to outweigh any benefit of tracker (time spent on starting tracker >>> time I would spend looking for files without tracker).

Comment 2 Maciej (Matthew) Piechotka 2010-11-26 13:33:09 UTC

Sorry for the triple posting. To give exact numbers: evolution took 9 minutes from the start of the process to showing its window. With 33 minutes of uptime, tracker is still replaying the journal (OK, it was a start after a hard reset, but still). It was at 7% of journal 30 of 33 (I have had tracker enabled for a week).

Comment 3 Ivan Frade 2010-11-26 14:23:02 UTC

Something looks wrong in the journal replay... 30 minutes of replay is way too much, and that is probably a bug. How big is the journal (in .local/share/tracker/data/)?

Tracker consists of two processes: the store (keeping the data) and the miner-fs (providing data from the filesystem). The suggestions in comment #1 are mostly valid for the miner-fs. The store still needs high priority. Luckily, it will mostly be answering queries, which should not disturb the overall system.

Comment 4 Maciej (Matthew) Piechotka 2010-11-26 16:42:16 UTC

(In reply to comment #3)
> Something looks wrong in the journal replaying... 30 minutes of replay is way
> too much, and that is probably a bug. How big is the journal (in
> .local/share/tracker/data/ ).
> 

% du -sh ~/.local/share/tracker/data/
386M	/home/mpiechotka/.local/share/tracker/data/

And it seems it is still growing (yesterday there were only 24 journals).
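
To track both numbers mentioned here (total size and number of files) in one step, a minimal Python sketch could look like the following. The helper name `dir_usage` is hypothetical, and the sketch sums apparent file sizes, so its total can differ somewhat from what `du` reports in disk blocks.

```python
import os


def dir_usage(path):
    """Return (file_count, total_bytes) for all regular files under path."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so that nothing outside the tree is counted.
            if os.path.isfile(full) and not os.path.islink(full):
                count += 1
                total += os.path.getsize(full)
    return count, total


if __name__ == "__main__":
    # Directory taken from the du command in this comment.
    data_dir = os.path.expanduser("~/.local/share/tracker/data")
    n, size = dir_usage(data_dir)
    print(f"{n} files, {size / 2**20:.0f} MiB in {data_dir}")
```

A missing directory simply yields zero files and zero bytes.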

Comment 5 Maciej (Matthew) Piechotka 2010-12-13 15:54:21 UTC

(In reply to comment #4)
> % du -sh ~/.local/share/tracker/data/
> 386M    /home/mpiechotka/.local/share/tracker/data/
> 
> And it seems it is still growing (yesterday there was only  24 journals).

48 journals:

% du -sh .local/share/tracker/data   
462M	.local/share/tracker/data

I cannot start evolution, and tracker-status often displays errors while querying the store.

PS. Currently, after 2 hours, it is at 33/48.

Comment 6 Maciej (Matthew) Piechotka 2010-12-13 15:56:21 UTC

> I cannot start evolution and tracker-status often displays errors while quering
> store.
> 

What I meant is that evolution takes a long time to start, and today it hangs after start (gdb shows many threads waiting in send_sparql_update).

Comment 7 Maciej (Matthew) Piechotka 2010-12-23 02:23:04 UTC

I believe this is more up-to-date information about the problem. Currently there is over 0.5 GiB of journal data:

% du -sh .local/share/tracker/data/
544M	.local/share/tracker/data

If there is any information I can provide, please ask (I'm afraid I know too little about tracker internals to be of any other assistance).

In the log there are entries like:

23 Dec 2010, 03:11:32: Tracker-Warning **: Journal replay error: 'Unable to insert multiple values for subject `(null)' and single valued property `nfo:belongsToContainer' (old_value: '103045', new value: '103211')

and sometimes:

23 Dec 2010, 03:11:32: Tracker-Warning **: Journal replay error: 'Unable to insert multiple values for subject `(null)' and single valued property `nie:url' (old_value: 'file:///some/path', new value: 'file:///some/other/path')'

and from time to time:

23 Dec 2010, 03:16:57: Tracker-Warning **: Journal replay error: 'constraint failed'
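
To see which of these warnings dominates, the log lines can be tallied by kind. The following Python sketch is only an illustration: the function name `tally_replay_errors` is hypothetical, the patterns are derived solely from the three sample lines above, and the caller supplies the lines because the log file location is not given here.

```python
import re
from collections import Counter

# Patterns based only on the sample log lines quoted in this comment.
REPLAY_ERROR = re.compile(r"Journal replay error: '(.*)")
SINGLE_VALUED = re.compile(r"single valued property `([^']+)'")


def tally_replay_errors(lines):
    """Count journal replay errors, grouped by property or by message."""
    counts = Counter()
    for line in lines:
        match = REPLAY_ERROR.search(line)
        if not match:
            continue  # not a journal replay warning
        message = match.group(1)
        prop = SINGLE_VALUED.search(message)
        if prop:
            # Group "multiple values" errors by the affected property.
            counts["multiple values for " + prop.group(1)] += 1
        else:
            counts[message.rstrip("'")] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Hypothetical usage: pipe the tracker log into this script.
    for kind, n in tally_replay_errors(sys.stdin).most_common():
        print(f"{n:8d}  {kind}")
```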

Comment 8 Maciej (Matthew) Piechotka 2011-01-14 02:44:09 UTC

Seems to be fixed in 0.9.33. However I'm still left with 0.8 GiB of data:


% du -sh .local/share/tracker/data/
813M	.local/share/tracker/data/

Comment 9 Martyn Russell 2011-01-31 14:28:59 UTC

(In reply to comment #8)
> Seems to be fixed in 0.9.33. However I'm still left with 0.8 GiB of data:
> 
> 
> % du -sh .local/share/tracker/data/
> 813M    .local/share/tracker/data/

Does that mean you are happy for us to mark this as fixed, or what action are you expecting?

Comment 10 Maciej (Matthew) Piechotka 2011-02-02 10:11:22 UTC

(In reply to comment #9)
> By that are you happy for us to mark this as fixed or what action are you
> expecting?

It seems not to be fixed: replaying is much faster, but the tracker journal is still growing:

% du -sh .local/share/tracker/data/
2.1G	.local/share/tracker/data/

So in the end the bug is still present.

Comment 11 Maciej (Matthew) Piechotka 2011-02-02 10:14:35 UTC

(In reply to comment #10)
> It seems to not be fixed - replaying is much faster but the tracker journal is
> still growing:
> 
> % du -sh .local/share/tracker/data/
> 2.1G    .local/share/tracker/data/
> 
> So at the end bug is still present.

Sorry for the double comment. The journal is larger than the database itself:

% du -sh .cache/tracker/
1.1G	.cache/tracker

I'm attempting to rebuild the database from scratch (and will report the exact time it takes).

Comment 12 Jürg Billeter 2011-02-02 10:28:38 UTC

The journal is currently used for all statements which may include text parts extracted from files. If you're indexing many text documents, the journal can grow quite big, although it should always be smaller than the SQLite database after first-time indexing.

I'm working on a branch that won't store extracted text parts (and possibly some more extracted information) in the journal and thus should reduce the journal size a lot in certain environments. Maybe it would make sense if you could retest when that branch is merged.

Comment 13 Maciej (Matthew) Piechotka 2011-02-02 10:38:56 UTC

(In reply to comment #12)
> The journal is currently used for all statements which may include text parts
> extracted from files. If you're indexing many text documents, the journal can
> grow quite big, although it should always be smaller than the SQLite database
> after first-time indexing.
> 
> I'm working on a branch that won't store extracted text parts (and possibly
> some more extracted information) in the journal and thus should reduce the
> journal size a lot in certain environments. Maybe it would make sense if you
> could retest when that branch is merged.

I will test after the merge (until then I will keep the bug open as a reminder). However, it grew by 1.2 GB (slightly more than the database) over 2 weeks, so I don't think it qualifies as "always be smaller than the SQLite database after first-time indexing".
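
To put the growth in perspective, the `du` figures posted in this thread can be turned into an average growth rate per day. The Python sketch below uses only the sizes and comment dates from this report; it assumes that `du -sh` units are MiB/GiB and that the whole data directory is a fair proxy for the journal. The name `growth_rates` is hypothetical.

```python
from datetime import date

# (comment date, size of .local/share/tracker/data in MiB) from this thread.
MEASUREMENTS = [
    (date(2010, 11, 26), 386.0),
    (date(2010, 12, 13), 462.0),
    (date(2010, 12, 23), 544.0),
    (date(2011, 1, 14), 813.0),
    (date(2011, 2, 2), 2.1 * 1024),  # reported as 2.1G
]


def growth_rates(measurements):
    """Return (start, end, MiB per day) for each pair of successive measurements."""
    rates = []
    for (d0, s0), (d1, s1) in zip(measurements, measurements[1:]):
        days = (d1 - d0).days
        rates.append((d0, d1, (s1 - s0) / days))
    return rates


if __name__ == "__main__":
    for start, end, rate in growth_rates(MEASUREMENTS):
        print(f"{start} to {end}: {rate:.1f} MiB/day")
```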

Comment 14 Jürg Billeter 2011-02-03 16:36:50 UTC

I've just merged it to master. Let me know if this helps.

Comment 15 Maciej (Matthew) Piechotka 2011-02-16 09:51:43 UTC

(In reply to comment #14)
> I've just merged it to master. Let me know if this helps.

So it is in .37 & .38? I have been checking it for some time, but please give me a few more days.

Comment 16 Maciej (Matthew) Piechotka 2011-04-30 07:24:05 UTC

Reopening. My tracker data is now nearly 800 MiB, and the replay has taken 9h so far and counting.

Comment 17 Maciej (Matthew) Piechotka 2011-04-30 18:40:55 UTC

(In reply to comment #16)
> repopening. I have now tracker that have nearly 800 MiB and the replay now
> takes 9h and counting.

It took about 18h. The problem seems to be correlated with some types of hard restart, for example after an unsuccessful resume.

Comment 18 Maciej (Matthew) Piechotka 2012-09-26 00:04:05 UTC

It seems to work in 0.14.