GNOME Bugzilla – Bug 659422
please set SCHED_IDLE+IOPRIO_CLASS_IDLE for the miners on Linux
Last modified: 2012-01-04 14:02:52 UTC
Please change the CPU and IO scheduling parameters of the tracker miners (especially the fs one) so that they only minimally impact the behaviour of the system.

First, enable SCHED_IDLE via sched_setscheduler(), which will ensure the miners only get CPU when nobody else wants it.

Second, enable IOPRIO_CLASS_IDLE via ioprio_set(), which will ensure the miners only get IO when nobody else wants it.

For the latter you need to define the syscall manually, since glibc doesn't cover it yet. Use something like this: http://cgit.freedesktop.org/systemd/tree/src/ioprio.h

Thanks.
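For illustration, the two calls requested above might look roughly like the sketch below. It is not the code from the linked systemd header or from Tracker; the helper names (`ioprio_value`, `set_cpu_idle`, `set_io_idle`) are made up for this example, and the ioprio constants are copied by hand on the assumption that no libc header provides them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes SCHED_IDLE in <sched.h> on glibc */
#endif
#include <errno.h>
#include <sched.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SCHED_IDLE
#define SCHED_IDLE 5           /* Linux value, in case the headers hide it */
#endif

/* Constants mirrored from the kernel's ioprio interface (no glibc wrapper). */
enum { IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE };
enum { IOPRIO_WHO_PROCESS = 1, IOPRIO_WHO_PGRP, IOPRIO_WHO_USER };
#define IOPRIO_CLASS_SHIFT 13

/* Pack an I/O class and a per-class priority level into one ioprio value. */
static int ioprio_value(int io_class, int data)
{
    return (io_class << IOPRIO_CLASS_SHIFT) | data;
}

/* Ask for CPU time only when nothing else wants it.
 * Returns 0 on success or a negative errno value. */
static int set_cpu_idle(void)
{
    struct sched_param param = { .sched_priority = 0 };  /* must be 0 here */

    return sched_setscheduler(0, SCHED_IDLE, &param) == 0 ? 0 : -errno;
}

/* Ask for disk time only when nothing else wants it, via the raw syscall.
 * Returns 0 on success or a negative errno value. */
static int set_io_idle(void)
{
    long rc = syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0,
                      ioprio_value(IOPRIO_CLASS_IDLE, 7));

    return rc == 0 ? 0 : -errno;
}
```

A miner would call both helpers once at start-up, before it begins crawling, and log rather than abort on failure, since either call can be refused by the kernel or a sandbox.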
Hi Lennart,

(In reply to comment #0)
> Please change the CPU and IO scheduling parameters of the tracker miners (especially the fs one) so that they only minimally impact behaviour of the system.

I am in two minds about this. We used to have SCHED_IDLE and ioprio_set() a while back and they were removed because the performance of tracker was incredibly bad. See commits:

dedaa1d5acfafefa011b5d421a5c3aa5e3895793
ccd2e0dd1506636f2c8ed6169cf56cd726a7b0aa
51de9e6fc346994b615593dd0bcbd738b53a706c

I am thinking about making this situational and trying it again. What I had in mind was to use the same approach the screensaver uses to detect that the user is away from the computer: when the user is away, we disable SCHED_IDLE, and we enable it again when they're back.

I also suspect this is an issue mainly for first-time indexing, or are you seeing this high disk usage even for subsequent checks/starts?

Also, it highly depends on your hardware AND the content you have to index. With reasonably fast hardware and not too much content this solution is completely unneeded (at least here).

> First, enable SCHED_IDLE via sched_setscheduler() which will ensure the miners only get CPU when nobody else wants it.
>
> Second, enable IOPRIO_CLASS_IDLE via ioprio_set() which will ensure the miners only get IO when nobody else wants it.
>
> For the latter you need to manually define the syscall, since glibc still doesn't cover it yet. Use something like this:
>
> http://cgit.freedesktop.org/systemd/tree/src/ioprio.h

I will try to come up with a patch. I also want to tie this in with the existing configuration somehow. Currently there is a 'throttle' in the preferences, which is a simple mechanism, but I think we should consider that here too.

> Thanks.

Thanks for the bug report ;)
To be clear, we already set IOPRIO_CLASS_IDLE for tracker-miner-fs and tracker-extract. And while we do not set SCHED_IDLE, we do use nice(19). Do you have reason to believe that SCHED_IDLE would help in practice? I suspect the remaining impact on the system performance is still mainly I/O related.
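To make the distinction discussed here concrete: nice(19) changes only the nice value, while the scheduling policy stays whatever it was (normally SCHED_OTHER). The sketch below, with hypothetical helper names, lowers the priority the way this comment describes and then reports which policy the process is actually running under. It is an illustration, not Tracker's code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sched.h>
#include <sys/resource.h>
#include <unistd.h>

#ifndef SCHED_BATCH
#define SCHED_BATCH 3          /* Linux values, in case the headers hide them */
#endif
#ifndef SCHED_IDLE
#define SCHED_IDLE 5
#endif

/* Human-readable name for a Linux scheduling policy. */
static const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    case SCHED_BATCH: return "SCHED_BATCH";
    case SCHED_IDLE:  return "SCHED_IDLE";
    default:          return "unknown";
    }
}

/* Raise the nice value by 19 (the kernel clamps it at 19), store the
 * resulting nice value in *nice_out, and return the current scheduling
 * policy, or a negative errno value on failure. */
static int renice_and_get_policy(int *nice_out)
{
    int policy;

    errno = 0;
    if (nice(19) == -1 && errno != 0)   /* -1 can be a legitimate result */
        return -errno;

    errno = 0;
    *nice_out = getpriority(PRIO_PROCESS, 0);
    if (*nice_out == -1 && errno != 0)
        return -errno;

    policy = sched_getscheduler(0);
    return policy >= 0 ? policy : -errno;
}
```

On a process started with default settings, passing the returned policy to `policy_name()` would be expected to give "SCHED_OTHER" even after the nice call, which is the gap the reporter is pointing at.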
(In reply to comment #2)
> To be clear, we already set IOPRIO_CLASS_IDLE for tracker-miner-fs and tracker-extract. And while we do not set SCHED_IDLE, we do use nice(19). Do you have reason to believe that SCHED_IDLE would help in practice? I suspect the remaining impact on the system performance is still mainly I/O related.

Thanks for clearing that up, Jürg. Re-reading my comment, I realise it sounds like we use neither suggestion these days.

The related code, if you're interested Lennart, is here: http://git.gnome.org/browse/tracker/tree/src/libtracker-common/tracker-ioprio.c
(In reply to comment #2)
> To be clear, we already set IOPRIO_CLASS_IDLE for tracker-miner-fs and tracker-extract. And while we do not set SCHED_IDLE, we do use nice(19). Do you have reason to believe that SCHED_IDLE would help in practice? I suspect the remaining impact on the system performance is still mainly I/O related.

I don't think the question should be whether this helps in practice, but rather whether this is the correct way. And yes, I believe that when we want to tell the system that tracker should only get CPU/IO when nobody else wants it, then we should tell it exactly that, which SCHED_IDLE/IOPRIO_CLASS_IDLE do, and nice() does not. And even if the effect of these options is not noticeable in many workloads, it's still the right thing to do.
(In reply to comment #3) > http://git.gnome.org/browse/tracker/tree/src/libtracker-common/tracker-ioprio.c You can drop the BE stuff. That's the default anyway.
(In reply to comment #1)
> Hi Lennart,
>
> (In reply to comment #0)
> > Please change the CPU and IO scheduling parameters of the tracker miners (especially the fs one) so that they only minimally impact behaviour of the system.
>
> I am in two minds about this. We used to have SCHED_IDLE and ioprio_set() a while back and it was removed because the performance of tracker was incredibly bad.

First of all, this is not about making tracker faster. This is about having tracker not have an impact on system behaviour otherwise. Secondly, if you think that things got worse than one could reasonably expect by enabling SCHED_IDLE, then file a bug and get it fixed in the kernel.

> I am thinking about making this situational and trying it again. What I had in mind was to use the same approach the screensaver users for being away from the computer and when the user is away, we disable the SCHED_IDLE and enable it when they're back.

I am not sure you can do that. I suspect going to SCHED_IDLE is a one-way street privilege-wise.

> I also suspect this is an issue mainly for first time indexing or are you seeing this high disk usage even for subsequent checks/starts?

Well, I am seeing it all the time, because tracker apparently needs hours and hours to index my $HOME (which isn't even that big...). It has been running since yesterday continuously.

> Also, it highly depends on your hardware AND the content you have to index. With reasonably fast hardware and not too much content this solution is completely unneeded (at least here).

That is not the point. The point is that you should tell the kernel how you want your workload to be scheduled, to minimize the negative effects.
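Whether SCHED_IDLE really is a one-way street can be checked empirically. As far as I read the sched(7) manual page, kernels before 2.6.39 did not let an unprivileged thread leave SCHED_IDLE, while later kernels allow a switch back to SCHED_OTHER or SCHED_BATCH as long as the nice value is within RLIMIT_NICE; treat that as an assumption to verify on the target kernel. The hypothetical helper below enters SCHED_IDLE and immediately tries to leave it again.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sched.h>

#ifndef SCHED_IDLE
#define SCHED_IDLE 5           /* Linux value, in case the headers hide it */
#endif

/* Enter SCHED_IDLE, then try to return to SCHED_OTHER.
 *
 * Returns a negative errno value if SCHED_IDLE could not be entered at all.
 * Otherwise returns 0 and stores in *leave_errno either 0 (switching back
 * worked) or the errno from the failed attempt, e.g. EPERM. */
static int try_idle_round_trip(int *leave_errno)
{
    struct sched_param param = { .sched_priority = 0 };

    *leave_errno = 0;

    if (sched_setscheduler(0, SCHED_IDLE, &param) != 0)
        return -errno;

    if (sched_setscheduler(0, SCHED_OTHER, &param) != 0)
        *leave_errno = errno;

    return 0;
}
```

If `*leave_errno` comes back as EPERM on a given system, toggling the policy when the user goes away and returns would need a privileged helper or a different design.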
(In reply to comment #6) > Well, I am seeing it all the time, because tracker apparently needs hours and > hours to index my $HOME (which isn't even that big...). It has been running > since yesterday continuously. https://bugzilla.gnome.org/show_bug.cgi?id=659479
(In reply to comment #4)
> (In reply to comment #2)
> > To be clear, we already set IOPRIO_CLASS_IDLE for tracker-miner-fs and tracker-extract. And while we do not set SCHED_IDLE, we do use nice(19). Do you have reason to believe that SCHED_IDLE would help in practice? I suspect the remaining impact on the system performance is still mainly I/O related.
>
> I don't think the question should be whether this helps in practice, but rather whether this is the correct way. And yes, I believe that when we want to tell the system that tracker should only get CPU/IO when nobody else wants it, then we should tell it exactly that, which SCHED_IDLE/IOPRIO_CLASS_IDLE do, and nice() does not. And even if the effect of these options is not noticeable in many workloads, it's still the right thing to do.

I agree that we should use the correct priority no matter whether the effect is noticeable or not. The issue is that it's not completely clear what the correct priority is. SCHED_IDLE/IOPRIO_CLASS_IDLE definitely sounds correct for first-time indexing in the background.

However, tracker-miner-fs also reacts to user actions, either via inotify or by the application explicitly calling IndexFile via D-Bus. Some applications may expect a timely update of the file metadata, which might break with idle priorities. This also applies to tracker-store, where reasonable update times are important to various applications.

Any suggestions on how we could deal with this, given that we cannot easily switch priorities all the time?

As a long-term solution I'd like to see more radical changes to avoid background indexing completely, as briefly described in https://bugzilla.gnome.org/show_bug.cgi?id=659025#c5
(In reply to comment #8)
<snip>
> However, tracker-miner-fs also reacts to user actions, either via inotify or by the application explicitly calling IndexFile via D-Bus. Some applications may expect a timely update of the file metadata which might break with idle priorities. This also applies to tracker-store where reasonable update times are important to various applications.

If the applications need this to happen in a timely manner, they should be the ones driving the indexing. I have already mentioned on a number of occasions that you wouldn't want your document indexing to be relegated to the end of the queue because the miner is busy indexing your music or movies. I'll carry on complaining on bug 659025.
(In reply to comment #6)
> (In reply to comment #1)
> > I am in two minds about this. We used to have SCHED_IDLE and ioprio_set() a while back and it was removed because the performance of tracker was incredibly bad.
>
> First of all, this is not about making tracker faster. This is about having tracker not have an impact on system behaviour otherwise. Secondly, if you think that things got worse than one could reasonably expect by enabling SCHED_IDLE, then file a bug and get it fixed in the kernel.

That really depends on your point of view and how well adapted the system is to using Tracker. On the N9, Tracker is very central and performance is paramount. On the desktop, we realise this is less so, but what I was actually referring to was dire performance, not just "a bit slower". These days things might be better; Tracker has changed quite a bit since we last used SCHED_IDLE. It would need some investigation.

> > I am thinking about making this situational and trying it again. What I had in mind was to use the same approach the screensaver users for being away from the computer and when the user is away, we disable the SCHED_IDLE and enable it when they're back.
>
> I am not sure you can do that. I suspect going to SCHED_IDLE is a one-way street privilege-wise.
>
> > I also suspect this is an issue mainly for first time indexing or are you seeing this high disk usage even for subsequent checks/starts?
>
> Well, I am seeing it all the time, because tracker apparently needs hours and hours to index my $HOME (which isn't even that big...). It has been running since yesterday continuously.

Indexing $HOME (entirely) sounds quite wrong. Indexing $HOME recursively is not the normal configuration for Tracker, and source directories can take all the longer.

Also, how big is "isn't even that big"? :)

I suspect this is related to my comment in reply to Bastien: https://bugzilla.gnome.org/show_bug.cgi?id=659025#c8

A number of these reports about Tracker performance have come in over the last week or so from Fedora/Red Hat folk.

> > Also, it highly depends on your hardware AND the content you have to index. With reasonably fast hardware and not too much content this solution is completely unneeded (at least here).
>
> That is not the point. The point is that you should tell the kernel how you want your workload to be scheduled, to minimize the negative effects.

We are telling the kernel (just not with SCHED_IDLE at this point) and we consider Tracker more I/O bound than CPU bound generally.

At the highest loads, Tracker may be (in separate processes):

1. Crawling the file system (tracker-miner-fs)
2. Storing new information to the database, if files changed or on first-time indexing (tracker-store)
3. Extracting data from each file that changed or is new (tracker-extract)

With a cold cache, if you use find /usr -name '*tracker*', it's not fast and can be noticeable. That's the equivalent of perhaps one of those processes (tracker-miner-fs). Of course, if there is nothing to do because indexing completed the last time we ran, then there is only that crawling as a start-up effect on each boot.
(In reply to comment #9)
> (In reply to comment #8)
> <snip>
> > However, tracker-miner-fs also reacts to user actions, either via inotify or by the application explicitly calling IndexFile via D-Bus. Some applications may expect a timely update of the file metadata which might break with idle priorities. This also applies to tracker-store where reasonable update times are important to various applications.
>
> If the applications need this to happen in a timely manner, they should be the ones driving the indexing. I already mentioned this on a number of occasions, that you wouldn't want your documents indexing be relegated to the end of the queue because the miner is busy indexing your music, or movies. I'll carry on complaining on bug 659025.

Absolutely, which is why we have an API which injects the request into a priority queue.

Unfortunately, the application-driven-only approach breaks quite easily if files are created by different applications. Consider (for example) that Shotwell uses Tracker for image information and EOG saves a file using the Tracker APIs (great), but then you want more editing power and so you turn to GIMP, which doesn't support Tracker yet or ever (bad). You then just made Shotwell less useful, because any new file you create with GIMP is not known about.
(In reply to comment #10)
> > First of all, this is not about making tracker faster. This is about having tracker not have an impact on system behaviour otherwise. Secondly, if you think that things got worse than one could reasonably expect by enabling SCHED_IDLE, then file a bug and get it fixed in the kernel.
>
> That really depends on your point of view and how well adopted the system is to using Tracker. On the N9, Tracker is very central and performance is paramount. On the desktop, we realise this is less so, but what I was actually referring to was dire performance, not just "a bit slower".

This has little to do with desktop or not desktop. It has all to do with correct and not correct.

> > Well, I am seeing it all the time, because tracker apparently needs hours and hours to index my $HOME (which isn't even that big...). It has been running since yesterday continuously.
>
> Indexing $HOME (entirely) sounds quite wrong. Indexing recursively $HOME is not the normal configuration for Tracker and source directories can take all the longer.

Humm, it definitely iterates through ~/.local. See bug 659479, which I referenced later. And believe me, I have not reconfigured tracker in any way.

> Also, how big is "isn't even that big"? :)

20G, mostly in music.

> We are telling the kernel (just not with SCHED_IDLE at this point)

How then? SCHED_IDLE is the API for this. What other API are you using?
(In reply to comment #12)
> (In reply to comment #10)
> > > Well, I am seeing it all the time, because tracker apparently needs hours and hours to index my $HOME (which isn't even that big...). It has been running since yesterday continuously.
> >
> > Indexing $HOME (entirely) sounds quite wrong. Indexing recursively $HOME is not the normal configuration for Tracker and source directories can take all the longer.
>
> Humm, it definitely iterates through ~/.local. See bug 659479 I referenced later. And believe me I have not reconfigured tracker in any way.

Indexing desktop files in ~/.local/share/applications is a special case. This doesn't mean that your whole $HOME is being indexed. And busy loops are obviously not intentional. I haven't seen anything like that before in tracker.

If you start tracker-miner-fs with -v 1, it will display the directory hierarchy it's currently crawling (and log it to ~/.local/share/tracker/tracker-miner-fs.log), e.g.,

Tracker-INFO: Crawling recursively directory '/home/juerg/Documents'

Could you check whether it's recursively indexing the whole of $HOME or just the xdg dirs (Desktop, Documents, Download, Music, Pictures, Videos)? $HOME is also indexed by default, but not recursively.

> > Also, how big is "isn't even that big"? :)
>
> 20G mostly in music.

I assume that if tracker indexed $HOME recursively, there would also be a large amount of source code.
(In reply to comment #12)
> (In reply to comment #10)
> > > First of all, this is not about making tracker faster. This is about having tracker not have an impact on system behaviour otherwise. Secondly, if you think that things got worse than one could reasonably expect by enabling SCHED_IDLE, then file a bug and get it fixed in the kernel.
> >
> > That really depends on your point of view and how well adopted the system is to using Tracker. On the N9, Tracker is very central and performance is paramount. On the desktop, we realise this is less so, but what I was actually referring to was dire performance, not just "a bit slower".
>
> This has little to do with desktop or not desktop. It has all to do with correct and not correct.

While I agree it is what we would prefer to use, if it doesn't work in our favour, we're not going to use it (and I don't just mean making Tracker fast, but usable).

> > > Well, I am seeing it all the time, because tracker apparently needs hours and hours to index my $HOME (which isn't even that big...). It has been running since yesterday continuously.
> >
> > Indexing $HOME (entirely) sounds quite wrong. Indexing recursively $HOME is not the normal configuration for Tracker and source directories can take all the longer.
>
> Humm, it definitely iterates through ~/.local. See bug 659479 I referenced later. And believe me I have not reconfigured tracker in any way.

Yea, it will, because they're considered XDG locations for user-based desktop files. More recently, projects have been using that area (~/.local/share/applications/, etc.) for bookmarks or "web applications" for the user. So it's normal that we index that area without any configuration.

Also, indexing $HOME recursively or entirely shouldn't cover "." prefixed directories unless they are explicitly defined in the locations/configuration. All hidden directories are ignored.

> > Also, how big is "isn't even that big"? :)
>
> 20G mostly in music.
>
> > We are telling the kernel (just not with SCHED_IDLE at this point)
>
> How then? SCHED_IDLE is the API for this. What other API are you using?

I consider SCHED_IDLE not the only way here, and ioprio_set()/nice() to be two additional ways to improve the situation. There are a number of other things we've done in the past to help too, like trying posix_fadvise(). I don't consider SCHED_IDLE the only approach.
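The comment mentions posix_fadvise() without saying which advice was tried, so the following is only a guess at the general idea: an indexer that reads each file once can hint that the access is sequential and, when done, that the pages need not stay in the page cache. The function name is hypothetical and the choice of POSIX_FADV_SEQUENTIAL and POSIX_FADV_DONTNEED is an assumption, not something taken from Tracker.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read a whole file once, hinting the kernel before and after.
 * Returns the number of bytes read, or a negative errno value. */
static long read_once_and_drop_cache(const char *path)
{
    char buf[65536];
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -errno;

    /* Hints only: posix_fadvise() returns an error number rather than
     * setting errno, and a failure here is safe to ignore. */
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;            /* a real extractor would parse buf here */

    if (n < 0) {
        int saved = errno;
        close(fd);
        return -saved;
    }

    /* The file will not be read again soon, so its cached pages can go. */
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    close(fd);
    return total;
}
```

This addresses page-cache pollution rather than scheduling, which is why it complements SCHED_IDLE and ioprio_set() instead of replacing them.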
(In reply to comment #13)
> If you start tracker-miner-fs with -v 1, it will display the directory hierarchy it's currently crawling (and log it to ~/.local/share/tracker/tracker-miner-fs.log), e.g.,
>
> Tracker-INFO: Crawling recursively directory '/home/juerg/Documents'
>
> Could you check whether it's recursively indexing whole $HOME or just the xdg dirs (Desktop, Documents, Download, Music, Pictures, Videos). $HOME is also indexed by default but not recursively.

It apparently never makes it beyond looping forever on the .desktop dirs.

(BTW, I must say, I am quite disappointed that it won't index my source files by default -- that could have been immensely useful)

> > > Also, how big is "isn't even that big"? :)
> >
> > 20G mostly in music.
>
> I assume that if tracker indexed $HOME recursively, there would also be a large amount of source code.

Well, a couple of git repos. But not too much.
(In reply to comment #14)
> While I agree it is what we would prefer to use, if it doesn't work in our favour, we're not going to use it (and I don't just mean making Tracker fast, but usable).

Jeez. Get that thinking out of your head. You are not on an island. The full stack is open source. If something doesn't work as expected, then fix it, or file a bug and get somebody else to fix it for you. Working around borked stuff is what makes Windows such a horrible platform. We own our own platform, so fix the bugs where they are. Short-term work-arounds are long-term headaches.

> > > Also, how big is "isn't even that big"? :)
> >
> > 20G mostly in music.
> >
> > > We are telling the kernel (just not with SCHED_IDLE at this point)
> >
> > How then? SCHED_IDLE is the API for this. What other API are you using?
>
> I consider SCHED_IDLE not the only way here and ioprio_set()/nice() to be two additional ways to improve the situation. There are a number of other things we've done in the past to help too, like trying posix_fadvise(). I don't consider SCHED_IDLE the only approach.

Humpf. The API to tell the kernel that something is an idle thread for CPU is SCHED_IDLE. Full stop. The other APIs control other things. You shouldn't confuse things.

Seriously: fix bugs where they are, the platform is open for hacking.
(In reply to comment #15)
> (In reply to comment #13)
> > If you start tracker-miner-fs with -v 1, it will display the directory hierarchy it's currently crawling (and log it to ~/.local/share/tracker/tracker-miner-fs.log), e.g.,
> >
> > Tracker-INFO: Crawling recursively directory '/home/juerg/Documents'
> >
> > Could you check whether it's recursively indexing whole $HOME or just the xdg dirs (Desktop, Documents, Download, Music, Pictures, Videos). $HOME is also indexed by default but not recursively.
>
> It apparently never makes it beyond looping forever on the .desktop dirs.

Seems you're alone on this one. No one else can reproduce this. Does running it with -v 3 show anything more here?

> (BTW, I must say, I am quite disappointed that it won't index my source files by default -- that could have been immensely useful)

It depends where you keep them. There's nothing unique about source files which means we don't index them by default.
(In reply to comment #15)
> (BTW, I must say, I am quite disappointed that it won't index my source files by default -- that could have been immensely useful)

It's easy to add via tracker-preferences, but it may increase tracker's workload quite a bit as there is a lot of text to index. Even a simple git checkout could mean a lot of work for tracker if many files are different in multiple branches. It might work fine for you, but we prefer default configurations that work well on a wide range of systems.
(In reply to comment #17)
> (In reply to comment #15)
> > (In reply to comment #13)
> > > If you start tracker-miner-fs with -v 1, it will display the directory hierarchy it's currently crawling (and log it to ~/.local/share/tracker/tracker-miner-fs.log), e.g.,
> > >
> > > Tracker-INFO: Crawling recursively directory '/home/juerg/Documents'
> > >
> > > Could you check whether it's recursively indexing whole $HOME or just the xdg dirs (Desktop, Documents, Download, Music, Pictures, Videos). $HOME is also indexed by default but not recursively.
> >
> > It apparently never makes it beyond looping forever on the .desktop dirs.
>
> Seems you're alone on this one. No one else can reproduce this.

I wouldn't be so sure about that. My guess is rather that most other folks just get rid of tracker instead of trying to debug it.

> Does running it with -v 3 show anything more here?

Attached the output to the other bug; we should probably follow up there.
(In reply to comment #18)
> (In reply to comment #15)
> > (BTW, I must say, I am quite disappointed that it won't index my source files by default -- that could have been immensely useful)
>
> It's easy to add via tracker-preferences, but it may increase tracker's workload quite a bit as there is a lot of text to index. Even a simple git checkout could mean a lot of work for tracker if many files are different in multiple branches. It might work fine for you but we prefer default configurations that work well on a wide range of systems.

Multiple branches? tracker does not understand git, right? So it would only see the current checkout, right? Or are you referring to the fact that switching a branch would cause a lot of inotify events?

Maybe some logic in tracker to delay crawling through hot directories would help? I.e. the more files in a directory change at the same time, the longer crawling through it is delayed? I can live with tracker not indexing my branch changes all the time, as long as it catches up eventually...
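One possible shape for the "hot directory" idea, purely as a sketch: keep a count of recent change events per directory and stretch the re-crawl delay as that count grows, with an upper bound so the miner always catches up eventually. Nothing here exists in Tracker; the function name, the threshold and both delay constants are invented for illustration.

```c
/* Tunables, all hypothetical. */
static const unsigned BASE_DELAY_S = 2;      /* delay for a quiet directory */
static const unsigned MAX_DELAY_S = 600;     /* never wait longer than this */
static const unsigned QUIET_THRESHOLD = 8;   /* events that count as quiet  */

/* Seconds to wait before re-crawling a directory that has seen
 * `recent_events` change notifications in the current time window.
 * The delay doubles each time the event count doubles past the
 * threshold, and is capped at MAX_DELAY_S. */
static unsigned crawl_delay_seconds(unsigned recent_events)
{
    unsigned delay = BASE_DELAY_S;

    while (recent_events > QUIET_THRESHOLD && delay < MAX_DELAY_S) {
        delay *= 2;
        recent_events /= 2;
    }

    return delay < MAX_DELAY_S ? delay : MAX_DELAY_S;
}
```

With these numbers a directory with a handful of changes is re-crawled after 2 seconds, while a branch switch that touches thousands of files pushes the delay to the 10 minute cap. The event counter would need to decay over time for the delay to come back down, which this sketch leaves out.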
(In reply to comment #16)
> (In reply to comment #14)
> > While I agree it is what we would prefer to use, if it doesn't work in our favour, we're not going to use it (and I don't just mean making Tracker fast, but usable).
>
> Jeez. Get that thinking out of your head. You are not on an island. The full stack is open source. If something doesn't work as expected, then fix it, or file a bug and get somebody else to fix it for you. Working around borked stuff is what makes Windows such a horrible platform. We own our own platform so fix the bugs where they are. Short-term work-arounds are long-term headaches.

Please don't lecture me on Linux and open source in these terms. I am fully aware of all that. I come from Windows and I know of its shortfalls (especially regarding closed-source, non-fixable situations) and it's why I got into Linux.

It still doesn't change a thing. If it doesn't work, we're not going to use it until it's fixed, are we (if indeed it needs fixing, and that's still not clear to me)? Let me remind you, this is _EXACTLY_ the same suggestion you have for Tracker right now (i.e. disable it until it works).

SCHED_IDLE certainly renders Tracker less aggressive on systems using it, but to the point of not even responding to simple dbus methods (last I tried under busy loads). But now we have a new kernel etc., it might have improved. I've not checked. We should test it again.

> > > > Also, how big is "isn't even that big"? :)
> > >
> > > 20G mostly in music.
> > >
> > > > We are telling the kernel (just not with SCHED_IDLE at this point)
> > >
> > > How then? SCHED_IDLE is the API for this. What other API are you using?
> >
> > I consider SCHED_IDLE not the only way here and ioprio_set()/nice() to be two additional ways to improve the situation. There are a number of other things we've done in the past to help too, like trying posix_fadvise(). I don't consider SCHED_IDLE the only approach.
>
> Humpf. The API to tell the kernel that something is an idle thread for CPU is SCHED_IDLE. Full stop. The other APIs control other things. You shouldn't confuse things.

Perhaps I should have been more precise. I meant that we're already telling the kernel with a number of APIs how we should be handled. This bug is not solely about scheduling, it's also about I/O priority. In any case, there is no point to this line of argument. I was merely pointing out that we're already doing 1 of the 2 things you suggested in your first comment.

--

As a side note, can we please calm the general rhetoric down on this thread. I don't appreciate your assumptions about our thinking and fixing things outside our immediate project scope. I would rather stay objective here.
(In reply to comment #21)
> Please don't lecture me on Linux and open source in these terms. I am fully aware of all that. I come from Windows and I know of its shortfalls (especially regarding closed source non-fixable situations) and it's why I got into Linux.

You know, yesterday I introduced Jürg to Eric Paris on IRC, and tracker folks and fanotify folks talked with each other for the first time. You guys never talked before, apparently. I am not sure why I need to play the messenger here between you guys, but uhm, all this should have happened 3 years ago, after GC.

You say you don't want to be lectured, but then you should have been aware that things in Free Software don't generally fix themselves by just waiting. You need to get your hands dirty yourself. And if not in fixing things yourself, then at least in talking to the right people, and that's often easy. Especially in Eric's case.
So I did some testing with SCHED_IDLE again to see how things fare when using it. The only disadvantage I could see so far is that some of the daemons are not responsive for long periods of time when the system is busy (e.g. tracker-control -F took > 30 seconds to respond at one point). That was the reason we disabled it before.

There is one important difference with today's tracker. Now tracker-store is a separate process; all queries and updates are done there, not in the same process that crawls the file system. This is a good thing of course. It also means we can use SCHED_IDLE on just tracker-miner-fs and tracker-extract, which are the two biggest CPU and I/O users.

--

I have created a branch called SCHED_IDLE here: http://git.gnome.org/browse/tracker/log/?h=SCHED_IDLE

It also creates a preferences radiobutton group for configuration (which affects only tracker-miner-fs and tracker-extract). It looks like this: http://www.lanedo.com/~martyn/images/tracker-preferences-with-sched-idle2.png

--

I did some tests with the new branch with and without SCHED_IDLE. In all tests I used the following conditions:

- I make sure the log verbosity is "detailed" (more than usual, and this uses I/O writing to the disk of course).
- I remove all databases (with tracker-control -r).
- I echo 3 to /proc/sys/vm/drop_caches before starting.
- I start with tracker-control -s.
- I do nothing during indexing, to determine the maximum speed during idle times.

I used two target machines to get some idea of how this affects different environments:

- Laptop: Thinkpad X301, 2GB memory, 2 x Intel Core 2 Duo U9400 @ 1.40 GHz, 55 GB SSD
- Desktop: Self built, 4GB memory, 4 x Intel Core 2 Quad Q6600 @ 2.40 GHz, 500 GB HDD

Results on the Laptop:

- With SCHED_IDLE: 23 Sep 2011, 13:56:33: Tracker: Finished mining in seconds:502.496275, total directories:3382, total files:40045
- Without SCHED_IDLE: 23 Sep 2011, 14:06:14: Tracker: Finished mining in seconds:499.514653, total directories:3382, total files:40045

Results on the Desktop:

- With SCHED_IDLE: 23 Sep 2011, 13:36:02: Tracker: Finished mining in seconds:3427.326958, total directories:11718, total files:126149
- Without SCHED_IDLE: 23 Sep 2011, 14:43:34: Tracker: Finished mining in seconds:3382.015331, total directories:11720, total files:126152

--

The conclusion is that, regardless of hardware or scheduler setting here, when the machine is idle the performance difference of Tracker is negligible. I also did tests on the laptop (not mentioned above) with smaller data sets of around 2k files and the difference was ~1 second.

So I am quite happy that we could use SCHED_IDLE and still be responsive in tracker-store to queries/updates. For now, in the branch, the preferences have the default value of using SCHED_IDLE on our first index only. What do people think about that? Given the results above, should we opt to turn this on by default all the time and let power users (or those that care) configure it otherwise?

Other comments?
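To put "negligible" into numbers, the timings above can be turned into a relative overhead and a throughput figure. The two helpers below are just arithmetic on the reported values and their names are made up; note that the two desktop runs covered very slightly different file counts (126149 vs 126152), so that comparison is approximate.

```c
/* Extra wall-clock time with SCHED_IDLE, as a percentage of the run
 * without it. Positive means the SCHED_IDLE run was slower. */
static double overhead_percent(double with_idle_s, double without_idle_s)
{
    return (with_idle_s - without_idle_s) / without_idle_s * 100.0;
}

/* Average indexing throughput for one run. */
static double files_per_second(unsigned long files, double seconds)
{
    return (double) files / seconds;
}
```

Plugging in the figures from the comment, `overhead_percent(502.496275, 499.514653)` comes to about 0.6% on the laptop and `overhead_percent(3427.326958, 3382.015331)` to about 1.3% on the desktop. Throughput with SCHED_IDLE is roughly 80 files per second on the laptop (SSD) and roughly 37 files per second on the desktop (HDD).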
> Well, I am seeing it all the time, because tracker apparently needs hours and hours to index my $HOME (which isn't even that big...). It has been running since yesterday continuously.

>> Also, how big is "isn't even that big"? :)
> 20G mostly in music.

I've noticed that music files can sometimes be really slow to extract. I filed https://bugzilla.gnome.org/show_bug.cgi?id=659954 to keep track; hopefully I'll get some time to look at this in the next couple of weeks.
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report. This will be released in 0.12.3 today.