GNOME Bugzilla – Bug 730118
GtkTreeView very slow since 3.12
Last modified: 2014-10-14 12:41:49 UTC
Since 3.12 filling a GtkListStore is about 50 times slower than before. It gets back to the old speed if I kill "at-spi-bus-launcher" and gets slow again it gets started agan. Another workaround is to set "NO_AT_BRIDGE=1". (The model is removed from the TreeView during filling, so that's not the problem) This affects mostly music players, like Rhythmbox and Quod Libet, which now take 15 seconds to start up in my case (for 14K rows). I'm experiencing the problem on Debian experimental, but there are reports for Arch [0] and Fedora [1], so I think this is an upstream problem. [0] https://bbs.archlinux.org/viewtopic.php?id=180817 [1] https://bugzilla.redhat.com/show_bug.cgi?id=1076914
Is there an isolated testcase for reproducing this? I suppose just creating a liststore and adding 20k items is not the problem and other steps are necessary.
Created attachment 276525 [details] pygobject example (In reply to comment #1) > Is there an isolated testcase for reproducing this? attached (in Python, sorry :/) > I suppose just creating a liststore and adding 20k items is not the problem and > other steps are necessary. It is. The attached script takes 3.725615 here and 0.2521 with NO_AT_BRIDGE=1 or killed at-spi-bus-launcher
you're measuring the time to populate the GtkTreeView, not the backing GtkListStore. if you wanted the latter, you'd have to swap the last two lines of fill_model(). to be precise, filling the ListStore says: time need to fill the model: 0.159151 whereas filling the TreeView with a ListStore says: time need to fill the model: 3.663293 incidentally, I can confirm the issue just by using Rhythmbox with my music collection; filtering as I type has gotten unbearably slow after the update to 3.12 (to the point that the whole UI looks blocked).
Created attachment 276528 [details] sysprof snapshot a snapshot of the dbus-1 usage through sysprof it does seem that we're being awfully chatty over the session bus.
(In reply to comment #3) > you're measuring the time to populate the GtkTreeView, not the backing > GtkListStore. if you wanted the latter, you'd have to swap the last two lines > of fill_model(). Ahh, indeed. I didn't think of that, thanks.
*** Bug 729880 has been marked as a duplicate of this bug. ***
Created attachment 276665 [details] dbus monitor log Attached a dbus-monitor log of the pygobject example. log was retrieved using: dbus-monitor --address `xprop -root | grep AT_SPI_BUS | sed -e 's/.*= "//' | sed -e 's/"$//'` which I found here: http://www.linuxfoundation.org/collaborate/workgroups/accessibility/atk/at-spi/at-spi_on_d-bus#Tips_for_debugging
The source of the problem seems to be AtkObject properties and signals. 1) Gtk:AtkObject:state-change (atk-adaptor/event.c:state_event_listener) triggers for each entry with a value of 0 (why emit state-change if it's not changed?) This lead to the following dbus calls each time: org.a11y.atspi.Event.Object state-changed defunct i 2) AtkObject properties changes (atk-adaptor/event.c:property_event_listener): For each cell(!) in each row AtkObject properties change and this results in lots of these: org.a11y.atspi.Event.Object PropertyChange widget i org.a11y.atspi.Event.Object text-changed insert s org.a11y.atspi.Event.Object PropertyChange accessible-name s org.a11y.atspi.Event.Object children-changed add (so) org.a11y.atspi.Event.Object PropertyChange renderer i 3) Commenting out these two dbus sources makes the bus silent during TV population, but doesn't make things much faster, so I guess the problem is AtkObject emitting those changes and not at-spi2-atk forwarding them.
(In reply to comment #8) > 3) Commenting out these two dbus sources makes the bus silent during TV > population, but doesn't make things much faster, so I guess the problem is > AtkObject emitting those changes and not at-spi2-atk forwarding them. Correction: It also emits "children-changed", and "text-changed" dbus calls (the dbus-monitor call didn't show them for some reason). Removing them as well (in event.c) makes things kinda fast again. The source of the problem still is AtkObject.
I guess we'll need the opinion of the a11y team, at this point. we should batch all changes when setting the TreeModel into a TreeView, instead of iterating over each new row and DoS'ing the session bus; I think this isn't great from an AT user perspective either.
Could someone test if the patch that solved bug 728319 (included on 3.13.1) solves this bug too?
Christoph and Emmanuele, are you using any accessibility technology? I'm wondering why those ATK objects are created in the first place...
(In reply to comment #12) > Christoph and Emmanuele, are you using any accessibility technology? I'm not.
(In reply to comment #12) > Christoph and Emmanuele, are you using any accessibility technology? Not that I know of. I'm using classic mode if that matters.
(In reply to comment #11) > Could someone test if the patch that solved bug 728319 (included on 3.13.1) > solves this bug too? Does not help. My previous 2 comments were using trunk including the patch.
FYI, I'm seeing the same issue (using rhythmbox with a very large music collection). Using NO_AT_BRIDGE=1 makes rhythmbox start way faster and both the gnome-shell and dbus-daemon process aren't pegging the cpu anymore. *but* ... I'm using gtk 3.10.9 (from f20). Hope this helps.
We were talking about this bug on last a11y weekly meeting. Mike Gorse will take a look to this. Anyway, some answers to some comments: (In reply to comment #16) > *but* ... I'm using gtk 3.10.9 (from f20). > > Hope this helps. Thanks, yes, this would help, as confirms that this is happening with both gtk+ 3.12 and 3.10 (In reply to comment #12) > Christoph and Emmanuele, are you using any accessibility technology? FWIW, gnome-shell is right now an accessibility technology, as the magnifier is a built-in feature, not a different application like in GNOME 2.X times. Having said so ... > I'm wondering why those ATK objects are created in the first place... ... yes, this shouldn't happening. We were talking about that on that meeting. The magnifier shouldn't register itself unless the magnifier is active. And that was working when first released, so probably we have a regression here. The conclusion of the meeting was, 1) checking if gnome-shell is registering itself even if it is not needed, 2) also try to minimize the event flood when AT are listening. As said, Mike Gorse is taking a look to this.
I have been taking a look to this bug using the example at comment 2. I know that is not really precise (comment 3), but can be used as a reference, and I get similar output using sysprof. On fedora (gnome-shell 3.10.4, gtk+ 3.10) I get something like this: time need to fill the model: 0.326258 time need to fill the model: 0.293297 time need to fill the model: 0.292656 time need to fill the model: 0.291211 time need to fill the model: 4.029034 And as soon as I start an application listening accessible events (small app using libatspi), I get something like this: time need to fill the model: 4.058328 time need to fill the model: 4.211757 But as soon as I close that application I got back the previous ("good") performance. On a jhbuild session (gnome-shell 3.12.2, gtk+ 3.13) I get always the bad times. But I also noted that this is not only about ChildrenChanged events. Listening the dbus messages, on my fedora session I need to start the small application that register listeners in order to get StateChanged events (ie: if you navigate on the alt+tab switcher), but using the jhbuild session, I got them always. In any case, I found something interesting on the fedora session. (In reply to comment #17) > The conclusion of the meeting was, 1) checking if gnome-shell is registering > itself even if it is not needed On fedora, if I activate the magnifier (so gnome-shell behaves as an accessible tool), I get the bad times with the example. *But*. If I deactivate the magnifier, I still get the bad times. As I said, when using my small application if I close it, I get back the proper performance. So probably the problem with 3.12 is that right now during initialization time the magnifier is activated, and when deactivated, event listening is still registered. Although it is true that we should reduce that number when an accessible tool is listening, we should get a better performance if no AT is listening, so it would be better to focus the effort on that side for now. FWIW: with all this into account, I don't think that this is an gtk+ bug. It is probably an gnome-shell bug (magnifier not properly calling deregistering listeners) or an at-spi2 bug (deregistering listeners not properly working). But, I will not move the bug until we now for sure where is the problem.
Created attachment 277452 [details] [review] Quick hack on gnome-shell A quickly written patch for gnome-shell. The patch calls atspi-exit if not listener are registered. atspi-exit clean some stuff, including what makes keep sending the events to the bus. At one hand, one could argue that this should be done by libatspi when no listeners are registered. On the other hand, I don't see any reason to not clean resources if not needed, so even if the problem is solved at atspi, having this still makes sense to me. This solves the problem to me. Could someone test the patch to confirm that?
I'm not really a fan of making Rhythmbox practically unusable only for accessibility users. I also started to use the Atspi to do notification avoidance when a user is typing in the notification popup area. We need to figure out why model insertions are so slow when a11y is turned on.
Wait what? You can turn off a11y? In the accessibility control panel window I have everything turned off, the icon does not apper in the top bar and yet I suffer from slow GtkTreeView sickness.
Right now we keep all a11y listeners registered as part of caret tracking in gnome-shell, even if the data isn't used. That's what Alejandro's patch fixes above. I want to use the caret tracking data more.
(In reply to comment #20) > I'm not really a fan of making Rhythmbox practically unusable only for > accessibility users. I also started to use the Atspi to do notification > avoidance when a user is typing in the notification popup area. I agree with you. As I mention on comment 17, this should be solved for everyone (what I called 2)). My proposal was not considering the bug solved just with this patch. But it would be difficult to solve the problem without touching a lot of modules for 3.12. The advantage of this solution is that is small enough to be backported to 3.12, and mitigates (somehow) the problem. We could focus on 3.14 to solve the problem as a whole.
(In reply to comment #21) > Wait what? You can turn off a11y? Since 3.6 you can't turn off accessibility. On GNOME 3.X Accessibility is always on. > In the accessibility control panel window I > have everything turned off, That only means that no AT is listening, so the bus traffic should be smaller. This is a bug on that concept. > the icon does not apper in the top bar and yet I > suffer from slow GtkTreeView sickness. As Jasper mentioned, slow GtkTreeview sickness should be solved even if an accessibility tool is running. Patch on comment 19 solves the problem when no accessibility tool is running (so just a partial solution).
1) I'd like to add that I mostly use cell renderer functions with fixed height mode enabled. Those functions are relatively costly but work because in fixed height mode they only get called for each *visible* row (30 vs 10000). I guess rhythmbox does so as well. Please take that into account (maybe limit a11y to visible rows as well?) 2) Is there some way to exclude a widget from all this?
(In reply to comment #25) > 2) Is there some way to exclude a widget from all this? The idea behind the state STATE_MANAGES_DESCENDANT (*), is that if a container has this state, the children don't need to be exposed as in any other container, so any question about current active children/element should be go through the container, and not through navigating all the children. The accessible object of GtkTreeView already has that state, and some of the proposals from Mike Gorse are based on use this state. In any case, improving the situation using this state can be done in parallel with avoiding gnome-shell to send information to the a11y bus if not needed. * This state was already available when I started to play with ATK. Unfourtunately it was never really well documented.
The caret (or focus) tracking mode has to be set to none and the magnifier zoom region has to be active when this 'sync' function is being called, otherwise the tracking gets deregistered because the 'enabled' check comes up false.[1] Maybe I am missing something, but I do not really get the advantage of commits [2] and [3] and I suspect [1] is the problem here but I have no idea what the bug is as I did not test it and since this is the first I learned of a problem in there, I won't be either ;-). The reason I do not get those patches is because the magnifier was already deregistering and registering listeners (without problems as far as I know about it anyway) and these patches seems to be attempting to reinvent the wheel, the long way round and I am not sure why. So, I am not saying they are the reason for your bug (given I have not recreated it, I cannot assume anything that certain) but ith that said, I do suspect that there has been some confusion about what the tracking was doing and how to optimise it and that might well have been how a bug got in. I cannot say for sure, what the cause is but just from reading the comments here, the commits and the code, I am quite sure this would not have been a problem before that commit at least mainly because I believe that what the commit message states it is doing fpr [2] and [3], never actually needed to be done. I could be wrong. Maybe there is more rationale behind them than is obvious from the message, but that is how it looks at face value, at least anway. Hopefully, Giovanni can clarify what the rationale was in greater detail, so it makes more sense. Until then, maybe try reverting and see what you get. [1]https://git.gnome.org/browse/gnome-shell/tree/js/ui/magnifier.js#n858 [2]https://git.gnome.org/browse/gnome-shell/commit/js/ui/magnifier.js?id=3a92aa751fb0517e48e7320159b71196428a7e97 [3]https://git.gnome.org/browse/gnome-shell/commit/js/ui/focusCaretTracker.js?id=3a92aa751fb0517e48e7320159b71196428a7e97
(In reply to comment #27) > The reason I do not get those patches is because the magnifier was already > deregistering and registering listeners (without problems as far as I know > about it anyway) and these patches seems to be attempting to reinvent the > wheel, the long way round and I am not sure why. It is explained on comment 19. Deregistering listeners is not enough to stop the emission of events to the bus. It is also needed to do an atspi_exit() call. Im short what that patch do is: * Only call atspi_init if any listener is registered * Calling atspi_exit if all listeneres are deregistered. > I cannot say for sure, what the cause is but just from reading the comments > here, the commits and the code, I am quite sure this would not have been a > problem before that commit at least mainly because I believe that what the > commit message states it is doing fpr [2] and [3], never actually needed to be > done. It is true that those patches changed some stuff during initialization. But... > I could be wrong. Maybe there is more rationale behind them than is > obvious from the message, but that is how it looks at face value, at least > anway. Hopefully, Giovanni can clarify what the rationale was in greater > detail, so it makes more sense. Until then, maybe try reverting and see what > you get. ... even if those patches are reverted the problem I describe in comment 18 (events are emitted to the bus even after deregister all the listeners) would be present, so adding the call to atspi_exit would be still needed.
I thought the point of explicitly registering listeners was to prevent a flurry of bus traffic. If there are no registered listeners, what bus traffic are we sending out?
(In reply to comment #29) > I thought the point of explicitly registering listeners was to prevent a flurry > of bus traffic. If there are no registered listeners, what bus traffic are we > sending out? Good question. I think Alejandro might be suggesting that Atspi would need to be 'un-inited' to truly close pandoras box. It seems likely that rationale is logically sound based on what we learned about g-s ATSPI from the september control center/ ATSPI bottle neck which froze up GNOME shell, since Mike's workaround seemed to reduce the issue somewhat. That said, from the stack it was clear that the focus-tracking call nailed the lock up and presumably registing the focus-tracking allowed that, not atspi by itself (though actually, I don't think anyone checked this, maybe it's worth looking into...). It is probably worth noting that there seem to not have been any caret tracking dramas locking things up at all since the tracking landed actually, (to my knowledge). Anyway I think Alejandro suggests that the full fury/glory (depending on your perspective on these things) of the ATSPI interface is only unleashed at init so initing alone could mean there were all the signals of the interface flying around though nothing is listening out for them in that case. If that really is the case then it seems preferable to address that, rather than patch up GNOME Shell though, not only to speed up the desktop but also because, if we consider what information must be flying around in that scenario, then this is potentially a security risk that seems like it could be quite easy to exploit in javascript.
(In reply to comment #29) > I thought the point of explicitly registering listeners was to prevent a flurry > of bus traffic. If there are no registered listeners, what bus traffic are we > sending out? Quoting myself (comment #19): > At one hand, one could argue that this should be > done by libatspi when no listeners are registered. On the other hand, I don't > see any reason to not clean resources if not needed, so even if the problem is > solved at atspi, having this still makes sense to me. As I said there, that should be also done at libatspi. So to be clear (as obviously I was not clear enough before): in my opinion there is also a bug on atspi. My idea was creating a different one about that specific issue. But as I said there, atspi_exit also free some resources, so I didn't see any problem to add that call on the magnifier (the current bug right now) when not needed. So in conclusion, do you prefer to just move this bug to atspi and avoid the hassle to add the call to atspi_exit on the magnifier?
I was just curious what bus traffic was going through that atspi_exit(); would stop.
(In reply to comment #32) > I was just curious what bus traffic was going through that atspi_exit(); would > stop. The bus traffic that was being sent are the needed to keep the atspi cache updated. This ensure that an application that just ask for information, but without registering to events, get an updated information. The specific part of the conversation: 14:20:04 <mgorse> As far as events still being sent after deregistering listeners, that is by design at the moment. Ie, an app might still make queries even if it isn't listening for events, so some events are sent no matter what since they update the AT's cache. But maybe it's safe to assume that, if an application isn't listening for any events, then it won't be making queries, so we don't need to worry about updating its cache
(In reply to comment #31) > (In reply to comment #29) > > I thought the point of explicitly registering listeners was to prevent a flurry > > of bus traffic. If there are no registered listeners, what bus traffic are we > > sending out? > > Quoting myself (comment #19): > > > At one hand, one could argue that this should be > > done by libatspi when no listeners are registered. On the other hand, I don't > > see any reason to not clean resources if not needed, so even if the problem is > > solved at atspi, having this still makes sense to me. The thing to hold in mind is that javascript is client side and the interface has functions which allow a script to read keystrokes and button pushes and pretty much anything else it wants, and do so remotely. > > As I said there, that should be also done at libatspi. So to be clear (as > obviously I was not clear enough before): in my opinion there is also a bug on > atspi. My idea was creating a different one about that specific issue. > > But as I said there, atspi_exit also free some resources, so I didn't see any > problem to add that call on the magnifier (the current bug right now) when not > needed. > Would solving that specific issue not in turn, solve this one as a side effect? I am not sure what additional resources would be freed by g-s javascript in this case, that should not be solved by addressing the issue in libastspi. Can you provide a specific example to demonstrate the nuances relating to 'freeing resources' in particular? > So in conclusion, do you prefer to just move this bug to atspi and avoid the > hassle to add the call to atspi_exit on the magnifier?
*** Bug 731078 has been marked as a duplicate of this bug. ***
Created attachment 278826 [details] [review] Initialize atspi on demand Taking into account that the feedback this bug, I'm providing this more simple version, that just initialize atspi on demand. This is also more coherent with the rest of the initialization code, as the listeners are also registered only when needed. This still solves the problem for the users that don't use the magnifier, mitigating the problem somehow. I talked with Jasper via IRC, and he is ok with including this on next gnome-shell 3.12.X release. As said on this bug, IRC and a11y-meetings, this issue in general (to both a11y and non-a11y technologies users) are one of the priority tasks for 3.14. A new bug will be created for that.
Review of attachment 278826 [details] [review]: OK.
(In reply to comment #37) > Review of attachment 278826 [details] [review]: > > OK. Pushed to master and gnome-3-12 branch. Closing bug.
*** Bug 733852 has been marked as a duplicate of this bug. ***
*** Bug 733942 has been marked as a duplicate of this bug. ***
I see this again with 3.14
(In reply to comment #41) > I see this again with 3.14 Same for me. An NO_AT_BRIDGE=1 solves the issue :(
(In reply to comment #41) > I see this again with 3.14 (In reply to comment #42) > (In reply to comment #41) > > I see this again with 3.14 > > Same for me. An NO_AT_BRIDGE=1 solves the issue :( Thanks for notifying the regression. It was also commented on bug 728319. I suspect that both are the same bug (not closing bugs until confirming). Will take a look to the bug.
(In reply to comment #41) > I see this again with 3.14 (In reply to comment #42) > (In reply to comment #41) > > I see this again with 3.14 > > Same for me. An NO_AT_BRIDGE=1 solves the issue :( Yesterday I have uploaded a patch on bug 728319 comment 22. It seems that solves the problem there. As I really suspect that both bugs are the same, if possible, it would be good if you test the patch. Note that the patch is not yet committed as we are doing more tests it with accessible tools, in order to avoid introducing any regression.
Bug 728319 is closed, and the at-spi2-atk patch planned to be included on 3.14.1. Please check this bug again at that moment.
The fix for Bug 728319 got cherry picked in Debian and I can no longer reproduce, so I guess it's fixed again. Thanks Alejandro!