GNOME Bugzilla – Bug 767075
Segfault in atspi_accessible_clear_cache() - and reconsider recursive cache clearing
Last modified: 2021-07-05 10:44:56 UTC
Created attachment 328810 [details]
backtrace

Orca users have reported that Orca sometimes hangs when apps are quit. I cannot reproduce this reliably yet, but I'm attaching the backtrace from one case where I could. I'll do my best to work around this in Orca, but I do not *think* Orca is causing this. For instance, given the segfault here:

    Thread 1 "orca" received signal SIGSEGV, Segmentation fault.
    atspi_accessible_clear_cache (obj=0x5605f7942120) at atspi-accessible.c:1637
    1637 for (i = 0; i < obj->children->len; i++)

Orca called clearCache() on a window.
Looking at atspi_accessible_clear_cache(), it appears to be recursive. In the case of a window that has gone defunct, that seems like a not-good thing to do. The reason Orca clears the cache for windows is to ensure it has the latest states, so that it can determine whether the window is defunct. Is it really necessary and desired to clear the cache on all the descendants?
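To make the cost of the recursion concrete, here is a toy Python model. It is not AT-SPI code: `Node`, `build_tree`, and both clear functions are hypothetical names invented for illustration, and the guard against a missing child list is only an assumption about what a defensive walk might look like, not a description of how the library behaves or was fixed.

```python
class Node:
    """Toy stand-in for an accessible object; not the real AtspiAccessible."""

    def __init__(self, name, children=None):
        self.name = name
        # None models an object whose child list is no longer available.
        self.children = children
        self.cached_state = "stale"


def clear_cache_recursive(node):
    """Clear this node and every descendant; return the number of nodes touched."""
    node.cached_state = None
    touched = 1
    # Guard before iterating so a missing child list is never dereferenced.
    for child in node.children or []:
        touched += clear_cache_recursive(child)
    return touched


def clear_cache_shallow(node):
    """Clear only this node, which is all a defunct check on a window needs."""
    node.cached_state = None
    return 1


def build_tree(depth, fanout):
    """Build a uniform tree: `depth` levels below the root, `fanout` children each."""
    if depth == 0:
        return Node("leaf", children=[])
    return Node("branch", [build_tree(depth - 1, fanout) for _ in range(fanout)])


window = build_tree(depth=3, fanout=4)
print(clear_cache_recursive(window))   # 1 + 4 + 16 + 64 = 85 nodes touched
print(clear_cache_shallow(window))     # 1 node touched
print(clear_cache_recursive(Node("defunct window")))  # 1, no crash on missing children
```

Even in this small tree the recursive walk touches 85 objects to refresh the state of one window; a toolkit with a very large tree multiplies that on every window event.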
Created attachment 328995 [details]
Python script to trigger the crash

I found a way to reliably reproduce this and do it without Orca. In fact the attached pyatspi listener is dead simple.

Steps to reproduce in a GNOME Shell session:

1. Quit all apps except a terminal in which to run the listener.
2. Launch the listener.
3. Press Super to get into the overview.
4. Wait a second or two.
5. Press Super again to exit the overview.
6. Wait for a while.

As you'll see from the listener, the window events will result in the cache being cleared. And as stated earlier, they are being cleared recursively. GNOME Shell has a ton of descendants.

The listener also listens for all object:state-changed events and prints the event app, event type, and event source. Getting into and out of the overview results in a huge number of events being spewed. This is why you have to wait a while after step 6. But if I wait long enough for it all to get spewed out, I reliably get the segfault from the opening report:

    Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
    atspi_accessible_clear_cache (obj=0x555555abf540 [AtspiAccessible]) at atspi-accessible.c:1637
    1637 for (i = 0; i < obj->children->len; i++)
I think that I fixed this one:

master: dcb353
gnome-3-20: c565ec

Leaving the bug open because I haven't done anything about the function recursing. I don't know if anyone is using the function aside from Orca, but I'm hesitant to create a situation where the function behaves differently depending on which version of AT-SPI is being used, so I'm thinking of adding an atspi_clear_cache_full() that will allow the caller to specify whether to recurse.
Sorry, I had not noticed the commit. I just built and installed the Fedora 3.20 at-spi2-core packages after having applied your patch. I can no longer reproduce the crash using the listener I attached. So I think you fixed it too. Thanks!! As for the recursion, I'm fine with there being a new method created and doing the version check in Orca.
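The arrangement discussed in the last two comments could look roughly like the sketch below. Everything here is hypothetical: atspi_clear_cache_full() was only proposed in this thread, so its signature is an assumption, `FakeAccessible` is a stand-in object, and the caller uses feature detection (`getattr`) in place of the version check mentioned above.

```python
from types import SimpleNamespace


class FakeAccessible:
    """Stand-in object with a cache flag and children; not a real AT-SPI type."""

    def __init__(self, children=()):
        self.children = list(children)
        self.cache_valid = True


def clear_cache_full(obj, recurse):
    """Assumed shape of the proposed call: the caller decides whether to recurse."""
    obj.cache_valid = False
    if recurse:
        for child in obj.children:
            clear_cache_full(child, True)


def clear_cache(obj):
    """Existing entry point keeps its recursive behavior in every version."""
    clear_cache_full(obj, recurse=True)


def clear_window_cache(window, api):
    """Caller side: prefer the non-recursive call when the library offers it."""
    full = getattr(api, "clear_cache_full", None)
    if full is not None:
        full(window, False)
        return "shallow"
    api.clear_cache(window)  # older library: only the recursive call exists
    return "recursive"


# Model a newer library that has both calls and an older one with only clear_cache.
new_api = SimpleNamespace(clear_cache=clear_cache, clear_cache_full=clear_cache_full)
old_api = SimpleNamespace(clear_cache=clear_cache)

child = FakeAccessible()
window = FakeAccessible([child])
print(clear_window_cache(window, new_api), child.cache_valid)  # shallow True
print(clear_window_cache(window, old_api), child.cache_valid)  # recursive False
```

Keeping the old function's behavior unchanged and adding a second entry point means existing callers see no difference between library versions, which is the concern raised above; only callers that opt in get the cheaper, non-recursive clear.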
Ping? I just read an announcement on the Orca list in which an Orca crash is <fingerquotes>fixed</fingerquotes> by removing Orca's call to clear cache on window objects and then adding a new option to re-enable clearing cache. That seems like 1000 different kinds of sad -- not to mention introducing bugs as a result of Orca having stale information about windows.
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/at-spi2-core/-/issues/ Thank you for your understanding and your help.