GNOME Bugzilla – Bug 354467
[blocked] Gtk-based Python applications can be unresponsive with Orca
Last modified: 2007-10-01 15:46:00 UTC
Please describe the problem:
When you use Freeloader to download a file, Orca hangs as soon as you "open" the URL for the file.

Steps to reproduce:
1. Launch Orca
2. Launch Freeloader
3. Press Control O for the Open URL dialog
4. Enter a URL and press Enter for the OK button

Actual results:
Freeloader tries to connect but never succeeds and Orca hangs. After killing Freeloader, Orca resumes working.

Expected results:
Freeloader would connect successfully (it does when Orca is not running) and Orca would not hang.

Does this happen every time?
Yes.

Other information:
Orca 1.0, latest Edgy.
Created attachment 72258 [details] Debug.out as requested by Rich
Thanks Joanie.
Hi Joanie: I'm seeing something similar with Freeloader on Ubuntu. I'm looking into it, and have no good insight yet. I want to get to the bottom of this before GNOME 2.17.1, though (see http://live.gnome.org/TwoPointSeventeen).
I've been digging into this some more. I think freeloader may be behaving a bit badly here: we're seeing the hang when we attempt to return from handling a keyboard event. Here are the last lines of a trace that shows where Orca is hanging at the end of atspi.py:notifyEvent (the keystroke listener):

    TRACE orca.atspi:377: if settings.timeoutCallback and (settings.timeoutTime > 0):
    TRACE orca.atspi:378: signal.alarm(0)
    TRACE orca.atspi:380: return consumed

What we're seeing here is that we've turned off the hang detection code (the call to signal.alarm(0) does this), so we're not detecting the hang on return. There's little we can do here. The next thing to do is to try to figure out why this is happening. I'm not comfortable with pointing fingers just yet, but it looks like freeloader might be getting itself into a deadlock. :-(

We might be able to approach hang detection in Orca a little bit differently, however, by doing it in one spot rather than peppering it around the code. I'm not 100% sure, but I think the spot to do this might be an "unrolled" bonobo.main in atspi.py. We'd essentially surround the context.iteration call with calls to the signal handling code:

    while self.running:
        if settings.gilSleepTime:
            time.sleep(settings.gilSleepTime)
        if settings.timeoutCallback and (settings.timeoutTime > 0):
            signal.signal(signal.SIGALRM, settings.timeoutCallback)
            signal.alarm(settings.timeoutTime)
        context.iteration(False)
        if settings.timeoutCallback and (settings.timeoutTime > 0):
            signal.alarm(0)

I'm still not sure, however, that this will detect the hang on returning from keystroke handling code. It might, though, and it might be worth a try. Note that the solution here will only end up in a restart of Orca and recover from the hang. It will not solve the overall problem of trying to work well with freeloader.
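For reference, the "arm the alarm, run one iteration, disarm the alarm" idea can be sketched outside of Orca roughly as follows. This is a minimal sketch, not Orca's code: HangDetected, guarded_iteration and stuck_iteration are hypothetical names, and it assumes a Unix system with the loop running in the main thread.

```python
import signal
import time


class HangDetected(Exception):
    """Raised by the alarm handler when one iteration overruns its budget."""


def _on_timeout(signum, frame):
    # Hypothetical timeout callback; a real one might restart the process.
    raise HangDetected("iteration exceeded its time budget")


def guarded_iteration(iterate, timeout_seconds):
    """Run one loop iteration under a SIGALRM watchdog (Unix, main thread only)."""
    previous = signal.signal(signal.SIGALRM, _on_timeout)
    signal.alarm(timeout_seconds)      # arm the watchdog
    try:
        return iterate()
    finally:
        signal.alarm(0)                # disarm the watchdog
        if previous is not None:
            signal.signal(signal.SIGALRM, previous)


def stuck_iteration():
    # Stand-in for an iteration that blocks far longer than expected.
    time.sleep(3)
```

One caveat with this approach: Python runs signal handlers only when the interpreter regains control in the main thread. If the main thread is blocked inside a C-level call that never returns, the handler never gets a chance to run, which may be one reason a hang like this goes undetected.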
To follow up more on this - I tried isolating the hang detection code to the point around the context.iteration call and things still hang without us being able to detect it. This may end up being an unfortunate "will not fix," but I'll keep digging for a little while longer.
Add accessibility keyword. Apologies for spam.
Well...I dug and dug on this one and could not come up with any great solution. However, I think this may be one of a class of bugs that we deal with when we process events on objects that have "gone away." This can happen in cases where a window is destroyed and we later refer to that window or objects under that window's hierarchy. An ideal distributed system would throw an exception for this kind of thing, and an ideal distributed system would also provide timeouts for remote method calls. But...I'm not sure we're dealing with an ideal distributed system. ;-)

A possible solution (not completely thought out yet) here is this: In atspi.py, keep a dictionary where a key is a window and the value for the key is a list of accessibles under that window's hierarchy. We can add to this list in something like the atspi.py:makeAccessible method. That is, search for the window for a new accessible by looking upward in the hierarchy and add the new accessible to the window's list. In atspi.py, we can also register for window:destroy events. When we receive these, we can mark each accessible in the window's list as invalid (obj.valid = False). Then, in focus_tracking_presenter.py:_processObjectEvent, we ignore events whose sources are objects that are either invalid or DEFUNCT. This might work, but it only works for accessibles that we have a priori knowledge of.

Alternatively, we could add a 'window' field to all accessibles in the makeAccessible method (i.e., search up the hierarchy until we get to the child just below the app; there's a new method in util.py called getTopLevel that does this now). By tracking window:destroy events, we could mark each destroyed window as invalid. Then, in focus_tracking_presenter.py:_processObjectEvent, we could ignore events whose sources are objects whose windows are invalid.

Something like that. Like I said, not very well thought out, but I wanted to get the idea written down before I won PowerBall.
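To make the first idea concrete, here is a rough, self-contained sketch of the bookkeeping. It does not use the real atspi.py classes: Accessible, WindowRegistry and get_top_level are hypothetical stand-ins, and the hierarchy is assumed to be app -> window -> descendants.

```python
class Accessible:
    """Hypothetical stand-in for an accessible wrapper object."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.valid = True


def get_top_level(obj):
    """Walk up the hierarchy to the child just below the application."""
    while obj.parent is not None and obj.parent.parent is not None:
        obj = obj.parent
    return obj


class WindowRegistry:
    """Maps each window to the accessibles known to live under it."""

    def __init__(self):
        self._by_window = {}

    def track(self, obj):
        # Would be called from something like makeAccessible.
        window = get_top_level(obj)
        self._by_window.setdefault(window, []).append(obj)

    def on_window_destroy(self, window):
        # Would be called from a window:destroy listener.
        for obj in self._by_window.pop(window, []):
            obj.valid = False
        window.valid = False

    def should_process(self, event_source):
        # Events from invalidated sources are ignored.
        return event_source.valid
```

As noted in the comment, this only protects against accessibles that were tracked before the window went away; an object first seen after the destroy event would still look valid.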
Created attachment 78398 [details] [review] First cut at a patch to try to fix this bug. For grins, I had a go at making a patch to implement the second approach you suggested. I don't think it's working yet. Freeloader seemed to still hang, but Alt-Tab at that time would allow me to get focus to a gnome-terminal window. What's an example of a valid URL to enter in the Location: field of freeloader's Open URL dialog? Just curious.
Anything you can grab via ftp, e.g.: ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-3.0a1.en-US.linux-i686.tar.bz2
Created attachment 82099 [details] Orca debug.out from recent Orca run trying to download the firefox ftp URL given in comment #9 I tried giving this another run with setting the comm failure limit to 1 and with the recent changes to INVALIDate events if the event.source.valid field wasn't set. No real improvement. The desktop is still hung. Like Will, I too wonder whether the Freeloader folks are doing all that is required to make their Python application properly multi-threaded...
Created attachment 83681 [details] [review] [Bogus] patch to Freeloader to prevent the hang.

I downloaded the source code for Freeloader v0.3 from: http://www.ruinedsoft.com/freeloader/freeloader-0.3.tar.bz2

I did the following:
1/ Started Orca
2/ Started Freeloader
3/ Pressed F10 and selected "Open URL..." from the File menu.
4/ Gave it the URL of: http://www.ruinedsoft.com/freeloader/freeloader-0.3.tar.bz2
5/ Pressed Tab twice to get focus to the OK button and pressed Return.

Freeloader will then hang until Orca is terminated.

I investigated the code and found that when you hit Return with focus on the OK button, it calls the start_url() routine in .../src/freeloader.in. That in turn calls:

    dl = webdl.WebDL(self.lstore, row, src, des, final, self.watch_file, self.print_log)
    dl.start()

This is in the webdl.py source file. You'll notice that it passes in self.print_log as the last parameter. This is a routine in the freeloader.in file (in the Freeloader class, to be exact). In webdl.py, the run() method (called when you do "dl.start()") has the lines:

    gtk.gdk.threads_enter()
    self.print_log("Starting Download: " + self.get_tail(), "arrows-down")
    gtk.gdk.threads_leave()

This is calling back into a method in the Freeloader class that looks like:

    def print_log(self, msg, pb=None):
        if self.error_flag == 0 and pb == self.stock_error:
            self.log_toggle.show()
            self.error_flag = 1
            self.log_image.set_from_stock(self.stock_error, gtk.ICON_SIZE_MENU)
        row = self.logstore.append([pb, time.strftime("%r"), msg, time.time()])
        #adj = self.log_tv.get_parent().get_vadjustment()
        #adj.set_value(adj.upper)
        return row

This is where it's hanging. If I apply the patch in the attachment (in other words, just return when we enter this routine), then Freeloader doesn't hang and nicely downloads the requested file. So that's the problem. I don't know the threading magic calls to fix it though.
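One commonly used alternative to taking the GDK lock from a worker thread is to let the worker only post messages, and have the main loop thread apply them to the widgets. The sketch below is toolkit-free and has not been tried against Freeloader, so whether it would avoid this particular hang is an open question. LogModel, download_worker and drain are hypothetical names; in a PyGTK application the drain step would typically be scheduled on the main loop with something like gobject.idle_add or gobject.timeout_add.

```python
import queue
import threading


class LogModel:
    """Hypothetical stand-in for a list store owned by the GUI thread."""

    def __init__(self):
        self.rows = []
        self.owner = threading.current_thread()

    def append(self, row):
        # Refuse updates from any thread other than the one that owns the model.
        if threading.current_thread() is not self.owner:
            raise RuntimeError("model touched off the GUI thread")
        self.rows.append(row)


def download_worker(name, pending):
    """Worker thread body: never touches the model, only posts messages."""
    pending.put("Starting Download: " + name)
    # ... the actual download would happen here ...
    pending.put("Finished Download: " + name)


def drain(pending, model):
    """Runs on the GUI thread and applies all queued log messages."""
    while True:
        try:
            msg = pending.get_nowait()
        except queue.Empty:
            return
        model.append(msg)
```

The design choice here is that only one thread ever touches GUI-owned state, so no toolkit lock needs to be held while calling back into the application.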
I think we should close this bug since it does not seem to be a bug in Orca (is that correct, Rich)? We should, however, capture this investigation on the Orca WIKI. We might consider putting it in http://live.gnome.org/Orca/Freeloader, and link to it from http://live.gnome.org/Orca/AccessibleApps, clearly marking it as "inaccessible." Thoughts?
I've seen other Python-based apps besides Freeloader display blank windows and appear to lock up. Synaptic Package Manager does it for me all the time. The only way to clear it is to kill (or pkill) that application. I agree, the bug isn't in Orca.
> I think we should close this bug since it does not seem to be a bug in Orca (is
> that correct, Rich)? We should, however, capture this investigation on the
> Orca WIKI. We might consider putting it in
> http://live.gnome.org/Orca/Freeloader, and link to it from
> http://live.gnome.org/Orca/AccessibleApps, clearly marking it as "inaccessible"

WIKI updated. I'm going to add Gustavo to the interest list on this bug, though. Perhaps he might have some insight into whether this is a generic PyGtk problem or something isolated to Freeloader.
Created attachment 88702 [details] [review] possible freeloader patch In my opinion it is a freeloader threading bug. It should call gtk.gdk.threads_enter() before entering gtk.main(). But I didn't test the patch; it could uncover other related threading bugs.
Regarding hanging on remote calls, pyorbit 2.14.x supports async calls, which might prove useful to prevent hanging. For an example see: http://svn.gnome.org/viewcvs/pyorbit/trunk/examples/echo/echo-client-async.py?revision=132&view=markup
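As a generic illustration of putting a time limit on a call that might never return, here is a small sketch using only the standard library. It is not the pyorbit async API referenced above; call_with_timeout and CallTimedOut are hypothetical names. Note that the blocked call is abandoned in a daemon thread, not cancelled.

```python
import threading


class CallTimedOut(Exception):
    """Raised when the wrapped call does not finish within the time limit."""


def call_with_timeout(func, timeout, *args, **kwargs):
    """Run func in a daemon thread and give up waiting after `timeout` seconds."""
    result = {}

    def worker():
        try:
            result["value"] = func(*args, **kwargs)
        except Exception as exc:
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        # The call is still blocked; stop waiting for it.
        raise CallTimedOut("call did not return within %s seconds" % timeout)
    if "error" in result:
        raise result["error"]
    return result["value"]
```

A caller could catch CallTimedOut and treat the remote object as unresponsive instead of blocking indefinitely.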
(In reply to comment #15)
> Created an attachment (id=88702) [edit]
> possible freeloader patch
>
> In my opinion it is a freeloader threading bug. It should call
> gtk.gdk.threads_enter() before entering gtk.main(). But I didn't test the
> patch; it could uncover other related threading bugs.

Thanks Gustavo! We definitely agree with you that this is a bug somewhere in the freeloader application. I tried the patch and things still hang. I also tried putting enter/leave calls in the print_log(self, msg, pb=None) method, but that didn't help, either. :-(

The reason we're hanging on to this one with such tenacity is that we seem to run into hanging problems a lot with Python-based GTK applications. If there were a magic 'fix the hangs' pill that could be swallowed somewhere lower in the stack, it would have a positive impact across the board.
Marking this one as [blocked] since we're pretty sure it is a freeloader bug. I'm searching for a place to file a bug against freeloader, but it seems as though http://www.ruinedsoft.com/freeloader/ may no longer exist.
Removing target milestone from [blocked] bugs. We have little control over them, so we're better off letting priority and severity be our guide for poking the related components.
We've noticed similar problems with Synaptic Package Manager and Update Manager, so I'm adjusting the summary of this bug accordingly. I think some work was done in PyGtk for a recent GNOME 2.19.x release, so it might help fix these problems.
I'm obsoleting the first two patches. The [bogus] patch is obvious. The one implementing the suggestion in comment #7 was tried and was an interesting test, but I think it didn't end up fixing the issue.