Bug 397787 – Orca non-responsive if OOo goes into crash recovery mode.

Summary:	Orca non-responsive if OOo goes into crash recovery mode.


Status:	RESOLVED FIXED

Product:	orca
Classification:	Applications
Component:	general
Version:	2.17.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Orca Maintainers
QA Contact:	Orca Maintainers

URL:
Whiteboard:

Depends on:
Blocks:	404411

Reported:	2007-01-17 21:45 UTC by Rich Burridge
Modified:	2007-02-11 19:55 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Orca debug output generated whilst reproducing this problem. (282.67 KB, text/plain) 2007-01-17 21:48 UTC, Rich Burridge
New debug.out Orca output whilst crashing with new test case. (14.52 KB, application/x-gzip) 2007-02-05 23:03 UTC, Rich Burridge
Orca debug.out for a "successful" crash recovery of an OOo document after an OOo Writer crash. (290.39 KB, text/plain) 2007-02-06 18:27 UTC, Rich Burridge
Patch that I thought should have worked but didn't. (1.47 KB, patch) 2007-02-07 17:57 UTC, Rich Burridge (review status: none)
Patch version of what I was describing :-) (4.02 KB, patch) 2007-02-07 19:26 UTC, Joanmarie Diggs (IRC: joanie) (review status: none)
Revised patch based on Will's feedback (4.47 KB, patch) 2007-02-08 20:35 UTC, Joanmarie Diggs (IRC: joanie) (review status: committed)
Orca debug output (at INFO level) during a crash of OOo Writer with latest patch applied. (31.51 KB, text/plain) 2007-02-08 23:08 UTC, Rich Burridge

Description Rich Burridge 2007-01-17 21:45:45 UTC

Steps to reproduce:

1/ Start Orca
2/ Start OOo Calc or Writer and force it to crash. One way of doing
   this with OOo 2.0 Calc is to type Insert/Caps-Lock-f in a spread
   sheet cell.
3/ The OOo Start Error Recovery dialog comes up. Orca goes silent.
4/ If you hit Return to get it past that first Error recovery frame,
   the subsequent frames seem to be accessible.

Comment 1 Rich Burridge 2007-01-17 21:48:31 UTC

Created attachment 80550
Orca debug output generated whilst reproducing this problem.

See the start of the COMM FAILURES about line 1415.

This was all done under Ubuntu Edgy with OOo v2.0. If anybody
can think of a reliable way to get OOo v2.1 to go into error
recovery mode, I'll try to reproduce this under Ubuntu Feisty.
I'm pretty sure it'll be something similar.

Comment 2 Joanmarie Diggs (IRC: joanie) 2007-01-17 22:09:28 UTC

Hmmmm.... Do you happen to know if Bill's patch for bug #387960 made it into Feisty yet?  Prior to that patch, the stand-alone script you wrote for that bug would reliably crash OOo Writer 2.1 and send it into recovery mode.  

I'm afraid my laptop is patched and my Feisty desktop is being wiped as we speak.  If you have an Edgy box with OOo 2.1, your stand-alone script can be used to crash it....

Comment 3 Rich Burridge 2007-01-17 22:38:12 UTC

Good suggestion. Looks like it's been fixed though. I just ran the
script on my Feisty machine with OOo v2.1 Writer and it no crashee.

Comment 4 Rich Burridge 2007-02-05 19:42:43 UTC

This one is proving difficult to test as I no longer have a reliable 
way to crash OOo. Joanie, have you seen any new behaviour with the 
early builds of OOo v2.2 that might help here? ...

Comment 5 Joanmarie Diggs (IRC: joanie) 2007-02-05 20:31:46 UTC

Rich, why didn't we think of this sooner?  Go browsing the issues list for crashers.  Here's a nice one:  

http://www.openoffice.org/issues/show_bug.cgi?id=72480

I just downloaded the sample document and followed the instructions, backspacing over the text in the object from the end and sure enough, kablooey!

Comment 6 Rich Burridge 2007-02-05 23:03:45 UTC

Created attachment 81961
New debug.out Orca output whilst crashing with new test case.

Thanks Joanie. That does indeed nicely crash OOo Writer.
Orca is mostly silent whilst this is happening. I've
attached a gzipped compressed version of the Orca
debug.out.

To be further investigated...

Comment 7 Rich Burridge 2007-02-05 23:10:45 UTC

Ugh. Yet again I've had a failure when trying to do an:

% sudo apt-get dist-upgrade

on my Ubuntu Feisty system:

$ sudo apt-get dist-upgrade
Password:
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Calculating upgrade... Done
The following NEW packages will be installed:
  gdebi-core libavahi-core5 libgs-esp8
The following packages have been kept back:
  gnome-app-install tomboy
The following packages will be upgraded:
  app-install-data apport apport-gtk avahi-autoipd avahi-daemon bind9-host
  capplets-data cdrecord dnsutils fastjar gdebi genisoimage
  gnome-control-center gnome-power-manager gs-esp hwdb-client-common
  hwdb-client-gnome intltool libasound2 libasound2-dev libavahi-client-dev
  libavahi-client3 libavahi-common-data libavahi-common-dev libavahi-common3
  libavahi-glib-dev libavahi-glib1 libbind9-0 libdns22
  libgnome-window-settings1 libgtksourceview-common libgtksourceview-dev
  libgtksourceview1.0-0 libisc11 libisccc0 libisccfg1 liblwres9 libslab0
  libsoup2.2-8 libuniconf4.2 libwvstreams4.2-base libwvstreams4.2-extras
  linux-restricted-modules-2.6.20-6-generic linux-restricted-modules-common
  mkisofs popularity-contest python-apport python-apt python-problem-report
  ttf-dejavu ubuntu-docs update-manager update-notifier util-linux-locales
  wodim
55 upgraded, 3 newly installed, 0 to remove and 2 not upgraded.
7 not fully installed or removed.
Need to get 0B/35.8MB of archives.
After unpacking 3281kB of additional disk space will be used.
Do you want to continue [Y/n]? Y
Extracting templates from packages: 100%
Preconfiguring packages ...
Setting up python2.5-minimal (2.5-5ubuntu5) ...
Linking and byte-compiling packages for runtime python2.5...
pycentral: pycentral rtinstall: package python-cairo: already exists: /usr/lib/python2.5/site-packages/cairo/__init__.py
pycentral rtinstall: package python-cairo: already exists: /usr/lib/python2.5/site-packages/cairo/__init__.py
dpkg: error processing python2.5-minimal (--configure):
 subprocess post-installation script returned error exit status 1
Errors were encountered while processing:
 python2.5-minimal
E: Sub-process /usr/bin/dpkg returned an error code (1)

and this is now apparently preventing me from doing an:

% svn update

of Orca to the latest version:

$ svn update
Skipped '.'

Sigh. I'll try again tomorrow.

Comment 8 Joanmarie Diggs (IRC: joanie) 2007-02-05 23:32:54 UTC

I've gotten that sometimes lately w.r.t. Orca.  It seems that python2.5-minimal is not all that fond of unexpected guests in /usr/lib/python2.5/site-packages.  Any chance you built cairo yourself?

When this occurs with Orca, I find that removing /usr/lib/python2.5/site-packages/Orca will convince python2.5-minimal to be installed.  Then I just reinstall Orca.  In your case, it looks like that cairo dir may have to (temporarily) go.

HTH

Comment 9 Rich Burridge 2007-02-06 00:38:41 UTC

That was indeed it. I had to:

% cd /usr/lib/python2.5/site-packages
% sudo rm -rf cairo
% sudo rm -rf orca

Thanks. Hopefully if this is a problem for somebody else, they can google
for an answer and arrive here.

So for all those people: "Hi there!".

Comment 10 Rich Burridge 2007-02-06 18:27:29 UTC

Created attachment 82035
Orca debug.out for a "successful" crash recovery of an OOo document after an OOo Writer crash.

I tried an interesting experiment. I adjusted:

settings.commFailureAttemptLimit = 1

and reran the steps to cause OOo Writer to crash in OOo issue #72480
http://www.openoffice.org/issues/show_bug.cgi?id=72480

What I now found was that Orca was responsive and spoke the various
OOo popup dialogs that are part of the crash recovery.

So the question now, is what should be done? Should we:

1/ Change the default for this setting in settings.py to always be 1.
2/ Customize the StarOffice.py script for this setting?
3/ None of the above.

Thoughts...
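
For readers following along, here is a minimal sketch of why the retry limit matters when the application is already dead. It is not Orca's actual event loop: `CommFailure`, `process_event` and `dead_app_handler` are hypothetical stand-ins, and the limit of 5 is taken from the value mentioned later in this thread.

```python
class CommFailure(Exception):
    """Hypothetical stand-in for the COMM_FAILURE raised when an app has died."""


def process_event(handler, attempt_limit):
    """Call handler until it succeeds or attempt_limit attempts have failed.

    Returns the number of attempts that were made.
    """
    attempts = 0
    while attempts < attempt_limit:
        attempts += 1
        try:
            handler()
            break
        except CommFailure:
            continue  # the application might recover, so try again
    return attempts


def dead_app_handler():
    # A crashed application never answers, so every attempt fails.
    raise CommFailure()


attempts_with_five = process_event(dead_app_handler, attempt_limit=5)
attempts_with_one = process_event(dead_app_handler, attempt_limit=1)
```

With a limit of 1, the failing event is abandoned after a single attempt, so the events from the recovery dialogs that follow are reached sooner.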

Comment 11 Willie Walker 2007-02-06 20:58:19 UTC

Interesting stuff.  It definitely makes sense that the COMM_FAILUREs are occurring (I think), since OOo just crashed.  The main reason the retry stuff is in there is due to bizarro oddness with the Java platform.  It seemed to have intermittent failures, and a retry would give it opportunity to recover.

I'm not sure I have any words of wisdom right now (I'm jet lagged).  But...

If you go the "always 1" route, what is the impact on the Java platform?

If you go the "Customize the StarOffice.py script for this setting" route, how will it work?  Will it use the per-script settings stuff you put in place?  How will that work in the case of COMM_FAILUREs?

For the "None of the above" route, do you think the thoughts in http://bugzilla.gnome.org/show_bug.cgi?id=400763#c10 might work here?

Comment 12 Rich Burridge 2007-02-06 22:26:58 UTC

> If you go the "always 1" route, what is the impact on the Java platform?

I'll have to get Lynn to answer that. After numerous attempts, I'm still
unable to get the GNOME Java Access Bridge to build for me. Lynn?

> If you go the "Customize the StarOffice.py script for this setting" route, how
> will it work?  Will it use the per-script settings stuff you put in place?  How
> will that work in the case of COMM_FAILUREs?

I'd just set the settings variable in the __init__ method of the StarOffice.py
script. Something like:

Index: StarOffice.py
===================================================================
--- StarOffice.py       (revision 1983)
+++ StarOffice.py       (working copy)
@@ -620,11 +620,15 @@
         #
         self.currentParagraph = None
 
+        # Set the number of retries after a COMM_FAILURE to 1.
+        #
+        settings.commFailureAttemptLimit = 1
+
         # The default set of text attributes to speak to the user. The
         # only difference over the default set in settings.py is to add
         # in "left-margin:" and "right-margin:".
 
-        settings.enabledTextAttributes = "size:; family-name:; weight:400; indent:0; left-margin:0; right-margin:0; underline:none; strikethrough:false; justification:left; style:normal;"
+        settings.enabledTextAttributes = "size:; family-name:; weight:400; paragraph-style:Default; indent:0; left-margin:0; right-margin:0; underline:none; strikethrough:false; justification:left; style:normal;"
 
     def getBrailleGenerator(self):
         """Returns the braille generator for this script.

> For the "None of the above" route, do you think the thoughts in
> http://bugzilla.gnome.org/show_bug.cgi?id=400763#c10 might work here?

No, I'm not convinced that kind of a change would help here. It's a
COMM_FAILURE at a different point. The fact that by setting the retry limit
down to 1 allows us to nicely continue, suggests that this is a different
problem.

Comment 13 Willie Walker 2007-02-07 08:36:09 UTC

> I'd just set the settings variable in the __init__ method of the StarOffice.py
> script.

The unfortunate thing about settings.py, however, is that it is a globally shared resource.  As such, any changes to it are seen by all modules.  The focus_tracking_presenter.py:loadAppSettings stuff you did a while back was a start at getting us away from this.  Will the loadAppSettings stuff work in this case?
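
A small illustration of the concern, assuming nothing about Orca's real classes: `settings` here is a `types.SimpleNamespace` standing in for the shared settings module, and both script classes are hypothetical.

```python
import types

# Stand-in for a settings module: one object shared by every importer.
settings = types.SimpleNamespace(commFailureAttemptLimit=5)


class StarOfficeScript:
    def __init__(self):
        # Meant as a StarOffice-only tweak, but it rewrites shared state.
        settings.commFailureAttemptLimit = 1


class GeditScript:
    def attempt_limit(self):
        # Reads the same shared object that StarOfficeScript modified.
        return settings.commFailureAttemptLimit


gedit = GeditScript()
limit_before = gedit.attempt_limit()
StarOfficeScript()  # creating the StarOffice script changes the global value
limit_after = gedit.attempt_limit()
```

Once the StarOffice script has been created, the unrelated script also sees a limit of 1.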

Comment 14 Rich Burridge 2007-02-07 15:38:44 UTC

> Will the loadAppSettings stuff work in this case?

I think so, but how would you set this for everybody?
Automatically populate each ~/.orca/app-settings/ directory
with a StarOffice.py that set this? What if the user already
had one?

> The unfortunate thing about settings.py, however, is that it is a globally
> shared resource.

This really does come back (IMHO) to the need for a __del__
(or whatever) method that can be added to scripts that gets
called when a script is being "unloaded".

If we had that, then the fix would simply be:

In __init__:

    self.savedCommFailureAttemptLimit = settings.commFailureAttemptLimit
    settings.commFailureAttemptLimit = 1


in __del__:

    settings.commFailureAttemptLimit = self.savedCommFailureAttemptLimit

Can we consider this for Orca 2.19.X ?

I'd also like to hear back from Lynn on whether globally setting this value
to 1 would have an adverse effect on Java.

Comment 15 Joanmarie Diggs (IRC: joanie) 2007-02-07 15:53:29 UTC

Silly question(s) time.  Feel free to ignore the really dumb ones and/or point me to what I should be reading to see the error of my ways.  :-) :-)

1.  If you put it in the __init__ and restore it in the __del__ won't that still change the value for all apps for the duration of using OOo/StarOffice?

2. Could you do something like add a value to the __init__ of script.py, say:

    self.commFailureAttemptLimit = 5

And then in StarOffice's __init__ override that with a

    self.commFailureAttemptLimit = 1
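
Spelled out, the idea in question 2 might look like the sketch below. The class names and constructor arguments are hypothetical (the real script.py and StarOffice.py take different arguments), and it uses present-day Python syntax.

```python
class Script:
    """Hypothetical base class standing in for the one in script.py."""

    def __init__(self, app_name):
        self.app_name = app_name
        # Default retry limit, stored per instance rather than in settings.
        self.commFailureAttemptLimit = 5


class StarOfficeScript(Script):
    """Hypothetical subclass standing in for the StarOffice.py script."""

    def __init__(self, app_name):
        super().__init__(app_name)
        # Override only for this script; other scripts keep the default.
        self.commFailureAttemptLimit = 1


gedit_script = Script("gedit")
ooo_script = StarOfficeScript("soffice")
```

Because the value lives on each instance, changing it for StarOffice leaves every other script's limit alone.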

Comment 16 Rich Burridge 2007-02-07 16:00:41 UTC

Hi Joanie,

> 1.  If you put it in the __init__ and restore it in the __del__ won't that
> still change the value for all apps for the duration of using OOo/StarOffice?

My assumption was that if you were processing an event for a different app
(say something running in the "background"), you'd still go through the:

* unload current script.
* load new script

for each event. 

> 2. Could you do something like add a value to the __init__ of script.py, say:
>
>    self.commFailureAttemptLimit = 5
> 
> And then in StarOffice's __init__ override that with a
>
>    self.commFailureAttemptLimit = 1

This idea looks promising. I'll give it a try this morning. Thanks!

Comment 17 Rich Burridge 2007-02-07 17:57:34 UTC

Created attachment 82093
Patch that I thought should have worked but didn't.

Well I'm confused. I expected the attached patch to work, but the
__init__ method in the StarOffice.py script is only being called once.
I gave focus back to a gedit window and the comm limit was nicely
reset to 5, but when I gave focus back again to the OOo Writer window,
I did not see it get reset again to 1. 

Will, are we caching script instances such that the script's __init__
method doesn't get recalled? If so, then the workaround might be to move this
logic to somewhere else (like locusOfFocusChanged() )...

Maybe the correct thing to do would be to just fix the problems on the Java
side such that we don't need to retry more than once.

Comment 18 Joanmarie Diggs (IRC: joanie) 2007-02-07 18:10:04 UTC

That's what I was asking about in question #1.

If instead of changing *settings*.commFailureAttemptLimit, you remove it from settings.py entirely and make it a property of each script via: *self*.commFailureAttemptLimit = 5, wouldn't that solve the problem?

Although I know what you mean about expecting something to work and finding that it fails miserably.  So perhaps I should shut up and try what I'm suggesting.  :-) I'll do that now.

Comment 19 Joanmarie Diggs (IRC: joanie) 2007-02-07 19:26:25 UTC

Created attachment 82104
Patch version of what I was describing :-)

What about this version, Rich?  Seems to solve the problem of the OOo Recovery dialog going silent.  I tried to also test it with Freeloader, but that problem has gotten far worse for me:  If Orca is running it becomes FreeNotLoader.  (I get a process, but no actual app in which to test the open URL commfailures.)

Comment 20 Joanmarie Diggs (IRC: joanie) 2007-02-07 19:57:07 UTC

p.s. When I say that the freeloader problem has gotten worse for me, I don't mean as a result of the patch I attached.   :-) At some point recently, at least on Feisty, freeloader stopped fully loading when Orca is running. :-(

Comment 21 Rich Burridge 2007-02-07 20:29:15 UTC

I just tried your patch. I checked out a totally clean
new orca from svn trunk/HEAD and applied the patch.

This seems to work very nicely. It has my vote for checking
in (modulo removing the debug print statements). Thanks!

Will, is this the way you'd like to see this fixed?

Comment 22 Willie Walker 2007-02-07 22:20:11 UTC

> Well I'm confused. I expected the attached patch to work, but the
> __init__ method in the StarOffice.py script is only being called once.
> I gave focus back to a gedit window and the comm limit was nicely
> reset to 5, but when I gave focus back again to the OOo Writer window,
> I did not see it get reset again to 1. 

__init__ is the Python class constructor, and is thus only called when a new instance of an object is created.  Thus, since OOo is only running once, __init__ will only get called once, regardless of whether you move focus from app to app.  That is, in Orca, a script will exist for the lifetime of the app. 
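
To make the caching behaviour concrete, here is a rough sketch with a hypothetical `ScriptCache`; Orca's real lookup lives in focus_tracking_presenter.py and is more involved than this.

```python
class Script:
    init_calls = 0  # counts how many times __init__ has run

    def __init__(self, app_name):
        Script.init_calls += 1
        self.app_name = app_name


class ScriptCache:
    """Hypothetical cache keeping one script instance per application."""

    def __init__(self):
        self._scripts = {}

    def get_script(self, app_name):
        # Create the script on first use, then hand back the same instance.
        if app_name not in self._scripts:
            self._scripts[app_name] = Script(app_name)
        return self._scripts[app_name]


cache = ScriptCache()
first = cache.get_script("soffice")   # focus on OOo: script is created
cache.get_script("gedit")             # focus moves to gedit
second = cache.get_script("soffice")  # focus returns: cached script reused
```

Returning focus to OOo reuses the existing instance, so any logic placed in `__init__` does not run a second time.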

> Will, are we caching script instances such that the script's __init__
> method doesn't get recalled? If so, then the workaround might be to move this
> logic to somewhere else (like locusOfFocusChanged() )...

This might be the place to do it.  The problem with settings is that we want some to be global across all scripts, and we want others to be local to a script.  I'm not sure we've thought about this enough to really decide how to handle that well.

> Maybe the correct thing to do would be to just fix the problems on the Java
> side such that we don't need to retry more than once.

:-)

Comment 23 Willie Walker 2007-02-07 22:26:00 UTC

(In reply to comment #19)
> Created an attachment (id=82104)
> Patch version of what I was describing :-)
> 
> What about this version, Rich?  Seems to solve the problem of the OOo Recovery
> dialog going silent.  I tried to also test it with Freeloader, but that problem
> has gotten far worse for me:  If Orca is running it becomes FreeNotLoader.  (I
> get a process, but no actual app in which to test the open URL commfailures.) 

I think this looks pretty good.  The only caution I see is that the active script can sometimes be None.  I think this might be the case if you start Orca with an empty desktop and nothing gets focus.  Then, when something like the clock in gnome-panel issues an event, we might run into a case where the new code gets executed with activeScript = None.

However, I think I may have actually fixed that a while back.  When I look at the activate method for focus_tracking_presenter.py, I see it has:

        orca_state.activeScript = self._getScript(None)

So...maybe I'm concerned about nothing.

Comment 24 Joanmarie Diggs (IRC: joanie) 2007-02-07 22:59:03 UTC

Given that I'm the new kid and learning this stuff as I go, I of course cannot say whether or not you are concerned about nothing.  My guess is that it's a valid issue, but one I don't (yet) fully comprehend. :-)

However, for the particular instance you describe:  When the clock issues an event, the event is associated with gnome-panel, so Orca goes looking for a gnome-panel.py, doesn't find one, and creates a new instance based on default.py, right?  (See snippet from debug.out below)  If so, then doesn't that new instance become activeScript before the event gets processed?

Put more generically, doesn't an (accessible) event necessarily result in there being an activeScript before _processObjectEvent does its thing?

Again, feel free to ignore the dumb questions. :-)

---------> QUEUEING EVENT object:property-change:accessible-name
DEQUEUED EVENT object:property-change:accessible-name <----------

vvvvv PROCESS OBJECT EVENT object:property-change:accessible-name vvvvv
OBJECT EVENT: object:property-change:accessible-name   detail=(0,0)
---------> QUEUEING EVENT object:property-change:accessible-name
    app.name='gnome-panel'        name='Wed Feb  7,  5:43 PM' role='toggle button' state='ENABLED FOCUSABLE SENSITIVE SHOWING VISIBLE' relations=''
Looking for script at orca-scripts.gnome-panel.py...
...could not find orca-scripts.gnome-panel.py
Looking for script at scripts.gnome-panel.py...
...could not find scripts.gnome-panel.py
Looking for toolkit script GAIL.py...
...could not find GAIL.py
NEW SCRIPT: gnome-panel (module=orca.default)
^^^^^ PROCESS OBJECT EVENT object:property-change:accessible-name ^^^^^

DEQUEUED EVENT object:property-change:accessible-name <----------

vvvvv PROCESS OBJECT EVENT object:property-change:accessible-name vvvvv
OBJECT EVENT: object:property-change:accessible-name   detail=(0,0)
    app.name='gnome-panel'        name='Wed Feb  7,  5:43 PM' role='toggle button' state='ENABLED FOCUSABLE SENSITIVE SHOWING VISIBLE' relations=''
^^^^^ PROCESS OBJECT EVENT object:property-change:accessible-name ^^^^^

Comment 25 Willie Walker 2007-02-08 17:53:35 UTC

No question is really dumb.  :-)

> However, for the particular instance you describe:  When the clock issues an
> event, the event is associated with gnome-panel, so Orca goes looking for a
> gnome-panel.py, doesn't find one, and creates a new instance based on
> default.py, right?  (See snippet from debug.out below)  If so, then doesn't
> that new instance become activeScript before the event gets processed?

A new script is definitely created if it doesn't exist.  For our definition in Orca, however, the "active" script is the script instance for the application that currently has keyboard focus.  As such, something like the clock applet will only become the active script when you figure out whatever keyboard gyrations are needed to tab or arrow to it and eventually make your way to it.

Now, for the patch in question, it may make sense to try to look at the script instance associated with the application that emitted the event. For example, you'll see:

s = self._getScript(event.source.app) 

inside the retry loop in default.py.  That script is probably the one whose retryCount should be used.  self._getScript, of course, can also result in a COMM_FAILURE.  So...we'd need to handle that situation as well, which means falling back to the default retry value, defeating the purpose of trying to limit the number of retries.
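
A sketch of that lookup-with-fallback, under stated assumptions: `attempt_limit_for`, `lookup` and `DEFAULT_ATTEMPT_LIMIT` are hypothetical names, and `CommFailure` stands in for the real COMM_FAILURE exception. This is not the committed patch.

```python
DEFAULT_ATTEMPT_LIMIT = 5  # assumed global default, used as the fallback


class CommFailure(Exception):
    """Hypothetical stand-in for COMM_FAILURE."""


class Script:
    def __init__(self, limit):
        self.commFailureAttemptLimit = limit


def attempt_limit_for(event_app, get_script):
    """Use the limit of the script for the app that emitted the event.

    Looking the script up can itself fail if the app has died, in which
    case the default limit is used instead.
    """
    try:
        return get_script(event_app).commFailureAttemptLimit
    except CommFailure:
        return DEFAULT_ATTEMPT_LIMIT


scripts = {"soffice": Script(1)}


def lookup(app):
    # Simulate an application that cannot be reached at all.
    if app == "dead-app":
        raise CommFailure()
    return scripts[app]


ooo_limit = attempt_limit_for("soffice", lookup)
dead_limit = attempt_limit_for("dead-app", lookup)
```

The second result shows the drawback described above: when the lookup itself fails, the code is back to the larger default.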

As Rich points out, maybe the right thing to do is fix whatever is causing intermittent COMM_FAILUREs when we talk with Java.

Comment 26 Joanmarie Diggs (IRC: joanie) 2007-02-08 20:35:27 UTC

Created attachment 82182
Revised patch based on Will's feedback

Thanks Will!!

> the "active" script is the script instance for the 
> application that currently has keyboard focus.  

Which means that we'll potentially be in trouble for COMM_FAILURES that occur in the background.... :-(

> Now, for the patch in question, it may make sense 
> to try to look at the script instance associated 
> with the application that emitted the event. 

Yup!  I thought I was doing that.  See previous comment. :-/  In this patch, thanks to Will's help via IM, I believe I really am doing it now. :-)

Rich, could you please give this a try when you get a chance?  Thanks in advance!

> As Rich points out, maybe the right thing to do 
> is fix whatever is causing intermittent COMM_FAILUREs
> when we talk with Java.

I'll leave that to y'all. :-)

Comment 27 Rich Burridge 2007-02-08 23:08:25 UTC

Created attachment 82191
Orca debug output (at INFO level) during a crash of OOo Writer with latest patch applied.

I took a fresh copy of Orca from svn trunk/HEAD, applied the patch.
I started up gedit and OOo Writer (with the crashes.odt document),
then ran Orca. I'd set the debug level to just INFO.

I followed the steps to reproduce the crash.

You can see from the debug output that it gave up the first COMM_FAILURE
after retrying once. At that point visually on the screen, OOo Writer
had gone and it looks like focus was on the gedit window. We then get
another COMM_FAILURE while processing: "window:activate". This one retries
5 times. But there is only one such COMM_FAILURE "series".

Aurally I nicely heard all the OOo crash recovery dialogs as Writer went
through the process of recovering the crashed document.

It seems we have something here that's a great improvement over 
what was there. Personally I think it's good enough to check in,
but I'll let Mike make the call on that one.

Comment 28 Joanmarie Diggs (IRC: joanie) 2007-02-11 18:18:20 UTC

Will and I were just chatting and decided to go ahead and check it in.  Presumably this bug can be closed as FIXED now.

Comment 29 Joanmarie Diggs (IRC: joanie) 2007-02-11 19:55:06 UTC

Closing as FIXED per Will and Rich.