GNOME Bugzilla – Bug 491756
Performance Metabug
Last modified: 2013-01-02 15:53:53 UTC
This bug is used for tracking all performance problems with Orca: speed, cpu, etc.
Created attachment 99159 [details] [review] reduce getApplication() calls

This patch utilizes the host_application field in events to get the source's application. We do this for almost every event, so it could help.
Created attachment 99161 [details] [review] Make a ring buffer composed of past locus of focuses

This patch adds a ring buffer of five elements that keeps a reference to the last five locus of focuses. This helps with pyatspi's caching scheme, which is based on references to local accessible objects. This should help improve the caching of context.
(In reply to comment #1)
> Created an attachment (id=99159) [edit]
> reduce getApplication() calls
>
> This patch utilizes the host_application field in events to get the source's
> application. We do this for almost every event, so it could help.

This looks good to me. The assumption being made is that since people will need the latest pyatspi to work with Orca, we will definitely be getting the 'new' style of AT-SPI events that have the host_application field, right?
> This patch adds a ring buffer of five elements that keeps a reference to the
> last five locus of focuses. This helps with pyatspi's caching scheme which is
> based on references to local accessible objects. This should help improve the
> caching of context.

Ha! This one also squeaks in the "<= 256" patch. I think that one should go in regardless, though it probably should be "<= 255" or "< 256" so we're only creating masks with 8 significant bits.

For the ring buffer, it would be cool if we could add an orca.settings property that we could monkey with to set the size of the list rather than hardcoding it to 4. That way, we might be able to quickly experiment with how different numbers help us and/or whether or not the ring buffer idea is even useful.
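As a side note on the mask boundary: a quick sketch (illustrative, not Orca code) of why 255, not 256, is the largest value that fits in an 8-bit mask:

```python
# A value fits in an 8-bit mask only if masking with 0xFF leaves it
# unchanged; 256 (0b100000000) needs a ninth bit, so "<= 256" is off by one.
def fits_in_8_bits(value):
    return value == (value & 0xFF)

print(fits_in_8_bits(255))  # True:  0b11111111
print(fits_in_8_bits(256))  # False: 0b100000000 has 9 significant bits
```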
> This looks good to me. The assumption being made is that since people will
> need the latest pyatspi to work with Orca, we will definitely be getting the
> 'new' style of AT-SPI events that have the host_application field, right?

No assumptions are really made: it uses host_application if it is not None, and falls back to getApplication() if it is. The host_application field has been in pyatspi Event wrappers for as long as I can remember.
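For illustration, the fallback Eitan describes can be sketched like this (the attribute names mirror pyatspi's Event wrapper, but this simplified function is not the actual patch):

```python
# Prefer the event's host_application field and fall back to the expensive
# getApplication() traversal only when it is None.
def get_application(event):
    app = getattr(event, "host_application", None)
    if app is not None:
        return app
    # Fallback: walk up from the event source (a round-trip over AT-SPI).
    return event.source.getApplication()
```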
> Ha! This one also squeaks in the "<= 256" patch. I think that one should go
> in regardless, though it probably should be "<= 255" or "< 256" so we're only
> creating masks with 8 significant bits.

I'll check that in today.

> For the ring buffer, it would be cool if we could add an orca.settings
> property that we could monkey with to set the size of the list rather than
> hardcoding it to 4. That way, we might be able to quickly experiment with how
> different numbers help us and/or whether or not the ring buffer idea is even
> useful.

I'll do that, and attach a new patch.
Created attachment 99178 [details] [review] Limit amount of script listeners

This patch assigns specific listeners for state-changed events. Also, every listener that was assigned to noOp now only gets assigned if the debug level is high enough.
Created attachment 99179 [details] [review] Make a ring buffer composed of past locus of focuses

This takes out the unrelated mask change above. It also adds a setting, settings.focusHistoryLength, for adjusting the buffer size.
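The idea can be sketched with Python's collections.deque (settings.focusHistoryLength is the real setting name from the patch; the FocusHistory class here is illustrative):

```python
from collections import deque

# Keeping strong references to the last N focused accessibles keeps them
# alive in pyatspi's reference-based cache, so re-querying them is cheap.
class FocusHistory:
    def __init__(self, length=5):
        # deque with maxlen silently evicts the oldest entry when full.
        self._buffer = deque(maxlen=length)

    def push(self, accessible):
        self._buffer.append(accessible)

    def __len__(self):
        return len(self._buffer)

history = FocusHistory(length=5)
for obj in ["a", "b", "c", "d", "e", "f"]:
    history.push(obj)
# Only the five most recent references are retained; "a" has been evicted.
```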
Created attachment 99183 [details] Test results

These are the numbers from Orca running three consecutive tests from gtk-demo: role_menu.py, role_tree_table.py and role_combo_box2.py. I compared pre-pyatspi Orca, Orca from trunk (with the new pyatspi caching changes), and Orca with all of the patches in this bug report. This is the sum of call counts/times:

              calls   time (s)
Pre-pyatspi   24541   3.5692630000
Orca trunk    44009   5.9319880000
Orca patched  18899   2.9330390000
Created attachment 99184 [details] Test results

Disregard the last results. It was my first attempt, so of course I failed. Anyway, here are more "sane" results:

              calls   time (s)
Pre-pyatspi   24541   3.5692630000
Orca trunk    44009   5.9319880000
Orca patched  29473   3.8884460000
Two patches look great. I need to look at the other one, which is probably the most important one, just a little more. The main concerns I have with it are:

1) It modifies behavior based upon debug level. I think it's probably OK to do this, but I wonder if we might end up in situations where we might be confused about getting/handling events one way, but not the other. I wonder if maybe we need a different setting (e.g., settings.optimizeEventListeners or something like that) to do the selective listening?

2) I don't see a way to turn on listening for all object:state-changed events -- we also use the Orca debug logs as a means to help reverse engineer what the application is giving us, so it is nice to get as much as we can when we want it. Is there a way for us to listen for all object:state-changed events based upon a debug level setting?

Anyhow, getting us faster is a good thing. :-) Good stuff.
Created attachment 99806 [details] [review] Limit listeners

The new patch introduces settings.listenAllEvents; if set to True, it will listen to all the useless events, except for "mouse:*" and whatever is in settings.ignoredEventsList. This should address both of the issues you brought up, Will.
(In reply to comment #12)
> Created an attachment (id=99806) [edit]
> Limit listeners
>
> The new patch introduces settings.listenAllEvents, if set to True it will
> listen to all the useless events, except for "mouse:*" and whatever is in
> settings.ignoredEventsList.
>
> This should address both of the issues you brought up, Will.

Looking really good. The one problem area I see is this:

+        if e.type in settings.ignoredEventsList:
+            return
         # We ignore defunct objects.
         #
         if e.type.startswith("object:state-changed:defunct"):

Around this part of the code, there are several blocks that toss out events:

    if e.type.startswith("object:state-changed:defunct"):
        ...
    if e.type.startswith("object:property-change:accessible-parent"):
        ...
    if e.type.startswith("object:children-changed:remove") \
       and (e.source != self.registry.getDesktop(0)):

The general pattern is that we do this if we want to toss an event:

    if settings.debugEventQueue:
        self._enqueueEventCount -= 1
    return

So, unless you want to try to glom all the blocks into one just to be perty, the above change probably could just be:

+        if e.type in settings.ignoredEventsList:
+            if settings.debugEventQueue:
+                self._enqueueEventCount -= 1
+            return

In looking at the way all events are registered (via pyatspi.EVENT_TREE), I'm still curious if we can just prevent some events from being listened for at all in settings.listenAllEvents=True mode if they exist in settings.ignoredEventsList. It *seems* like traversing pyatspi.EVENT_TREE.values() might help us here. The only risk is if pyatspi.EVENT_TREE is incomplete. If that's the case, there might be the potential for us not to get events that we expect to get.

In any case, things look fine as they are and we can get fancier if the performance ends up sucking rocks if listenAllEvents=True.
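The tossing pattern Will describes can be sketched as follows (a simplified, illustrative model of the event queue; the settings names match the patch, but the classes do not correspond to actual Orca code):

```python
# Minimal model of the enqueue path: the debug counter is incremented on
# entry, so every early return that tosses an event must decrement it again.
class Settings:
    ignoredEventsList = ["object:state-changed:stale"]
    debugEventQueue = True

settings = Settings()

class EventManager:
    def __init__(self):
        self._enqueueEventCount = 0
        self.queue = []

    def enqueue(self, event_type):
        if settings.debugEventQueue:
            self._enqueueEventCount += 1
        if event_type in settings.ignoredEventsList:
            # Keep the debug counter balanced when tossing an event,
            # matching the pattern used by the other early returns.
            if settings.debugEventQueue:
                self._enqueueEventCount -= 1
            return
        self.queue.append(event_type)
```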
(In reply to comment #13)
> It *seems* like traversing pyatspi.EVENT_TREE.values() might help us here.
> The only risk is if pyatspi.EVENT_TREE is incomplete. If that's the case,
> there might be the potential for us not to get events that we expect to get.

Exactly. The main problem, in my view, is that we could, say, have a blacklist with "object:bounds-changed" in it. With pyatspi.EVENT_TREE we could register all the leaf event types except for "object:bounds-changed". But what if we have "object:state-changed:stale" in our blacklist? The subtypes of object:state-changed are not in EVENT_TREE, so there is no way of registering other "object:state-changed" events without also registering "object:state-changed:stale".
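Eitan's point can be illustrated with a toy tree (the real pyatspi.EVENT_TREE maps parent event types to lists of child types; this miniature version is made up for the example):

```python
# Toy event tree: "object:state-changed" has one child listed, but its other
# subtypes (e.g. "object:state-changed:stale") are not in the tree at all.
EVENT_TREE = {
    "object": ["object:state-changed", "object:bounds-changed"],
    "object:state-changed": ["object:state-changed:focused"],
}

def leaf_events(tree):
    children = set()
    for kids in tree.values():
        children.update(kids)
    # A leaf is any child type that is not itself a parent in the tree.
    return sorted(c for c in children if c not in tree)

def events_to_register(tree, blacklist):
    return [e for e in leaf_events(tree) if e not in blacklist]

# Blacklisting a leaf like "object:bounds-changed" works fine...
print(events_to_register(EVENT_TREE, {"object:bounds-changed"}))
# ...but "object:state-changed:stale" never appears as a leaf, so it cannot
# be excluded this way -- exactly the limitation described above.
```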
Created attachment 99921 [details] [review] Limit listeners This adds the stuff Will wanted above (the debug things).
Created attachment 99922 [details] [review] Limit listeners Got rid of conflict in gedit.py.
(In reply to comment #16)
> Created an attachment (id=99922) [edit]
> Limit listeners
>
> Got rid of conflict in gedit.py.

Looks good to me. Mike - please test. Eitan, please also run the regression tests and look for unexpected failures. If this is a good fix that boosts performance, we should try to get it in for 2.21.3. Thanks!
Created attachment 100201 [details] [review] Limit listeners

This fixes expanded/collapsed and checked/unchecked events.
After a few hours of testing, this patch seems to be working pretty well. I think for non-Mozilla applications we are now getting near a pre-pyatspi state. One thing to perhaps try is the following:

1. Open up OpenOffice Writer so that you have a blank document showing.
2. Press Ctrl+O for the file open dialog.

This action still seems to me to be just a bit slower. I'll leave my name on this bug and keep testing.
Thanks, Mike, for pointing out that hotspot. The Open File dialog is the same dialog used in gedit, and it seems to be slow there too; could you confirm that? From my experience, most OpenOffice dialogs besides the Open File dialog are pretty quick, for example the spellcheck and printing dialogs. Is the sluggishness you are experiencing only with the Open File dialog, or with all OpenOffice dialogs?
It probably just seems a bit slower because of the overhead of office. It is probably a good idea to check this patch in so the community can test it as well.
Eitan, if you're confident with your testing of this fix, and it pylints OK (try running ./run_pylint.sh from the top level directory -- it will automatically pylint files you've modified/added -- everything should be a 10.00 before you check the files in), I say check it in.
Depending on the number of files in your home directory, the Open dialog will be slower. I have 16 files, and it takes 1.5 seconds for the Open dialog to open. Even pre-pyatspi this dialog felt heavy with Orca, but it was a couple of tenths of a second faster. On my computer I get about 200 events when this dialog appears, as opposed to about 20 when a Print dialog appears. Will, I'll submit this patch now (after pylint).
Committed. The pylint was not perfect; it seems that default.py and OpenOffice.py have issues not related to this patch.
> The pylint was not perfect, it seems that default.py and OpenOffice.py have
> issues not related to this patch.

Very strange. They are 10.00 for me.

richb@richb-laptop:~/gnome/orca/trunk$ ./run_pylint.sh src/orca/default.py
Thank you for your attention to quality
Checking default.py, sending output to default.pylint
Your code has been rated at 10.00
richb@richb-laptop:~/gnome/orca/trunk$ ./run_pylint.sh src/orca/scripts/StarOffice.py
Thank you for your attention to quality
Checking scripts/StarOffice.py, sending output to StarOffice.pylint
Your code has been rated at 10.00
richb@richb-laptop:~/gnome/orca/trunk$
(In reply to comment #25)
> > The pylint was not perfect, it seems that default.py and OpenOffice.py have
> > issues not related to this patch.
>
> Very strange. They are 10.00 for me.

They are good for me, too. Eitan, what are the issues?

PS - I'm psyched to get performance back to the pre-pyatspi days! Yeah! From here on, Mike is going to help us find specific slow areas that we can concentrate on.

For one more performance opportunity before closing out this bug, we might consider changing the debug support to not make AT-SPI calls if we're guaranteed that there will be no related debug output for a given debug level. That should hopefully give us a nice performance boost.
> For one more performance opportunity before closing out this bug, we might
> consider changing the debug support to not make AT-SPI calls if we're
> guaranteed that there will be no related debug output for a given debug
> level. That should hopefully give us a nice performance boost.

Turns out we are already doing this. :-) I'm going to leave this bug open as a means to quickly find all the performance problems, but keep it assigned to orca-maint.
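The pattern being referred to, skipping expensive AT-SPI calls when the output would be discarded anyway, can be sketched like this (names are illustrative, loosely modeled on orca.debug; this is not the actual Orca code):

```python
# Debug levels: a message is shown only when its level meets the threshold.
LEVEL_OFF = 10000
LEVEL_FINE = 600
debugLevel = LEVEL_OFF  # current threshold: debug output disabled

calls_made = []

def expensive_accessible_details(obj):
    # Stands in for AT-SPI round-trips (name, role, state queries).
    calls_made.append(obj)
    return "details of %s" % obj

def println(level, get_message):
    # Accept a callable so the (expensive) message is only built
    # when it will actually be printed.
    if level >= debugLevel:
        print(get_message())

println(LEVEL_FINE, lambda: expensive_accessible_details("button"))
# With debugLevel at LEVEL_OFF, no AT-SPI call is made at all.
```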
Out of curiosity, I ran some tests to compare where we are at the bleeding-edge moment with where we were before the line-related enhancements in Gecko.py. With no further ado:

Revision 3387 (pre-Gecko performance)
vs
Current Orca plus latest patch to Bug 506360 (rev. 4)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Reading the Wiki by Line
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4166839 function calls in 201.149 CPU seconds
 958015 function calls in 139.656 CPU seconds

-----------------------------------------------------------------
getLineContentsAtOffset

ncalls      tottime  percall  cumtime  percall
372         0.956    0.003    50.363   0.135
148         0.007    0.000    4.011    0.027
-----------------------------------------------------------------
goPreviousLine

ncalls      tottime  percall  cumtime  percall
66          0.014    0.000    43.315   0.656
72          0.005    0.000    13.542   0.188
-----------------------------------------------------------------
goNextLine

ncalls      tottime  percall  cumtime  percall
86          0.014    0.000    39.971   0.465
72          0.005    0.000    12.087   0.168
-----------------------------------------------------------------
findPreviousCaretInOrder

ncalls      tottime  percall  cumtime  percall
6469/5001   0.314    0.000    12.537   0.003
588/164     0.174    0.000    3.849    0.023
-----------------------------------------------------------------
findNextCaretInOrder

ncalls      tottime  percall  cumtime  percall
8688/26781  1.128    0.000    37.050   0.001
362/146     0.116    0.000    2.582    0.018
-----------------------------------------------------------------
getExtents

ncalls      tottime  percall  cumtime  percall
32972       11.495   0.000    17.228   0.001
1090        1.214    0.001    2.363    0.002
-----------------------------------------------------------------
updateBraille

ncalls      tottime  percall  cumtime  percall
154         0.111    0.001    22.425   0.146
148         0.162    0.001    2.600    0.018

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2. Tabbing through the Bugzilla Advanced Search Form
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

3675201 function calls in 193.921 CPU seconds
 548400 function calls in 96.273 CPU seconds

-----------------------------------------------------------------
onFocus

ncalls      tottime  percall  cumtime  percall
115         0.006    0.000    75.807   0.659
115         0.006    0.000    10.105   0.088
-----------------------------------------------------------------
locusofFocusChanged

ncalls      tottime  percall  cumtime  percall
58          0.005    0.000    75.714   1.305
59          0.009    0.000    9.397    0.159
-----------------------------------------------------------------
guessTheLabel

ncalls      tottime  percall  cumtime  percall
35          0.007    0.000    25.423   0.726
34          0.006    0.000    2.538    0.075

(In each pair of rows, the first row is revision 3387 and the second is current Orca with the patch.)
Woo woo! You rock! Thanks Joanie!

(In reply to comment #28)
> Out of curiosity, I ran some tests to compare where we are at the
> bleeding-edge moment with where we were before the line-related enhancements
> in Gecko.py.
> [profiling results snipped]
Lovely!

This reminds me that there is a further pyatspi change to make that would speed up event handling by utilizing borrowed references. I'll try to get it in for the next release too.
(In reply to comment #30)
> This reminds me that there is a further pyatspi change to make that would
> speed up event handling by utilizing borrowed references. I'll try to get it
> in for the next release too.

Just a nudge to Eitan to remind him to look into this. :-)
(In reply to comment #31)
> Just a nudge to Eitan to remind him to look into this. :-)

Thanks for that, I needed it. Anyway, a patch for pyatspi is in bug 502924. No fancy pyorbit versions are needed for this one. I can't see a difference for small Orca stuff, but it might make a difference in more intensive use cases.
Created attachment 137225 [details] [review] Patch to cache displayedText and displayedLabel computation

This patch takes advantage of the script.generatorCache dictionary that is cleared before we process each new AT-SPI event. The idea is that several code paths may ask for the same computation while processing a single event. For example, speech and braille may want the same value. With the generatorCache, we can cache values in script.generatorCache and re-use them.
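A minimal sketch of the memoization this patch describes (generatorCache is the real dictionary name from the patch; this Script class and getDisplayedText stand-in are simplified for illustration):

```python
# Per-event memoization: generatorCache is cleared before each AT-SPI event,
# so speech and braille share one computation while the event is processed.
class Script:
    def __init__(self):
        self.generatorCache = {}

computations = []

def getDisplayedText(script, obj):
    cache = script.generatorCache.setdefault("displayedText", {})
    if obj not in cache:
        computations.append(obj)  # stands in for the expensive AT-SPI work
        cache[obj] = "text of %s" % obj
    return cache[obj]

script = Script()
getDisplayedText(script, "label1")  # computed
getDisplayedText(script, "label1")  # served from the cache
script.generatorCache.clear()       # done before processing the next event
getDisplayedText(script, "label1")  # recomputed for the new event
```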
Committed the last patch to master and the 570658 branch. The patch tests well. Pylint is crashing on my machine again, though, so I didn't pylint. :-(

BTW, this cuts the number of times the getDisplayedText and getDisplayedLabel logic is performed by more than half; getDisplayedText is used in other areas of the code to perform other logic. Hopefully, this will result in a perceivable difference for users.
Major performance improvements have been made in Orca, the accessibility libraries, and toolkits implementing accessibility support. Work will continue, of course. But I no longer think we need a Performance "metabug".