GNOME Bugzilla – Bug 506360
find{Next,Previous}Line() should be more efficient
Last modified: 2008-07-22 19:33:10 UTC
Currently Gecko.py's findNextLine() and findPreviousLine() locate the next/previous line by moving character-by-character and checking extents. We can make this more efficient by only moving character-by-character when we run out of text in the current object.
Created attachment 101832 [details] [review] first pass This will undoubtedly need tweaking and correcting, but it *seems* to work. (I know, I know, famous last words). It's not a dramatic improvement, but it shaves some time off: ncalls tottime percall cumtime percall 67 1.239 0.018 13.738 0.205 findNextLine 67 0.024 0.000 3.704 0.055 altFindNextLine 61 1.177 0.019 15.048 0.247 findPreviousLine 68 0.075 0.001 6.858 0.101 altFindPreviousLine I'm afraid I made a separate setting for this is as well so that I could isolate it from the other performance enhancements. The setting for this one is useNewLineNav. Mike, dare I say it? Please test. <smile>
Should this patch be impacting the double reading of lines in any way? On the folowing site: http://www.techbargains.com/news_displayItem.cfm/107959?rss=1 if you arrow beyond the company coupon or expiration line and then arrow back up you will still hear that line twice. This patch does seem to improve performance a bit more.
Confirmed. I'm going to take a new stab at the alternative findPreviousLine(). But first let's get the form sluggishness bug stuff worked out and checked in. I made some changes in the patch to that bug that may make our life easier in this bug. I hope....
Created attachment 101932 [details] [review] revision 2 This seems to address the issue Mike pointed out and still works elsewhere. :-) Has been pylinted and manually tested somewhat, but needs additional testing, regression testing, etc. Mike please give this one a spin. Thanks! By the way, it feels faster when tabbing past combo boxes, but I'll run some more performance tests to find out for sure.
With up arrowing, I'm still seeing a reasonable improvement, especially in terms of reduced calls to findPreviousCaretInOrder(): ncalls tottime percall cumtime percall 67 0.006 0.000 22.884 0.342 goPreviousLine (old) 72 0.005 0.000 13.769 0.191 goPreviousLine (new) 5152/4625 0.877 0.000 10.459 0.002 findPreviousCaret..(old) 283/72 0.069 0.000 1.601 0.022 findPreviousCaret..(new) Sadly, the tabbing test proved things were only slightly improved. :-( ncalls tottime percall cumtime percall 57 0.005 0.000 23.279 0.408 locusOfFocusChanged (old) 58 0.005 0.000 15.801 0.272 locusOfFocusChanged (new) 112 0.005 0.000 24.567 0.219 onFocus (old) 115 0.005 0.000 16.968 0.148 onFocus (new) 34 0.010 0.000 4.690 0.138 guessTheLabel (old) 35 0.011 0.000 4.099 0.117 guessTheLabel (new) I'll see if I can make the above methods more performant. But I'm starting to run out of clever ideas, so other suggestions for improving performance are encouraged. :-) Thanks! Also, I'm curious about some of the pyatspi calls. While the percall times are quite good, we call some of these things a lot.... Can we avoid that? For instance: ncalls tottime percall cumtime percall 16296/16292 14.185 0.001 15.371 0.001 _inner 51587 4.507 0.000 9.270 0.000 _getAndCache
One small problem that I think might be related to this patch. 1. Open a bugzilla message in thunderbird. 2. arrow down to the bottom of the message. You should notice that you can only arrow up to the line containing the dashes which is the second line from the bottom of the message.
Created attachment 102565 [details] [review] Revision 3 You would not believe how many "special" or "different" cases there are when it comes to getting line contents and deciding whether or not something that's on this line spatially is on this line in terms of the content (for example two sections side-by-side placed via CSS). Well, I guess Will would since he went through this the first go around -- the rest of y'all have no idea. ;-) ;-) ;-) Anyhoo... This is pylinted and functionally tested quite a bit. The regression tests are a bit harder to use because this patch attempts to make some corrections to other related issues we currently have. But I'll work on that. Mike please start testing. Thanks! Performance numbers to follow.
The numbers aren't too bad, really. Finding the line is a little over twice as fast. And we spend far less time finding carets.... ncalls tottime percall cumtime percall 72 1.345 0.019 14.895 0.207 findNextLine OLD 72 0.006 0.000 6.092 0.085 findNextLine NEW 66 1.355 0.021 15.946 0.242 findPreviousLine OLD 72 0.008 0.000 7.602 0.106 findPreviousLine NEW 5072/4578 0.879 0.000 10.253 0.002 findPreviousCaret.. OLD 588/164 0.170 0.000 3.897 0.024 findPreviousCaret.. NEW 4907/4555 0.863 0.000 9.986 0.002 findNextCaret.. OLD 362/146 0.112 0.000 2.562 0.018 findNextCaret.. NEW I suspect I can get a bit more performance beyond the above by using stored lines with label guess. I'll look at doing that next.
Created attachment 102572 [details] [review] revision 4: Store the lines for label guess We tend to get the current line quite a bit. As part of my work for bug 500016, I cut down on that somewhat. This cuts down on it a bit further. A couple of sample tests: OLD == pre patches for this bug, but with the work for 500016 included NEW == the patch before this one :-) CACHE == this patch 1. Navigating the main page of our Wiki with Up/Down Arrow: ncalls tottime percall cumtime percall 72 1.345 0.019 14.895 0.207 findNextLine OLD 72 0.006 0.000 6.092 0.085 findNextLine NEW 72 0.001 0.000 4.489 0.062 findNextLine CACHE 66 1.355 0.021 15.946 0.242 findPreviousLine OLD 72 0.008 0.000 7.602 0.106 findPreviousLine NEW 72 0.004 0.000 6.094 0.085 findPreviousLine CACHE So we're shaving a bit more time off, with the line routines being 3 times faster now. 2. Tabbing through Bugzilla's Advanced Search page: ncalls tottime percall cumtime percall 34 0.009 0.000 4.104 0.121 guessTheLabel OLD 34 0.008 0.000 4.087 0.120 guessTheLabel NEW 34 0.006 0.000 2.538 0.075 guessTheLabel CACHE We weren't seeing any improvement in guessTheLabel with the new line routines pre caching. Storing the line contents makes guessTheLabel a bit faster (not quite 2 times faster, but it's still an improvement). Please test. I'll work on new regression tests tomorrow or Saturday.
Mike I had mentioned that I thought this patch might become "untrunk compatible" as a result of my patch to bug 508784. I was wrong. Seems to keep on working. <smile> Please test. Thanks.
I think this looks good and I say commit after Mike has verified it works well. Thanks Joanie!
One problem I'm finding with the latest patch that didn't exist with out it is the folowing: 1. go to www.woot.com 2. Press h to get to the heading for todays product. 3. Down arrow to the condition line. At this point you will hear "condition text" and will not be able to down arrow further.
We get stuck in many places down arrowing through this site. http://www.sirius.com/servlet/ContentServer?league=NFL&pagename=Sirius%252FPage&c=FlexContent&cid=1065475754246&team=all
I can confirm the woot issue. The above link doesn't give me any content. I'll work on some general defensive don't get stuck code which will hopefully solve both. Stay tuned.
Created attachment 102691 [details] [review] defensive measures added Mike, this does a defensive check so we don't get suck on woot now. Please test it on the other site where we were getting stuck as I cannot access it at all. Note that for some reason, we're not picking up the text "new" that we were getting stuck on. I'll look at that now as that seems to be a special case. Thanks!
Created attachment 102693 [details] [review] also solve the woot text issue This causes us to also pick up the "new" in condition. Please test.
Created attachment 102694 [details] [review] slight change to the defensive measure Okay, I really think this is right now. <grin> Please test.
The latest patch nicely fixes the problems in my previous comment. thanks much
Yea! Thanks Mike. Patch committed. Moving to pending.
Created attachment 102704 [details] [review] more defensive work Mike found another case on overstock.com where some stickage occurs. This patch should handle it. Please test.
This patch looks good to me. thanks much
Thanks Mike. Patch committed. Moving to pending (again). :-)
Here is a poorly behaved site that I think might be directly related to this bug/patch. https://buy.garmin.com/shop/shop.do?cID=134 Two different things happen here. 1. If I start arrowing down from the top of the page I can never arrow to the various checkboxes for the Nuvis. 2. If I tab down to the Nuvi checkboxes I can not arrow down but I can arrow up. I'll put Joanie's name back on this bug.
Confirmed. By the way, this enhancement comes with its own little setting: import orca.Gecko orca.Gecko.useNewLineNav = True Set it to False and you can access the stuff in question (which confirms that it's indeed this bug/patch). :-( I'll take a quick look to see if I can work it out now. Otherwise, I'll look at it this evening. Thanks!!
another test case for stuckage: www.bbc.co.uk/radio4/ when arrowing down, cant get beyond: "bbc radio4 - 92 to 94 FM and 198 long wave" ... although tabbing gets past it.
Thanks Jon! I need all of the examples I can get. I am going to take some time to update our existing regression tests and write new ones to try to minimize the likelihood of breaking things with future changes. <smile> But then I will look at your example.
Created attachment 103291 [details] [review] partial fix for the Garmin issue One of the problems on the Garmin site is that every time we get to "Automotive", we conclude that the start of this line was the top of the navigation links in the section on the left. As a result, we looped back. This patch fixes that one particular issue. It does *not* fix the issue with the checkboxes, nor does it fix the issue on the BBC Radio 4 site. Sorry! One bug at a time. <smile> Those issues are next on my list. Note: This patch has already been committed (tested, regression tested, and pylinted).
Created attachment 103299 [details] [review] another "tweak" Another issue on Garmin -- as well as one issue on the BBC Radio 4 site -- is addressed by this patch. (Pylinted, regression tested, and committed). There is still one other issue I see on the Garmin site, namely while we can arrow through the content without getting stuck, we're never moving by line to the checkboxes (you have to Tab to them). There are still two other issues on the BBC site: Another case where we just can't seem to arrow past some objects (the A-Z links) and one in which we wrap back to the beginning of a section (around the Message Boards/Have Your Say). At least I know what I'll be working on tomorrow evening. <grin> Stay tuned!
Created attachment 103362 [details] [review] get past the a-z links on the radio4 site The reason we can't get past the a-z links on the radio4 site is because it's an imagemap: An image whose children are the links we care about. Unfortunately, the image doesn't implement the text interface so we aren't seeing the embedded object characters when we try to getObjectsFromEOCs(). I IM'ed Aaron to see if this was a Firefox bug. He said it was an odd case, but not a bug because the image doesn't have text. Thus it's a special case, handled by this patch. This patch has been pylinted and regression tested and only makes a change to our new performant navigation code so I went ahead and checked it in. That's still not all of the radio4 issues, but I'm getting there.... <smile>
w.r.t. the radio4 site, sometimes we get hung up on images that are also links. This turns out to be an issue that existed prior to all the performance-related changes -- and something I've seen on occasion but couldn't always reproduce. Thanks to the BBC example, I was able to generate a test case which seems to be reliably reproducible. Thanks again Jon! I've opened bug #511118 on that issue as it's separate from this one.
I'm glad it was such a useful example :) Let me know when you wish something tested. Also for ease of verification, would it be any trouble to provide the patchId into svn commit message? That way one could just grep the log to see if that patch was committed. (although you are very good by stating that you have committed it after each case). Thanks for the excellent work!
sorry, i dont seem to have permissions to state that bug 509588 is a child of firefox metabug. in case people were looking on the metabug page.
more stuckage: reason: empty links <a href... ></a> as can be seen on: www.schwabb.com we cant seem to go beyond the find button. issue reprodusable using a basic html page, containing text and an empty link. Also note that on that particular case, we dont seem to see the text saying "Enter zip code" which is the pre-existing value of that textbox.
Since this bug is about performance/efficiency, and I'm pretty sure I've done what I can on that front for find{Next,Previous}Line(). I'm closing this bug as FIXED. Related (or not) issues with navigation should be filed as new bugs. I just filed bug 513321 for the schwabb.com issue. The one remaining issue on garmin.com (i.e. cannot arrow to the checkboxes) is bug 513323.