Bug 519271 – Regression tests need work


Summary:	Regression tests need work


Status:	RESOLVED FIXED

Product:	orca
Classification:	Applications
Component:	general
Version:	unspecified
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	2.24.0
Assigned To:	Orca Maintainers
QA Contact:	Orca Maintainers

URL:
Whiteboard:

Depends on:	519539 519541 519542 519543 519545 519546 519547 519549 519550 519553 519555 519556 519557 519558 519559 519560 519561 519562 519563 519564 519567 519568 519841 519849 520656 521651 523234 523235 523236 523237 523238 523438
Blocks:

Reported:	2008-02-28 13:17 UTC by Willie Walker
Modified:	2008-06-17 19:50 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
alternative mini multi-line test (3.89 KB, text/x-python) 2008-02-28 19:54 UTC, Joanmarie Diggs (IRC: joanie)	Details
foo.debug file generated by runone.sh (281.98 KB, text/plain) 2008-02-28 22:35 UTC, Rich Burridge	Details
jd's debug.out from the multiline navigation test (37.34 KB, application/x-gzip) 2008-02-29 00:34 UTC, Joanmarie Diggs (IRC: joanie)	Details
Results of running the role_text_multiline_navigation.py test again. (14.16 KB, text/plain) 2008-02-29 01:37 UTC, Rich Burridge	Details
gzipped version of role_text_multiline_navigation.debug (37.31 KB, application/gzip) 2008-02-29 01:39 UTC, Rich Burridge	Details
Latest results from running the gtk-demo regression tests. (20.31 KB, text/plain) 2008-03-18 01:28 UTC, Rich Burridge	Details
runall.sh results (63.73 KB, text/plain) 2008-03-19 16:27 UTC, Willie Walker	Details

Description Willie Walker 2008-02-28 13:17:23 UTC

See comment #19 of bug 517505: http://bugzilla.gnome.org/show_bug.cgi?id=517505#c19.

We must work to make sure our existing regression tests:

1) Can run across multiple platforms (at least Ubuntu and Solaris)
2) Do not produce unexpected failures
3) Provide us with needed regression testing
4) Provide good coverage of the code

Starting with the unexpected failures in bug 517505 would be a good thing.  We should not be at all forgiving of unexpected failures, so I'm considering changing the word "UNEXPECTED" to something like "YOU CANNOT COMMIT UNTIL THIS IS RESOLVED".

Comment 1 Joanmarie Diggs (IRC: joanie) 2008-02-28 19:54:20 UTC

Created attachment 106189 [details]
alternative mini multi-line test

To try to keep all of the "unexpected failures" conversation in one place, the executive summary from bug 517505 is this:

* There were 8 tests with unexpected failures

* For 5 of the 8 tests, Rich's output and mine matched exactly

* For 1 of the 8 tests, the difference was the number of page tabs
  (test expected 2, Rich had 4, I had 5).  I suspect that this is
  the result of Will having a different dialog, and Rich and I having
  a different printer setup.

* For 1 of the 8 tests, role_alert.py, I did not have failures. Rich
  has since identified the source of the difference:

  > The fix? Running Preferences->Windows and making sure that the
  > "Select windows when the mouse moves over them" checkbox wasn't
  > checked.

* For the final test, Rich's output and mine were identical for 4 of
  the 5 unexpected failures.  Rich had one additional failure that I
  did not:

  -------------------------
  EXPECTED:
       "BRAILLE LINE:  'This is only  $l'",
       "     VISIBLE:  'This is only  $l', cursor=11",
       "BRAILLE LINE:  'This is only  $l'",
       "     VISIBLE:  'This is only  $l', cursor=11",
  ACTUAL:
       "BRAILLE LINE:  'This is only  $l'",
       "     VISIBLE:  'This is only  $l', cursor=11",
       "BRAILLE LINE:  'This is only  $l'",
       "     VISIBLE:  'This is only  $l', cursor=11",
       "BRAILLE LINE:  'This is only  $l'",
       "     VISIBLE:  'This is only  $l', cursor=11",
       "SPEECH OUTPUT: 'This'",
       "SPEECH OUTPUT: 'selected'",
  -------------------------

  I agree with Rich:

  > I believe we are now at just the differences between Will's
  > Solaris GNOME 2.20 + latest AT system vs Rich/Joanie's almost
  > GNOME 2.22 Ubuntu Hardy systems (with printers).

  Attached is a mini test which might be helpful in identifying
  the remaining difference between Rich's results and mine without
  having to run a 99-step test repeatedly.

Comment 2 Rich Burridge 2008-02-28 21:03:45 UTC

Cor lumme! (English for Yikes!). If I turn off mouse-follows-pointer
and make sure pidgin, firefox and thunderbird are running, then I get
the following results:

$ ./runone.sh /home/richb/Desktop/foo.py gtk-demo 0
starting test application gtk-demo ...
Test 1 of 5 SUCCEEDED: /home/richb/Desktop/foo.py:KP_5 to flat review 'is'
Test 2 of 5 SUCCEEDED: /home/richb/Desktop/foo.py:KP_6 to flat review 'only'
Test 3 of 5 SUCCEEDED: /home/richb/Desktop/foo.py:KP_3 to flat review 'n'
Test 4 of 5 SUCCEEDED: /home/richb/Desktop/foo.py:KP_3 to flat review 'l'
Test 5 of 5 SUCCEEDED: /home/richb/Desktop/foo.py:KP_Divide to left click on 'l'
SUMMARY: 5 SUCCEEDED and 0 FAILED (0 UNEXPECTED) of 5 for /home/richb/Desktop/foo.py
/usr/bin/orca: line 92: 20984 Killed                  /usr/bin/python -c "import orca.orca; orca.orca.main()" "$ARGS"
./runone.sh: line 187: 20991 Killed                  $APP_NAME $ARGS $PARAMS
$

Joanie, is that what you were expecting (hoping for)?

Comment 3 Joanmarie Diggs (IRC: joanie) 2008-02-28 21:18:52 UTC

(In reply to comment #2)
> Cor lumme! (English for Yikes!).

Dang! (Texan for Cor lumme!) ;-)

> Joanie, is that what you were expecting (hoping for)?

Heh.  To be honest, no.  I was hoping that the last test would fail (because that's our last outstanding difference).  Then we could compare the resulting debug.outs and hopefully be difference-free.

But you said something interesting:

> If I turn off mouse-follows-pointer

Any chance you had magnification on when you ran the tests for bug 517505?  And/or that you were using your preferences as opposed to the harness'? (i.e. why did your test run manage to click on the word 'This'?)

Comment 4 Rich Burridge 2008-02-28 22:35:51 UTC

Created attachment 106198 [details]
foo.debug file generated by runone.sh

> Any chance you had magnification on when you ran the tests for bug 517505? 

No.

> And/or that you were using your preferences as opposed to the harness'? (i.e.
> why did your test run manage to click on the word 'This'?)

I thought runone.sh used its own preferences? I literally do nothing
whilst these tests are running.

Do you mean the "This" speech output at line 5566?

Comment 5 Joanmarie Diggs (IRC: joanie) 2008-02-28 23:49:20 UTC

> I thought runone.sh used its own preferences?

Oh yeah, sorry.  My brain's wires got crossed.  (I've launched Orca from within that directory and it used the test preferences rather than mine)  Never mind.

> Do you mean the "This" speech output at line 5566?

No.  Orca says "This" at line 5566 because of using Control+Right to move past the word "this."

In the original test (role_text_multiline_navigation.py), a bunch of stuff happens.  Then we flat review text:

* is
* only
* n
* l

Then we do a left click on the l:

-------------------
sequence.append(utils.StartRecordingAction())
sequence.append(KeyComboAction("KP_Divide"))
sequence.append(WaitAction("object:text-caret-moved",
                           None,
                           None,
                           pyatspi.ROLE_TEXT,
                           5000))
sequence.append(utils.AssertPresentationAction(
    "KP_Divide to left click on 'l'",
    ["BRAILLE LINE:  'This is only  $l'",
     "     VISIBLE:  'This is only  $l', cursor=11",
     "BRAILLE LINE:  'This is only  $l'",
     "     VISIBLE:  'This is only  $l', cursor=11"]))

-------------------

When I do this Orca clicks just to the left of the flat review rectangle and the cursor winds up being just to the left of the "l" that was under review. Maybe we should say something when this occurs; maybe we shouldn't.  But for the purposes of regression testing and this bug, the expected behavior is that we don't say anything; we just indicate our position in braille.  

When you do this test on your machine you get:

       "BRAILLE LINE:  'This is only  $l'",
       "     VISIBLE:  'This is only  $l', cursor=11",
       "BRAILLE LINE:  'This is only  $l'",
       "     VISIBLE:  'This is only  $l', cursor=11",
       "BRAILLE LINE:  'This is only  $l'",
       "     VISIBLE:  'This is only  $l', cursor=11",
       "SPEECH OUTPUT: 'This'",
       "SPEECH OUTPUT: 'selected'",

So the cursor is in the right/expected spot (11).  But Orca tacks on a "This selected" for good measure.  Question is, why?  My theory (stab in the dark really) was that perhaps flat review had somehow clicked (double-clicked) on this and caused the word to become selected.
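
For anyone comparing runs by hand, the extra lines can be picked out mechanically. This is only a minimal sketch and not part of the Orca test harness: the helper name unexpected_lines is hypothetical, and it relies solely on difflib from the Python standard library. It lists the lines that show up in ACTUAL with no counterpart in EXPECTED, using the two listings above as input:

```python
import difflib


def unexpected_lines(expected, actual):
    """Return the lines of `actual` that have no match in `expected`."""
    matcher = difflib.SequenceMatcher(a=expected, b=actual, autojunk=False)
    extra = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both contribute lines found only in actual.
        if tag in ("insert", "replace"):
            extra.extend(actual[j1:j2])
    return extra


# Presentation lines copied from the EXPECTED/ACTUAL listings in this bug.
EXPECTED = [
    "BRAILLE LINE:  'This is only  $l'",
    "     VISIBLE:  'This is only  $l', cursor=11",
    "BRAILLE LINE:  'This is only  $l'",
    "     VISIBLE:  'This is only  $l', cursor=11",
]
ACTUAL = EXPECTED + [
    "BRAILLE LINE:  'This is only  $l'",
    "     VISIBLE:  'This is only  $l', cursor=11",
    "SPEECH OUTPUT: 'This'",
    "SPEECH OUTPUT: 'selected'",
]

for line in unexpected_lines(EXPECTED, ACTUAL):
    print(line)
```

For these inputs the sketch reports one extra braille/visible pair followed by the two SPEECH OUTPUT lines, which is the "This selected" difference under discussion.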

<shrugs>

Tell ya what, I'll re-run just the role_text_multiline_navigation.py.  If you could do the same, we can then compare what is taking place around the left click.  Make sense?

Comment 6 Willie Walker 2008-02-29 00:09:23 UTC

I've run the tests several times using Orca from trunk (svnversion 3652) on my Solaris Expression Community Edition build 79b machine with atk/gail/atspi also from trunk.  I'm seeing these unexpected failures.  I haven't dug into them, but I wanted to make sure I recorded them here in the event the e-mail storm of the century that's been hitting my inbox doesn't subside:

Test 95 of 99 FAILED: /export/home/wwalker/work/orca/trunk/test/keystrokes/gtk-demo/role_text_multiline_navigation.py:Page down
EXPECTED:
     "BRAILLE LINE:  'I'm just going to keep on typing. $l'",
     "     VISIBLE:  'I'm just going to keep on typing', cursor=1",
     "SPEECH OUTPUT: 'I'm just going to keep on typing.'",
ACTUAL:
     "BRAILLE LINE:  'Then, I'm going to type some $l'",
     "     VISIBLE:  'Then, I'm going to type some $l', cursor=1",
     "SPEECH OUTPUT: 'Then, I'm going to type some'",
[FAILURE WAS UNEXPECTED]
Test 97 of 99 FAILED: /export/home/wwalker/work/orca/trunk/test/keystrokes/gtk-demo/role_text_multiline_navigation.py:Shift+Page_Down to deselect text
EXPECTED:
     "BRAILLE LINE:  'I'm just going to keep on typing. $l'",
     "     VISIBLE:  'I'm just going to keep on typing', cursor=1",
     "SPEECH OUTPUT: 'I'm just going to keep on typing.'",
     "SPEECH OUTPUT: 'page selected from cursor position'",
ACTUAL:
     "BRAILLE LINE:  'Then, I'm going to type some $l'",
     "     VISIBLE:  'Then, I'm going to type some $l', cursor=1",
     "SPEECH OUTPUT: 'Then, I'm going to type some'",
     "SPEECH OUTPUT: 'page selected from cursor position'",
[FAILURE WAS UNEXPECTED]

Comment 7 Joanmarie Diggs (IRC: joanie) 2008-02-29 00:25:06 UTC

Ironically, Rich and I don't have failures there. 

Test 95 of 99 SUCCEEDED: /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_text_multiline_navigation.py:Page down
Test 97 of 99 SUCCEEDED: /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_text_multiline_navigation.py:Shift+Page_Down to deselect text

I'm in the process of setting up my secondary laptop with SNV_82 so I can use it to run tests without tying up a needed box. I'll see what I can repro there.

Comment 8 Joanmarie Diggs (IRC: joanie) 2008-02-29 00:34:01 UTC

Created attachment 106207 [details]
jd's debug.out from the multiline navigation test

Here's mine.  The keypress that seems to be causing the problem on Rich's box is the '/' at line 27239 in my output.  And, for whatever it's worth, this run matched all of my previous runs.

Comment 9 Rich Burridge 2008-02-29 01:37:44 UTC

Created attachment 106211 [details]
Results of running the role_text_multiline_navigation.py test again.

This time with "mouse follows pointer" turned off.

Just tests 89, 90, 91 and 92 failed.

I'll attach the .debug in a moment.

Comment 10 Rich Burridge 2008-02-29 01:39:51 UTC

Created attachment 106212 [details]
gzipped version of role_text_multiline_navigation.debug

Comment 11 Joanmarie Diggs (IRC: joanie) 2008-02-29 01:58:05 UTC

Rich, that means we match! No more diffs.  Yea!  And thanks.

Where/how did you turn off that setting? (i.e. if we're using the testing profile rather than our own settings, we shouldn't have differences like that right?)

Now to figure out why Will's results don't match ours....

Comment 12 Rich Burridge 2008-02-29 02:16:14 UTC

> Where/how did you turn off that setting? (i.e. if we're using the testing
> profile rather than our own settings, we shouldn't have differences like that
> right?)

I didn't do anything different tonight except make sure I wasn't
doing "focus follows mouse pointer". As doing that fixed up
the other two failures, I wanted to see what effect it had on this one.

Comment 13 Joanmarie Diggs (IRC: joanie) 2008-02-29 02:30:02 UTC

Oh, gotcha.  (I'm not having a "smart day" today).

Regardless.... It would seem that the gtk-demo regression tests are reliably reproducible among multiple machines in Ubuntu Hardy.

Comment 14 Willie Walker 2008-02-29 02:30:50 UTC

The property is a gconf property:

gconftool-2 --set /apps/metacity/general/focus_mode --type string mouse
gconftool-2 --set /apps/metacity/general/focus_mode --type string click

The window focus mode indicates how windows are activated. It has three possible values; "click" means windows must be clicked in order to focus them, "sloppy" means windows are focused when the mouse enters the window, and "mouse" means windows are focused when the mouse enters the window and unfocused when the mouse leaves the window.

Comment 15 Rich Burridge 2008-02-29 04:07:37 UTC

> gconftool-2 --set /apps/metacity/general/focus_mode --type string mouse
> gconftool-2 --set /apps/metacity/general/focus_mode --type string click

Then I would suggest that the test harness be modified to --get the
focus_mode resource before running the tests and save its value.
Then call:

gconftool-2 --set /apps/metacity/general/focus_mode --type string click

then restore the saved value at the end.

That should reduce the number of test failures (certainly at least four
in my case).
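
A rough sketch of that save/set/restore idea, written in Python purely for illustration (runone.sh itself is a shell script, so this is an assumption about where such logic could live). The names forced_focus_mode, FOCUS_KEY and the tool parameter are hypothetical; the only external commands used are gconftool-2 --get, --set and --unset. Wrapping the restore in try/finally means it also runs when the test is interrupted with Ctrl+C, though not if the process is killed outright:

```python
import contextlib
import subprocess

FOCUS_KEY = "/apps/metacity/general/focus_mode"


def _gconftool(*args):
    """Run gconftool-2 with the given arguments and return its stdout."""
    completed = subprocess.run(
        ["gconftool-2", *args],
        check=True, capture_output=True, text=True,
    )
    return completed.stdout.strip()


@contextlib.contextmanager
def forced_focus_mode(mode="click", tool=_gconftool):
    """Temporarily force the metacity focus mode, then put the old value back.

    `tool` is injectable so the logic can be exercised without gconftool-2.
    """
    saved = tool("--get", FOCUS_KEY)
    tool("--set", FOCUS_KEY, "--type", "string", mode)
    try:
        yield saved
    finally:
        # Runs on normal exit, on exceptions and on KeyboardInterrupt.
        if saved:
            tool("--set", FOCUS_KEY, "--type", "string", saved)
        else:
            # The key had no value before, so clear it rather than set "".
            tool("--unset", FOCUS_KEY)
```

A harness could then run its tests inside "with forced_focus_mode():" so that the saved value is restored when the block exits.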

We probably should start new bugs for each of these problems and make
this a sort of meta-bug for the whole regression test thing.

Comment 16 Willie Walker 2008-02-29 14:44:00 UTC

(In reply to comment #15)
> > gconftool-2 --set /apps/metacity/general/focus_mode --type string mouse
> > gconftool-2 --set /apps/metacity/general/focus_mode --type string click
> 
> Then I would suggest that the test harness be modified to --get the
> focus_mode resource before running the tests and save its value.
> Then call:
> 
> gconftool-2 --set /apps/metacity/general/focus_mode --type string click
> 
> then restore the saved value at the end.

This is of course failure prone (i.e., what if I Ctrl+C out of a test?).  In addition, there are bound to be other user-specific settings that can cause inadvertent differences in the test results.

I think the best thing to do might be to set up a test user that has a controlled environment for testing.  With the recent changes that allow keyboard modifiers to work better with synthesized AT-SPI events in a VNC session, we might be pretty close to this.  The benefit of that should also be that your machine will not be tied up while you are running the tests.

Since you suggested the set/reset scheme, and you are the one who uses a non-default pointer/focus scheme, however, if the set/reset scheme works for you, I say go for it.  :-)

Comment 17 Rich Burridge 2008-02-29 15:15:31 UTC

> Rich, that means we match! No more diffs.

Great!

If my results are now the same as yours, then we have the following
"unexpected" failures:

SUMMARY: 0 SUCCEEDED and 1 FAILED (1 UNEXPECTED) of 1 for /home/jd/orca/test/keystrokes/gtk-demo/debug_commands.py
SUMMARY: 6 SUCCEEDED and 4 FAILED (4 UNEXPECTED) of 10 for /home/jd/orca/test/keystrokes/gtk-demo/role_column_header.py
SUMMARY: 0 SUCCEEDED and 4 FAILED (4 UNEXPECTED) of 4 for /home/jd/orca/test/keystrokes/gtk-demo/role_combo_box2.py
SUMMARY: 11 SUCCEEDED and 4 FAILED (4 UNEXPECTED) of 15 for /home/jd/orca/test/keystrokes/gtk-demo/role_combo_box.py
SUMMARY: 3 SUCCEEDED and 4 FAILED (0 UNEXPECTED) of 7 for /home/jd/orca/test/keystrokes/gtk-demo/role_icon.py
SUMMARY: 4 SUCCEEDED and 5 FAILED (0 UNEXPECTED) of 9 for /home/jd/orca/test/keystrokes/gtk-demo/role_label.py
SUMMARY: 2 SUCCEEDED and 2 FAILED (2 UNEXPECTED) of 4 for /home/jd/orca/test/keystrokes/gtk-demo/role_page_tab.py
SUMMARY: 3 SUCCEEDED and 2 FAILED (0 UNEXPECTED) of 5 for /home/jd/orca/test/keystrokes/gtk-demo/role_radio_button.py
SUMMARY: 3 SUCCEEDED and 4 FAILED (0 UNEXPECTED) of 7 for /home/jd/orca/test/keystrokes/gtk-demo/role_spin_button.py
SUMMARY: 0 SUCCEEDED and 4 FAILED (0 UNEXPECTED) of 4 for /home/jd/orca/test/keystrokes/gtk-demo/role_split_pane.py
SUMMARY: 1 SUCCEEDED and 6 FAILED (6 UNEXPECTED) of 7 for /home/jd/orca/test/keystrokes/gtk-demo/role_table.py
SUMMARY: 95 SUCCEEDED and 4 FAILED (4 UNEXPECTED) of 99 for /home/jd/orca/test/keystrokes/gtk-demo/role_text_multiline_navigation.py
SUMMARY: 11 SUCCEEDED and 3 FAILED (0 UNEXPECTED) of 14 for /home/jd/orca/test/keystrokes/gtk-demo/role_tree_table.py

But not all of these are UNEXPECTED. If I do:

  $ grep "BUG?" jd-patched.out | wc -l
  49

This suggests to me that 47 of the 49 tests were expected to fail because of
bugs. Now it might be that these bugs have been fixed and the regression
test simply hasn't been updated, or that the bug still exists.

I plan to open up new bugs against each of these "BUG?" tests so they
can be investigated in more detail. They will all block this bug.

More than likely there will be duplicates, but we can close them as such as
we find them.

We can then divide them up and start working on them.

Comment 18 Rich Burridge 2008-02-29 16:22:03 UTC

Okay, all "BUG?" bugs filed. The number of BUG?'s is actually misleading. 
Lots of them are in the regression test files but not part of a test 
assertion.

Note that there are other UNEXPECTED failures beyond the known BUG?'s.

We'll get to them after evaluating the BUG? ones first.

Comment 19 Joanmarie Diggs (IRC: joanie) 2008-03-01 19:51:03 UTC

(In reply to comment #16)
> I think the best thing to do might be to set up a test user that has a
> controlled environment for testing.  

We should define what this environment is then.  For instance, these days a standard Ubuntu install uses compiz rather than metacity.  I personally switch to metacity, but if I didn't -- and/or if Rich didn't -- it's another case where we'd see differences.  Two that immediately spring to mind:

* Alt+Tabbing: We don't speak the name of the window in the switcher
  due to a compiz bug.

* Shift+F10 in compiz is bound to something else and thus any test
  that relies upon it to bring up a "right click" menu would fail.

Comment 20 Willie Walker 2008-03-06 22:52:52 UTC

OK - rule #1: compiz out, metacity in.

(In reply to comment #19)
> (In reply to comment #16)
> > I think the best thing to do might be to set up a test user that has a
> > controlled environment for testing.  
> 
> We should define what this environment is then.  For instance, these days a
> standard Ubuntu install uses compiz rather than metacity.  I personally switch
> to metacity, but if I didn't -- and/or if Rich didn't -- it's another case
> where we'd see differences.  Two that immediately spring to mind:
> 
> * Alt+Tabbing: We don't speak the name of the window in the switcher
>   due to a compiz bug.
> 
> * Shift+F10 in compiz is bound to something else and thus any test
>   that relies upon it to bring up a "right click" menu would fail.
>

Comment 21 Willie Walker 2008-03-11 14:06:17 UTC

First coarse pass at GNOME 2.24 planning.

Comment 22 Rich Burridge 2008-03-18 01:28:06 UTC

Created attachment 107503 [details]
Latest results from running the gtk-demo regression tests.

I've just run all the gtk-demo regression tests again against Orca SVN trunk.

Here is a summary of the ones that are currently failing:

SUMMARY: 0 SUCCEEDED and 1 FAILED (1 UNEXPECTED) of 1 for /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/debug_commands.py
SUMMARY: 6 SUCCEEDED and 4 FAILED (4 UNEXPECTED) of 10 for /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_column_header.py
SUMMARY: 11 SUCCEEDED and 4 FAILED (4 UNEXPECTED) of 15 for /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_combo_box.py
SUMMARY: 8 SUCCEEDED and 1 FAILED (1 UNEXPECTED) of 9 for /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_label.py
SUMMARY: 3 SUCCEEDED and 2 FAILED (0 UNEXPECTED) of 5 for /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_radio_button.py
SUMMARY: 5 SUCCEEDED and 2 FAILED (1 UNEXPECTED) of 7 for /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_spin_button.py
SUMMARY: 1 SUCCEEDED and 6 FAILED (6 UNEXPECTED) of 7 for /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_table.py
SUMMARY: 95 SUCCEEDED and 4 FAILED (4 UNEXPECTED) of 99 for /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_text_multiline_navigation.py

I've attached a more detailed set of results.

I haven't fully evaluated them yet.

Comment 23 Joanmarie Diggs (IRC: joanie) 2008-03-18 03:38:01 UTC

I just ran them on my primary Hardy box (fully updated, plus gtk+ from trunk).

-----------------------

> SUMMARY: 0 SUCCEEDED and 1 FAILED (1 UNEXPECTED) of 1 for
> /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/debug_commands.py

Confirmed, although the version number I have differs from both the expected and from yours.  (I have 2.13.1)

-----------------------

> SUMMARY: 6 SUCCEEDED and 4 FAILED (4 UNEXPECTED) of 10 for
> /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_column_header.py

NOT confirmed:

SUMMARY: 10 SUCCEEDED and 0 FAILED (0 UNEXPECTED) of 10 for /home/jd/orca/test/keystrokes/gtk-demo/role_column_header.py

I wonder if this is related to the changes that were made in Gail w.r.t. selection state. (I'm not convinced those made it into the separate Gail.)

-----------------------

> SUMMARY: 11 SUCCEEDED and 4 FAILED (4 UNEXPECTED) of 15 for
> /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_combo_box.py

> SUMMARY: 8 SUCCEEDED and 1 FAILED (1 UNEXPECTED) of 9 for
> /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_label.py

> SUMMARY: 5 SUCCEEDED and 2 FAILED (1 UNEXPECTED) of 7 for
> /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_spin_button.py

> SUMMARY: 95 SUCCEEDED and 4 FAILED (4 UNEXPECTED) of 99 for
> /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_text_multiline_navigation.py

Confirmed.

-----------------------

> SUMMARY: 3 SUCCEEDED and 2 FAILED (0 UNEXPECTED) of 5 for
> /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_radio_button.py

SUMMARY: 0 SUCCEEDED and 5 FAILED (3 UNEXPECTED) of 5 for /home/jd/orca/test/keystrokes/gtk-demo/role_radio_button.py

Confirmed, but I have additional errors.  Failure seems to be the result of having installed a printer and having a different dialog.

-----------------------

> SUMMARY: 1 SUCCEEDED and 6 FAILED (6 UNEXPECTED) of 7 for
> /home/richb/gnome/orca/trunk/test/keystrokes/gtk-demo/role_table.py

NOT confirmed.

SUMMARY: 7 SUCCEEDED and 0 FAILED (0 UNEXPECTED) of 7 for /home/jd/orca/test/keystrokes/gtk-demo/role_table.py

Couldn't tell ya....

-----------------------


In addition I have these failures:

-----------------------

SUMMARY: 4 SUCCEEDED and 3 FAILED (3 UNEXPECTED) of 7 for /home/jd/orca/test/keystrokes/gtk-demo/role_icon.py

(Failure is the result of item counts that don't match and will presumably be handled by the AAAAA fix)

-----------------------

SUMMARY: 1 SUCCEEDED and 3 FAILED (3 UNEXPECTED) of 4 for /home/jd/orca/test/keystrokes/gtk-demo/role_page_tab.py

(Failure is the result of my installing a printer.)

-----------------------

Comment 24 Rich Burridge 2008-03-18 20:18:44 UTC

I've added in new bugs for the regression test failures that haven't
already been dealt with. These are:

Bug #523234 - gtk-demo/debug_commands.py regression test #1 produces the wrong results.
Bug #523235 - gtk-demo/role_column_header.py regression tests #3, #4, #7 and #8 produce the wrong results.
Bug #523236 - gtk-demo/role_combo_box.py regression tests #12, #13, #14 and #15 produce the wrong results.
Bug #523237 - gtk-demo/role_table.py regression tests 1, 2, 3, 4, 6, and 7 produce the wrong results.
Bug #523238 - gtk-demo role_text_multiline_navigation.py regression tests 89, 90, 91 and 93 produce the wrong results.

I'll now add Joanie's comments from comment #23 above to each of those bugs.

Comment 25 Willie Walker 2008-03-19 16:27:37 UTC

Created attachment 107628 [details]
runall.sh results

Just as a point of reference, I ran the gtk-demo regression tests on SXDE 01/08 with atk/gail/at-spi/orca from trunk.  Here's a summary of the failures I'm observing in the attached runall.out file (I manually added some blank lines to make it more readable):

bash-3.2$ egrep "SUMMARY|FAILED" runall.out | grep -v "0 FAILED"

Test 1 of 1 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/debug_commands.py:Report script information
SUMMARY: 0 SUCCEEDED and 1 FAILED (1 UNEXPECTED) of 1 for /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/debug_commands.py

Test 2 of 7 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_icon.py:Layered pane Where Am I
Test 4 of 7 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_icon.py:bin icon Where Am I
Test 7 of 7 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_icon.py:icon selection Where Am I
SUMMARY: 4 SUCCEEDED and 3 FAILED (3 UNEXPECTED) of 7 for /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_icon.py

Test 5 of 9 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_label.py:This message box label caret select 'his' of 'This'
SUMMARY: 8 SUCCEEDED and 1 FAILED (1 UNEXPECTED) of 9 for /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_label.py

Test 3 of 5 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_radio_button.py:Range radio button
Test 5 of 5 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_radio_button.py:All radio button
SUMMARY: 3 SUCCEEDED and 2 FAILED (0 UNEXPECTED) of 5 for /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_radio_button.py

Test 1 of 7 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_spin_button.py:Hue spin button
Test 4 of 7 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_spin_button.py:Hue spin button decrement value
SUMMARY: 5 SUCCEEDED and 2 FAILED (1 UNEXPECTED) of 7 for /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_spin_button.py

Test 1 of 7 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_table.py:Table initial focus
Test 2 of 7 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_table.py:Table Where Am I
Test 3 of 7 FAILED: /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_table.py:Table down one line
SUMMARY: 4 SUCCEEDED and 3 FAILED (3 UNEXPECTED) of 7 for /export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_table.py
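
The same filtering can be done on parsed numbers rather than on text, which avoids one quirk of grep -v "0 FAILED": that pattern would also drop a line reading "10 FAILED". The following is a minimal sketch only; Summary, parse_summary and failing_summaries are hypothetical names, and the regular expression is inferred from the SUMMARY lines shown in this bug rather than from the harness source:

```python
import re
from typing import NamedTuple, Optional


class Summary(NamedTuple):
    succeeded: int
    failed: int
    unexpected: int
    total: int
    path: str


# Pattern inferred from the SUMMARY lines quoted in this bug report.
SUMMARY_RE = re.compile(
    r"^SUMMARY: (\d+) SUCCEEDED and (\d+) FAILED "
    r"\((\d+) UNEXPECTED\) of (\d+) for (.+)$"
)


def parse_summary(line: str) -> Optional[Summary]:
    """Parse one SUMMARY line; return None for any other kind of line."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        return None
    succeeded, failed, unexpected, total, path = match.groups()
    return Summary(int(succeeded), int(failed), int(unexpected), int(total), path)


def failing_summaries(lines):
    """Return the summaries that report at least one failed test."""
    result = []
    for line in lines:
        summary = parse_summary(line)
        if summary is not None and summary.failed > 0:
            result.append(summary)
    return result


SAMPLE = [
    "SUMMARY: 5 SUCCEEDED and 2 FAILED (1 UNEXPECTED) of 7 for "
    "/export/home/wwalker/orca/trunk/test/keystrokes/gtk-demo/role_spin_button.py",
    "SUMMARY: 7 SUCCEEDED and 0 FAILED (0 UNEXPECTED) of 7 for "
    "/home/jd/orca/test/keystrokes/gtk-demo/role_table.py",
]

for summary in failing_summaries(SAMPLE):
    print(summary.path, summary.failed, summary.unexpected)
```

Having the counts as integers also makes it straightforward to compare two people's runs test file by test file.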

Comment 26 Joanmarie Diggs (IRC: joanie) 2008-03-19 22:23:00 UTC

Trying to keep the "rules" in one place. :-)  Here's the current list.

Rule #1: compiz out, metacity in:
comment #20
 
Rule #2: Use the latest OOo:
http://bugzilla.gnome.org/show_bug.cgi?id=521651#c4

Rule #3: You need a jre installed with OOo:
http://bugzilla.gnome.org/show_bug.cgi?id=521651#c11

Comment 27 Willie Walker 2008-03-25 14:36:07 UTC

(In reply to comment #26)
> Trying to keep the "rules" in one place. :-)  Here's the current list.
> 
> Rule #1: compiz out, metacity in:
> comment #20
> 
> Rule #2: Use the latest OOo:
> http://bugzilla.gnome.org/show_bug.cgi?id=521651#c4
> 
> Rule #3: You need a jre installed with OOo:
> http://bugzilla.gnome.org/show_bug.cgi?id=521651#c11
> 

It would probably be best to keep these under the "Set up an 'orca' Test Account" section of the WIKI.  I've made some updates here:  http://live.gnome.org/Orca/RegressionTesting