Bug 529784 – Speech cannot always be interrupted with flat review


Summary:	Speech cannot always be interrupted with flat review


Status:	RESOLVED FIXED

Product:	orca
Classification:	Applications
Component:	speech
Version:	2.21.x
Hardware:	Other All

Importance:	Normal major
Target Milestone:	2.24.0
Assigned To:	Willie Walker
QA Contact:	Orca Maintainers

URL:
Whiteboard:

Depends on:	535493
Blocks:

Reported:	2008-04-24 21:29 UTC by Joanmarie Diggs (IRC: joanie)
Modified:	2009-03-10 00:04 UTC

See Also:
GNOME target:	---
GNOME version:	2.21/2.22

Attachments
revision 1 (2.37 KB, patch) 2008-04-24 21:35 UTC, Joanmarie Diggs (IRC: joanie)	none
revision 2 (proof of concept) (4.46 KB, patch) 2008-04-29 23:13 UTC, Joanmarie Diggs (IRC: joanie)	none
minor twek (4.39 KB, patch) 2008-04-30 22:13 UTC, Joanmarie Diggs (IRC: joanie)	rejected
This attachment cannot be described briefly :-) (4.06 KB, patch) 2008-05-24 05:06 UTC, Joanmarie Diggs (IRC: joanie)	rejected

Description Joanmarie Diggs (IRC: joanie) 2008-04-24 21:29:49 UTC

Steps to reproduce:

1. Locate a long line of text

2. Press KP_8 three times quickly to spell the line phonetically

3. Press Control almost immediately after to interrupt speech.

Expected results:  Speech would be interrupted.

Actual results:  Speech may or may not be interrupted and Orca resumes speaking/spelling the line regardless.

Comment 1 Joanmarie Diggs (IRC: joanie) 2008-04-24 21:35:20 UTC

Created attachment 109858
revision 1

This patch adds a couple of speech.stop() calls for flat review.

I also added a lastInputEventReleased to orca_state.py so that we only register clicks when the key has been released and re-pressed (or when we get events to that effect anyway).
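
A minimal sketch of the release-gated click counting described above. Everything here is hypothetical and for illustration only: the ClickCounter class, its method names, and the 0.5-second timeout are assumptions, and last_released merely stands in for the lastInputEventReleased field. The actual patch may differ.

```python
import time


class ClickCounter:
    """Hypothetical helper that counts repeated presses of the same key."""

    def __init__(self, double_click_timeout=0.5):
        self.double_click_timeout = double_click_timeout  # assumed value
        self.last_key = None
        self.last_press_time = 0.0
        self.last_released = True  # stand-in for lastInputEventReleased
        self.click_count = 0

    def on_press(self, key, now=None):
        now = time.time() if now is None else now
        if not self.last_released and key == self.last_key:
            # The key is still held down (auto-repeat), so this press
            # does not count as a new click.
            return self.click_count
        same_key = key == self.last_key
        in_time = (now - self.last_press_time) <= self.double_click_timeout
        self.click_count = self.click_count + 1 if (same_key and in_time) else 1
        self.last_key = key
        self.last_press_time = now
        self.last_released = False
        return self.click_count

    def on_release(self, key):
        if key == self.last_key:
            self.last_released = True
```

With this gate in place, holding KP_8 down produces a single click, while press, release, press within the timeout produces a double click.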

Comment 2 Joanmarie Diggs (IRC: joanie) 2008-04-24 22:09:50 UTC

Kenny just reported via #orca that this patch works for him.  Yea!

speech.stop() is harmless, but I'm wondering about the addition to orca_state.py.  Please test.  Will please review.  Thanks!

Comment 3 Mike Pedersen 2008-04-25 17:45:10 UTC

Seems good pending Will's comments.

Comment 4 Willie Walker 2008-04-29 14:10:45 UTC

(In reply to comment #2)
> Kenny just reported via #orca that this patch works for him.  Yea!
> 
> speech.stop() is harmless, but I'm wondering about the addition to
> orca_state.py.  Please test.  Will please review.  Thanks!

Would you be able to look at lastInputEvent.type (with all the usual checks to make sure it's a KeyboardEvent) instead of adding a new field?

I'm also curious about just *why* this is solving the problem and what the root cause of the problem is in the first place.  If you put debug statements around the calls to gnome_speaker.say and gnome_speaker.stop in gnomespeechfactory.py, can you get an idea if either one is taking a long time to complete?  For example, record time.time() before the call and then do a debug output of time.time() - theRecordedTime after the call.
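
The timing check suggested above could look roughly like this. It is only a sketch: timed_call and FakeSpeaker are hypothetical names, and FakeSpeaker merely stands in for the gnome-speech speaker object, whose real say and stop signatures are not shown in this report.

```python
import time


def timed_call(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, elapsed)."""
    start = time.time()
    result = func(*args, **kwargs)
    elapsed = time.time() - start
    print("%s took %.4f seconds" % (label, elapsed))
    return result, elapsed


class FakeSpeaker:
    """Stand-in for a speaker whose say() call blocks for a while."""

    def say(self, text):
        time.sleep(0.05)  # simulate a slow, blocking call
        return len(text)

    def stop(self):
        return True
```

Wrapping both calls, e.g. timed_call("say", speaker.say, text) and timed_call("stop", speaker.stop), would show which of the two is the slow one.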

Comment 5 Joanmarie Diggs (IRC: joanie) 2008-04-29 23:13:43 UTC

Created attachment 110136
revision 2 (proof of concept)

Will and I chatted about this some the other day via phone and some more today via IRC.  He thought perhaps doing the spell as a sayAll with a progress callback might solve it.  It seems to. <fingers crossed>

As I indicated in the description, this is the proof of concept version with:

* The addition of a __spellItemProgressCallback()

  Based on the __sayAllProgressCallback(), but I *think* all we
  care about is if we're still talking.  If the user is spelling
  an item out and interrupts that spelling, we don't want to move
  the caret.  Nor do we want to update the locusOfFocus.  (Right?)

* The addition of a spellingGenerator()
  Right now it does a little bit of extra "stuff" because it's
  creating a SayAllContext which is used by spellCurrentItem()
  and phoneticSpellCurrentItem() which, in turn, call sayAll().

  Question:  This seems to get the job done, and it seems simple,
  but it also seems hacky.  We're not doing a sayAll.  We don't
  need the starting and ending offsets.  I don't think we even
  need to know the accessible.  Therefore, do we want to add
  a method equivalent to sayAll() to speech.py and the speech
  generators specific to spelling, along with spellingContext?

* The speech.stop() calls from the first revision are still in place.

  Without them, if you go to spell the current line, we seem to
  sometimes feel the need to say the whole line before spelling
  it.  At least with eSpeak.  Swift doesn't seem to suffer from
  this.

Like I said, seems to work so far.  Already pylinted.
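
As a rough illustration of the generator-plus-callback idea (not the code in the attachment), the sketch below feeds one utterance at a time and checks for interruption between utterances. The names spelling_generator, say_all, is_interrupted, and the two status constants are assumptions made for this example. Orca's real sayAll() is driven by the speech server and takes different arguments.

```python
INTERRUPTED = "interrupted"  # hypothetical status values
COMPLETED = "completed"


def spelling_generator(text):
    """Yield one utterance per character of the item being spelled."""
    for char in text:
        yield char


def say_all(utterances, speak, is_interrupted, progress_callback):
    """Speak utterances one by one, stopping as soon as we are interrupted.

    Because the next utterance is only requested after the previous one
    was handed off, nothing is queued ahead that could resume speaking
    after the user presses a key.
    """
    for utterance in utterances:
        if is_interrupted():
            # For spelling we only care that we stopped talking; the
            # caret and the locus of focus are left alone.
            progress_callback(INTERRUPTED)
            return
        speak(utterance)
    progress_callback(COMPLETED)
```

The contrast is with a plain for loop that hands every character to the synthesizer up front, where an interrupt only cancels what is currently being spoken.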

Comment 6 Joanmarie Diggs (IRC: joanie) 2008-04-30 22:13:06 UTC

Created attachment 110194
minor twek

Chatting with Will.  He suggested we just return in the callback.  Done.

Comment 7 Willie Walker 2008-05-01 13:48:18 UTC

(In reply to comment #6)
> Created an attachment (id=110194)
> minor twek
> 
> Chatting with Will.  He suggested we just return in the callback.  Done.

I think this looks good.  It also helps set us up for a model for how we might consider turning speakUtterances into something that could use the SayAll framework, but that's for another day.  I say commit if you're sure this tests/pylints well.  Thanks!

PS - If the problem this is fixing is also something that was introduced as a side effect of the fix for bug #440490, it seems like this should also be checked into the gnome-2-22 branch.

Comment 8 Joanmarie Diggs (IRC: joanie) 2008-05-01 14:21:28 UTC

Thanks Will.  It pylints.  Given the nature of the change, I'll need to run the full suite of regression tests (as opposed to just Gecko). In the meantime, I would certainly appreciate some testing from Mike.  :-)

Comment 9 Mike Pedersen 2008-05-07 22:38:11 UTC

This seems to work well for me.

Comment 10 Joanmarie Diggs (IRC: joanie) 2008-05-08 20:09:45 UTC

The only difference in the gtk-demo tests is the explicit speaking of the (localized) word "space."  Before we just sent the space character to be spoken.

I assume this difference is acceptable, but I figured I'd ask first. :-)

Test 54 of 87 FAILED: /home/jd/orca/test/keystrokes/gtk-demo/role_text_multiline_navigation.py:KP_8 2X to spell 'This is only'
DIFFERENCES FOUND:
  BRAILLE LINE:  'This is only  $l'
       VISIBLE:  'This is only  $l', cursor=6
  BRAILLE LINE:  'This is only  $l'
       VISIBLE:  'This is only  $l', cursor=6
  SPEECH OUTPUT: 'This is only 
  '
  SPEECH OUTPUT: 'T'
  SPEECH OUTPUT: 'h'
  SPEECH OUTPUT: 'i'
  SPEECH OUTPUT: 's'
- SPEECH OUTPUT: ' '
?                 ^

+ SPEECH OUTPUT: 'space'
?                 ^^^^^

  SPEECH OUTPUT: 'i'
  SPEECH OUTPUT: 's'
- SPEECH OUTPUT: ' '
?                 ^

+ SPEECH OUTPUT: 'space'
?                 ^^^^^

  SPEECH OUTPUT: 'o'
  SPEECH OUTPUT: 'n'
  SPEECH OUTPUT: 'l'
  SPEECH OUTPUT: 'y'
- SPEECH OUTPUT: ' '
?                 ^

+ SPEECH OUTPUT: 'space'
?                 ^^^^^

  SPEECH OUTPUT: '
  '

Test 55 of 87 FAILED: /home/jd/orca/test/keystrokes/gtk-demo/role_text_multiline_navigation.py:KP_8 3X to military spell 'This is only'
DIFFERENCES FOUND:
  BRAILLE LINE:  'This is only  $l'
       VISIBLE:  'This is only  $l', cursor=6
  BRAILLE LINE:  'This is only  $l'
       VISIBLE:  'This is only  $l', cursor=6
  BRAILLE LINE:  'This is only  $l'
       VISIBLE:  'This is only  $l', cursor=6
  SPEECH OUTPUT: 'This is only 
  '
  SPEECH OUTPUT: 'T'
  SPEECH OUTPUT: 'h'
  SPEECH OUTPUT: 'i'
  SPEECH OUTPUT: 's'
- SPEECH OUTPUT: ' '
?                 ^

+ SPEECH OUTPUT: 'space'
?                 ^^^^^

  SPEECH OUTPUT: 'i'
  SPEECH OUTPUT: 's'
- SPEECH OUTPUT: ' '
?                 ^

+ SPEECH OUTPUT: 'space'
?                 ^^^^^

  SPEECH OUTPUT: 'o'
  SPEECH OUTPUT: 'n'
  SPEECH OUTPUT: 'l'
  SPEECH OUTPUT: 'y'
- SPEECH OUTPUT: ' '
?                 ^

+ SPEECH OUTPUT: 'space'
?                 ^^^^^

  SPEECH OUTPUT: '
  '
  SPEECH OUTPUT: 'tango'
  SPEECH OUTPUT: 'hotel'
  SPEECH OUTPUT: 'india'
  SPEECH OUTPUT: 'sierra'
- SPEECH OUTPUT: ' '
?                 ^

+ SPEECH OUTPUT: 'space'
?                 ^^^^^

  SPEECH OUTPUT: 'india'
  SPEECH OUTPUT: 'sierra'
- SPEECH OUTPUT: ' '
?                 ^

+ SPEECH OUTPUT: 'space'
?                 ^^^^^

  SPEECH OUTPUT: 'oscar'
  SPEECH OUTPUT: 'november'
  SPEECH OUTPUT: 'lima'
  SPEECH OUTPUT: 'yankee'
- SPEECH OUTPUT: ' '
?                 ^

+ SPEECH OUTPUT: 'space'
?                 ^^^^^

  SPEECH OUTPUT: '
  '
[FAILURE WAS UNEXPECTED]
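
The behavior change visible in the diff above, speaking the word "space" instead of sending the raw space character, can be sketched as follows. This is an illustration only: CHAR_NAMES, PHONETIC, spell, and phonetic_spell are hypothetical names, the phonetic table covers only the letters that appear in the test output, and a real implementation would use localized names.

```python
# Spoken names for characters that are silent or ambiguous when sent raw.
CHAR_NAMES = {" ": "space"}

# Partial phonetic table: only the letters seen in the test output above.
PHONETIC = {
    "t": "tango", "h": "hotel", "i": "india", "s": "sierra",
    "o": "oscar", "n": "november", "l": "lima", "y": "yankee",
}


def spell(text):
    """Return one utterance per character, naming the space explicitly."""
    return [CHAR_NAMES.get(char, char) for char in text]


def phonetic_spell(text):
    """Like spell(), but letters are replaced by their phonetic words."""
    return [CHAR_NAMES.get(char) or PHONETIC.get(char.lower(), char)
            for char in text]
```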

Comment 11 Willie Walker 2008-05-09 11:20:03 UTC

(In reply to comment #10)
> The only difference in the gtk-demo tests is the explicit speaking of the
> (localized) word "space."  Before we just sent the space character to be
> spoken.
> 
> I assume this difference is acceptable, but I figured I'd ask first. :-)

Speaking 'space' is probably a fix, even.  :-)

Comment 12 Joanmarie Diggs (IRC: joanie) 2008-05-10 20:16:14 UTC

Patch (and updated regression test) committed to trunk.

Will, I looked at doing the same for the 2-22 branch.  In doing so I noticed that the original spellCurrentItem() was using speakCharacter() -- in trunk it looks like we decided not to do that.  Were I to commit this change to branch, we'd no longer be doing speakCharacter().  So.... Do I:

1. Commit it as-is, removing speakCharacter()

2. Modify the patch to use speakCharacter()

3. Not commit to branch

?

Comment 13 Willie Walker 2008-05-10 21:02:24 UTC

2.22.2 is not due out until 26-May, and I'm also chatting with Kenny on IRC right now about this issue.  It turns out it may not be completely solved (Kenny is working on a test case to post to this bug).  So, how about we let this fester in trunk for a bit until we completely resolve the problem and then get things back into branch?
 
(In reply to comment #12)
> Patch (and updated regression test) committed to trunk.
> 
> Will, I looked at doing the same for the 2-22 branch.  In doing so I noticed
> that the original spellCurrentItem() was using speakCharacter() -- in trunk it
> looks like we decided not to do that.  Were I to commit this change to branch,
> we'd no longer be doing speakCharacter().  So.... Do I:
> 
> 1. Commit it as-is, removing speakCharacter()
> 
> 2. Modify the patch to use speakCharacter()
> 
> 3. Not commit to branch
> 
> ?
>

Comment 14 Kenny Hitt 2008-05-10 22:15:55 UTC

Since I can no longer produce the bug with the review keys, I have found the following test case.  Open http://bugzilla.gnome.org/show_bug.cgi?id=529784 or almost any page in Firefox that has a lot of text.  Hold down the down arrow for a second to scroll down the page.  Press left control to silence speech.
Expected behavior: Orca should stop speaking.
Actual behavior: Orca will continue to read the text on the page.  Neither control nor orca-s to silence speech will stop the speech output.
Orca SVN revision 3893 gnome-speech 0.4.19 espeak 1.37 Firefox nightly build from 05/10/2008.

Comment 15 Joanmarie Diggs (IRC: joanie) 2008-05-24 05:06:01 UTC

Created attachment 111448
This attachment cannot be described briefly :-)

Kid to parent: "Hey, what's that strange lady doing with all of those straws?"

Parent: "Shhhh, honey, she's grasping at them.  Don't stare."

Anyhoo.... I may be onto a solution, or not.  So.... Lemme 'splain:

If we send a lot of speech to espeak via speech.speak() in a loop, we can't seem to interrupt espeak.  I mean, we can, but then it just starts talking again.  Swift doesn't have this issue.  No, I don't know why yet.  Regardless, what solved the problem in this bug w.r.t. flat review is to nuke the for loop and use the SayAll approach (iterators, progress callbacks, etc.) instead.  So, I did that with the Gecko speakContents().  And it helped. Some.

The other problem is that when you hold down the Up or Down Arrow for an extended period of time, we are desperately trying to figure out the line contents.  That's gotten to be speedier over the past release or so, but it still is not fast enough. :-(  I'll keep working on line nav efficiency, but until we get reliable caret navigation implemented by the Firefox guys, there's only so much we can do....  So, here comes the straws bit:

Idea:  If you're holding down the arrow key or pressing it that freakin' fast, you're probably skimming and/or scrolling.  You at most need the first part of the line.  Eventually you'll stop and see where you are and make a decision.  Gecko has some built-in caret navigation that works. :-)  Therefore, if the user is holding/pressing the key super fast, let Gecko temporarily control the caret.  We'll keep track of the last time a caret navigation command was used that was meant for us.  Meant for us means:

1. We're using Orca's caret navigation rather than Gecko's

2. We're in document content

3. We're not in ARIA

4. We're not in one of those special things where we let Gecko handle it (e.g. entries, combo boxes, etc.)

If it's meant for us, we store the time.  If it's not meant for us, we'll clear the time.  But.... ***Important Part*** We won't handle this command which was meant for us to handle, if it occurred within a certain period of time following the last such command.  i.e. if you press down arrow twice in a row within, say, 1/3 of a second, we'll let Firefox have that second key press.

Problem:  What's the magical time?  Answer:  I dunno.  Based on experimenting with rapid scrolling, the 1/3 of a second seemed to accomplish what I believe we want.
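
A minimal sketch of the throttling rule just described, assuming the 1/3-second threshold. CaretNavThrottle and should_handle are hypothetical names, and the meant_for_us flag stands for the four conditions listed above, which are not modeled here.

```python
import time


class CaretNavThrottle:
    """Decide whether Orca or Gecko should handle a caret navigation key."""

    def __init__(self, min_interval=1.0 / 3):
        self.min_interval = min_interval  # the "magical time", in seconds
        self.last_command_time = None

    def should_handle(self, meant_for_us, now=None):
        now = time.time() if now is None else now
        if not meant_for_us:
            # Not ours: clear the stored time and let Gecko handle it.
            self.last_command_time = None
            return False
        previous = self.last_command_time
        self.last_command_time = now  # ours: always store the time
        if previous is not None and (now - previous) < self.min_interval:
            # Too soon after the last such command: the user is probably
            # skimming, so let Firefox have this key press.
            return False
        return True
```

Keeping the threshold as a constructor argument makes it easy to experiment with other values.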

Let's assume that this part of the solution is good and fine for now.  It introduces another problem:  Normally when we position the caret using Orca's caret navigation, Firefox doesn't emit caret-moved events.  However, when Firefox/Gecko is controlling the caret, it emits them like they're going out of style.  When Gecko controls the caret and we get caret-moved events like this, they get passed along to the default script, which ultimately passes it back to us to display in speech and in braille.  But the speech doesn't use speakContents in that case, and updating braille means it goes after the line contents which is the thing that's slowing us down in the first place.

Idea2: If we detect rapid fire caret moved events immediately following our setting of a "last caret navigation" time, we move the caret context to wherever Firefox tells us (via the caret moved event), we clear our line cache because we don't know where the heck we are, and we wait until the rapid moving is over. :-)

Problem2:  Again, what's the magical "immediately following" time?  Answer:  I dunno.  Based on experimenting with rapid scrolling, 2 seconds seemed to work.  I realize that sounds like a lot, but if you hold down the down arrow or whack on it a bunch quickly, we'll get the first couple of events sooner than 2 seconds.  But by the time we process those, a bunch more have shown up -- 3, 4, 5 seconds later in extreme cases.
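
Idea2 can be sketched along these lines, assuming the 2-second window. RapidCaretTracker, note_caret_nav, and on_caret_moved are hypothetical names, and the dictionary used as a line cache is a placeholder for whatever the script really caches.

```python
class RapidCaretTracker:
    """Absorb caret-moved events that arrive during rapid scrolling."""

    def __init__(self, window=2.0):
        self.window = window  # the "immediately following" time, in seconds
        self.last_caret_nav_time = None
        self.caret_context = None
        self.line_cache = {}

    def note_caret_nav(self, now):
        """Record the time of a caret navigation command meant for us."""
        self.last_caret_nav_time = now

    def on_caret_moved(self, obj, offset, now):
        """Return True if the event was absorbed without presenting it."""
        if (self.last_caret_nav_time is not None
                and now - self.last_caret_nav_time < self.window):
            # Trust Firefox's position, forget cached line contents, and
            # skip speech and braille until the rapid movement is over.
            self.caret_context = (obj, offset)
            self.line_cache.clear()
            return True
        return False  # normal path: pass the event along as usual
```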

So... Kenny.... If you wouldn't mind taking this patch for a test drive that would be great.  In terms of the magical numbers, if you look at the patch, I have added a variable called "timeDiff" in two places.  Those values are the aforementioned 1/3 second and 2 seconds.  If the patch doesn't quite work for you as you would like, see if altering those numbers a bit helps.  Also, this patch has been pylinted and tested just w.r.t. the problem you mentioned.  I have not regression tested it yet.  I don't know what side effects this solution might introduce.  If you find some, please let me know.  Thanks!!

Will, dare I ask... ;-) Thoughts on this? :-)

Comment 16 Kenny Hitt 2008-05-28 03:41:41 UTC

I can confirm Joanie's observation that this problem doesn't happen with Cepstral Swift.  The problem appears to be specific to espeak.  I can't test other synths.  Users of other synths will have to test and comment.  The severity of the problem with espeak seems to depend on its output method.
Portaudio 19: problem exists in Firefox.  Problem does not exist using the first test case for this bug (flat review).
Portaudio 18: problem exists in Firefox.  Also exists in flat review.
It appears this problem might actually be an espeak bug, a problem with the espeak gnome-speech driver, or both.
Although the problem of not being able to stop speech occurs with Cepstral Swift, it is rare.  So far, I can't reproduce it consistently with Swift.  When it does occur with Swift, it is the result of a command run in a gnome-terminal that produces a large amount of output.  However, rerunning the command doesn't always cause the same results.

Comment 17 Joanmarie Diggs (IRC: joanie) 2008-05-28 18:45:50 UTC

Changing the status of the patch which addressed the first issue raised in this bug to "rejected": That patch caused more problems than it fixed, and the bug in question is really an espeak bug.  Therefore, that patch was reversed (see http://bugzilla.gnome.org/show_bug.cgi?id=532982#c10).

Changing the status of the patch which addresses (sorta) the second issue raised in this bug to "rejected" because that patch admittedly sucks on a couple of different levels. :-)

Leaving Will's name on the summary because his wisdom is needed. Thanks!

Comment 18 Willie Walker 2008-05-28 20:29:25 UTC

(In reply to comment #16)
> I can confirm Joanie's observation that this problem doesn't happen with
> Cepstral Swift.  The problem appears to be specific to espeak.

Kenny's findings, together with some debugging yesterday, seem to confirm that the gnome-speech driver for eSpeak is indeed blocking on calls to 'say'.  The problem is that eSpeak seems to have a buffer that fills up, and the 'say' method of eSpeak's gnome-speech driver will enter a sleep/retry loop until the eSpeak buffer empties enough.

So...I'm thinking maybe we move this bug over to gnome-speech.  Ideally, eSpeak would not have this buffer limitation, and we've engaged the eSpeak developers in a conversation.  Failing that, we could potentially try to do some sort of queuing mechanism in the gnome-speech driver for eSpeak, but that's likely to get hairy.

Thoughts?
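
For illustration only, here is one shape the queuing idea mentioned above could take. It is a Python sketch of the concept, not the gnome-speech driver: QueuedSpeaker and its methods are hypothetical names, and blocking_say stands in for the driver's blocking 'say' call.

```python
import queue
import threading


class QueuedSpeaker:
    """Make say() return immediately by moving the blocking call to a worker."""

    def __init__(self, blocking_say):
        self._blocking_say = blocking_say
        self._queue = queue.Queue()
        self._generation = 0  # bumped on every stop()
        self._lock = threading.Lock()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def say(self, text):
        """Queue text and return at once; the caller never blocks."""
        with self._lock:
            self._queue.put((self._generation, text))

    def stop(self):
        """Invalidate everything queued so far."""
        with self._lock:
            self._generation += 1

    def wait(self):
        """Block until the queue has been drained (useful for testing)."""
        self._queue.join()

    def _run(self):
        while True:
            generation, text = self._queue.get()
            with self._lock:
                stale = generation != self._generation
            if not stale:
                self._blocking_say(text)  # may sleep/retry internally
            self._queue.task_done()
```

In this sketch an utterance that is already being spoken is not cut off by stop(); a real driver would also need to cancel the synthesizer's current output.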

Comment 19 Rich Burridge 2008-05-28 21:06:10 UTC

Rather than move it over, perhaps do what we do with bugs in
other applications and just open a new bug there, and block 
this one against it. Then Orca users only have to check here 
for potential problems that affect Orca.

Comment 20 Joanmarie Diggs (IRC: joanie) 2008-05-28 23:00:28 UTC

(In reply to comment #19)
> Rather than move it over, perhaps do what we do with bugs in
> other applications and just open a new bug there, and block 
> this one against it. Then Orca users only have to check here 
> for potential problems that affect Orca.

This makes sense to me.

Comment 21 Willie Walker 2008-05-29 13:51:27 UTC

Blocked on gnome-speech bug #535493.  Kenny, if you want to take a shot at modifying the gnome-speech driver for eSpeak, please let me know.

Comment 22 Joanmarie Diggs (IRC: joanie) 2008-05-30 17:53:26 UTC

Removing the 2.24 target from this bug because this is no longer in my hands.  It's up to those wacky gnome-speech guys now. ;-)

Comment 23 Kenny Hitt 2008-06-02 08:51:01 UTC

I decided to add this comment for information.  Until now, all tests I had done for this bug used gnome-speech.  I decided to test using speech-dispatcher with the direct espeak driver.  The problem doesn't appear with the speech-dispatcher driver.  This bug is definitely specific to the gnome-speech espeak driver.

Comment 24 Willie Walker 2008-07-08 19:15:07 UTC

The underlying gnome-speech driver for eSpeak has been modified to prevent blocking.  Closing as fixed.