Bug 400716 – [requirement] sayAll should be done by sentences.



Summary:	[requirement] sayAll should be done by sentences.


Status:	RESOLVED FIXED

Product:	orca
Classification:	Applications
Component:	general
Version:	2.17.x
Hardware:	Other All

Importance:	Normal enhancement
Target Milestone:	2.20.0
Assigned To:	Rich Burridge
QA Contact:	Orca Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2007-01-25 18:30 UTC by Rich Burridge
Modified:	2008-07-22 19:27 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
The sample text file I've been using to test this. (2.38 KB, text/plain) 2007-04-04 16:46 UTC, Rich Burridge
OpenOffice Writer sample. (20.92 KB, application/vnd.oasis.opendocument.text) 2007-04-04 16:54 UTC, Rich Burridge
Simple sample test web page. (2.51 KB, text/html) 2007-04-04 17:15 UTC, Rich Burridge
Evolution mail folder with simple sample message in it. (3.83 KB, text/plain) 2007-04-04 17:23 UTC, Rich Burridge
Patch to implement this feature for gedit and OOo Writer. (1.79 KB, patch) 2007-04-04 17:28 UTC, Rich Burridge	none
Patch to implement this feature for gedit, OOo Writer and now for Evolution. (7.03 KB, patch) 2007-04-05 17:40 UTC, Rich Burridge	none
Revised patch to work better with gedit. (11.39 KB, patch) 2007-04-06 00:51 UTC, Rich Burridge	none
Revised patch to fix an Evolution problem. (11.98 KB, patch) 2007-04-06 21:06 UTC, Rich Burridge	committed
proposal to handle some breakage that occurred as a result of using sentence boundaries (943 bytes, patch) 2007-04-07 17:37 UTC, Joanmarie Diggs (IRC: joanie)	committed
Patch to do the following. (10.72 KB, patch) 2007-04-16 18:22 UTC, Rich Burridge	none
Orca debug output from doing a say all by sentence with the sample.txt (4.08 KB, text/plain) 2007-04-17 14:53 UTC, Rich Burridge
Debug output from Orca when doing a say all by sentence in Evolution on the sample test message. (4.78 KB, text/plain) 2007-04-17 14:55 UTC, Rich Burridge
New version of the say all patch. (11.98 KB, patch) 2007-04-17 14:59 UTC, Rich Burridge	none
Revised version of patch. (11.51 KB, patch) 2007-04-17 15:34 UTC, Rich Burridge	committed
Test file from Halim, where he says it doesn't stop in the right place. (73.14 KB, text/plain) 2007-04-27 13:29 UTC, Rich Burridge
Fix for the Evolution problem reported in comment #48. (1.37 KB, patch) 2007-05-10 16:54 UTC, Rich Burridge	committed
Hopefully a fix for latest Evolution sayAll traceback. (3.51 KB, patch) 2007-05-11 15:04 UTC, Rich Burridge	committed
Patch to fix the problem reported in comment #54 (1.14 KB, patch) 2007-05-11 17:55 UTC, Rich Burridge	committed
Various Orca print debug statement to help track down the interrupt problem. (3.89 KB, text/plain) 2007-05-14 15:55 UTC, Rich Burridge
Orca run time output when debugging the interrupt problem. (6.77 KB, text/plain) 2007-05-14 15:57 UTC, Rich Burridge

Description Rich Burridge 2007-01-25 18:30:14 UTC

As currently described in section 3.20 Document Reading in the Orca
spec.

Comment 1 Rich Burridge 2007-04-04 04:21:43 UTC

I've just looked in my copy of "Spoken Language Processing" by 
Huang, Acero and Hon.

There is a simple sentence breaking algorithm in section 14.3.4:

1. If found punctuation ./!/? advance one character and goto 2.
   else advance one character and goto 1.
2. If not found whitespace advance one character and goto 1.
3. If the character is period (.) goto 4.
   else goto 5.
4. Perform abbreviation analysis.
   If not an abbreviation goto 5.
   else advance one character and goto 1.
5. Declare a sentence boundary and sentence type ./!/?
   Advance one character and goto 1.

So there's something to start from.
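
For readers following along, the five steps above can be sketched in Python roughly as follows. This is only an illustration of the quoted algorithm, not code from Orca or from the book: the function name `find_sentence_boundaries` and the `is_abbreviation` callback are made up here, and treating end-of-text like whitespace is an added assumption.

```python
SENTENCE_PUNCTUATION = ".!?"


def find_sentence_boundaries(text, is_abbreviation=lambda word: False):
    """Return (end_index, punctuation) pairs, one per detected sentence.

    end_index is the index just past the terminating punctuation mark.
    is_abbreviation is a hypothetical hook standing in for step 4.
    """
    boundaries = []
    i = 0
    while i < len(text):
        ch = text[i]
        # Step 1: keep advancing until sentence punctuation is found.
        if ch not in SENTENCE_PUNCTUATION:
            i += 1
            continue
        # Step 2: the punctuation must be followed by whitespace.
        # (Assumption: end of text counts as whitespace.)
        following = text[i + 1] if i + 1 < len(text) else " "
        if not following.isspace():
            i += 1
            continue
        # Steps 3 and 4: a period may belong to an abbreviation.
        if ch == ".":
            start = i
            while start > 0 and not text[start - 1].isspace():
                start -= 1
            if is_abbreviation(text[start:i + 1]):
                i += 1
                continue
        # Step 5: declare a boundary and record the sentence type.
        boundaries.append((i + 1, ch))
        i += 1
    return boundaries


titles = {"Dr.", "Mr.", "Ms."}
print(find_sentence_boundaries("Dr. Smith arrived. Is he late? No!",
                               lambda word: word in titles))
```

Note that a closing quote directly after the punctuation (as in `die!"`) is not whitespace, so this literal reading of step 2 would not declare a boundary there.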

Comment 2 Rich Burridge 2007-04-04 04:30:14 UTC

Abbreviation analysis should try to determine the following:

Title -- Dr., MD, Mr., Ms., St. (Saint), ... etc.
Measure -- ft., in., mm, cm (centimeter), kg (kilogram), ... etc
Place names -- CO, LA, CA, DC, USA, St. (street), Dr. (drive), ... etc

Oh boy, and that's just in English (um, American). Localizing that's going
to be fun. I guess we are going to need another dictionary here.
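
One possible shape for such a dictionary, keyed by locale, is sketched below. Everything here is hypothetical: the table name, the `en_US` key and the category labels are invented for illustration, and the entries are just the dotted examples from this comment.

```python
# Hypothetical per-locale abbreviation tables. The entries are the dotted
# examples from the comment above and are nowhere near complete.
ABBREVIATIONS = {
    "en_US": {
        "title": {"Dr.", "Mr.", "Ms.", "St."},
        "measure": {"ft.", "in."},
        "place": {"St.", "Dr."},
    },
}


def abbreviation_kinds(word, locale="en_US"):
    """Return the sorted categories in which `word` is a known abbreviation."""
    table = ABBREVIATIONS.get(locale, {})
    return sorted(kind for kind, words in table.items() if word in words)


def is_abbreviation(word, locale="en_US"):
    """True if `word` appears in any category for the given locale."""
    return bool(abbreviation_kinds(word, locale))


print(abbreviation_kinds("Dr."))
print(is_abbreviation("monkey."))
```

A plain lookup like this cannot tell "Dr." the title from "Dr." the street suffix, and "in." collides with the ordinary word "in" at the end of a sentence, so a real implementation would need context as well as a word list.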

Comment 3 Rich Burridge 2007-04-04 16:46:14 UTC

Created attachment 85810 [details]
The sample text file I've been using to test this.

Comment 4 Rich Burridge 2007-04-04 16:54:42 UTC

Created attachment 85811 [details]
OpenOffice Writer sample.

Comment 5 Rich Burridge 2007-04-04 17:15:59 UTC

Created attachment 85812 [details]
Simple sample test web page.

Comment 6 Rich Burridge 2007-04-04 17:23:18 UTC

Created attachment 85815 [details]
Evolution mail folder with simple sample message in it.

Comment 7 Rich Burridge 2007-04-04 17:28:30 UTC

Created attachment 85816 [details] [review]
Patch to implement this feature for gedit and OOo Writer.

Patch not committed yet.

textLines() now uses TEXT_BOUNDARY_SENTENCE_END instead of
TEXT_BOUNDARY_LINE_START to break apart chunks to read (thanks Will!)
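
The patch itself is attached rather than quoted, so the following is only a rough sketch of the idea: walk a text object by repeatedly asking for the chunk at the current offset. `FakeText` is a made-up stand-in, not the AT-SPI API, and its notion of a sentence (everything up to the next `.`, `!` or `?`) is a simplifying assumption.

```python
import re

TEXT_BOUNDARY_SENTENCE_END = "sentence-end"  # stand-in for the AT-SPI constant
_TERMINATOR = re.compile(r"[.!?]")


class FakeText:
    """Tiny stand-in for an accessible text object (not the real API)."""

    def __init__(self, content):
        self.content = content
        self.characterCount = len(content)

    def getTextAtOffset(self, offset, boundary):
        # Return the chunk from `offset` up to and including the next
        # terminator, or to the end of the text if there is none.
        match = _TERMINATOR.search(self.content, offset)
        end = match.end() if match else self.characterCount
        return [self.content[offset:end], offset, end]


def text_chunks(text, offset=0):
    """Yield [string, startOffset, endOffset] chunks until the text runs out."""
    while offset < text.characterCount:
        string, start, end = text.getTextAtOffset(
            offset, TEXT_BOUNDARY_SENTENCE_END)
        if end <= offset:
            break  # guard against implementations that do not advance
        yield [string, start, end]
        offset = end


for chunk in text_chunks(FakeText("I'm a monkey. I'm a third monkey!")):
    print(chunk)
```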

Fix appears to work nicely for gedit and OOo Writer although I haven't
tried it with a load of abbreviations yet.

It's not working for Evolution yet. We have defined our own
method in the Evolution.py script because Evolution does not
implement the FLOWS_TO relationship and all the text is in an
HTML panel which contains multiple panels, each containing a
single text object. This needs to be fixed up to handle speaking
by sentence. That's what I'm about to look at.

It's not working with today's Firefox 3.0.0(a4) build, using the
sample.html attachment to the bug. I think I need a little help from
Joanie or Will on this.

Comment 8 Joanmarie Diggs (IRC: joanie) 2007-04-04 20:26:55 UTC

Cool Rich!!!!!!!!!! (sorry, couldn't help myself after reading the sample ;-))

> It's not working with today's Firefox 3.0.0(a4) build, using the
> sample.html attachment to the bug. I think I need a little help 
> from Joanie or Will on this.

Actually, this is jogging my memory:  At CSUN weren't we talking about SayAll being totally broken in Firefox?  I don't see that in planning.ods.  That should be in our top 10, I assume.

Comment 9 Willie Walker 2007-04-04 21:27:37 UTC

> Actually, this is jogging my memory:  At CSUN weren't we talking about SayAll
> being totally broken in Firefox?  I don't see that in planning.ods.  That
> should be in our top 10, I assume.

SayAll is in our top 18 (at least in the document I've been editing today, but haven't had a chance to work on much since the phone has been in my ear all day and I've been unfortunately working on other things).  But, given the conversation I've had with Rich today, it seems as though this bug should be broken into several bugs.  One each for gedit, OOo, FF, Evolution, etc.

Comment 10 Rich Burridge 2007-04-05 17:40:58 UTC

Created attachment 85863 [details] [review]
Patch to implement this feature for gedit, OOo Writer and now for Evolution.

Patch not committed yet. Please let me know of any problems.

Comment 11 Willie Walker 2007-04-05 20:44:04 UTC

(In reply to comment #10)
> Created an attachment (id=85863) [edit]
> Patch to implement this feature for gedit, OOo Writer and now for Evolution.
> 
> Patch not committed yet. Please let me know of any problems.

I'm not sure the getTextAtOffset with sentence boundary start is going to do the trick for us, but we need Mike's opinion on this.  Enter the following in gedit:

I'm a monkey. 

I'm a
monkey 

What we have is a sentence followed by a sentence with a forced newline in it.  Based upon what I'm hearing when I do a Say All, I think the getTextAtOffset implementation is assuming the newline between "I'm a" and "monkey" is a sentence boundary.  

Mike - should this be a sentence boundary, or do we need to create a new heuristic for sentence boundaries that actually works?

Comment 12 Willie Walker 2007-04-05 20:57:02 UTC

BTW, the main motivation for this bug is to get the speech engine chunks of text where those chunks are broken on sentence boundaries.  The idea is to let the speech synthesis engine formulate a more natural F0 contour and inject pauses at the appropriate spots.

Since most synthesis engines have their own sentence logic, we could potentially just give the engine larger chunks of text as long as we're sure we don't break the sentence boundaries.  For example, we might toss the engine all the text from an object instead of a substring.  Or, we could try to detect paragraph boundaries, etc.

I'm not sure if that makes the problem any easier, but knowing the real goal sometimes can help.

BTW, I think I also remember Malte mentioning something about the accessibility hierarchy of OOo only giving us information about the visible text in the window.  Thus, Say All may potentially end when it reaches a window boundary. I don't know the full details of what OOo is doing, though, so it might take some experimentation.

Comment 13 Rich Burridge 2007-04-06 00:51:38 UTC

Created attachment 85875 [details] [review]
Revised patch to work better with gedit.

It seems to handle Will's new test case for gedit now. Plus the 
sample.txt from the bug.

New version of the patch attached. 

I tested it with a sample file of:

------ START ------
I'm a monkey.

I'm a
monkey

I'm another monkey

I'm another
monkey.

I'm a third monkey!
------ END ------

Comment 14 Mike Pedersen 2007-04-06 16:26:38 UTC

I really don't think we need to worry about sentences where someone has forced a new line in the middle of a sentence.  Unless I'm missing some use case where someone would choose to do this, I say don't worry about it.  The cases you are currently handling will make for a much improved user experience.

Comment 15 Rich Burridge 2007-04-06 21:06:22 UTC

Created attachment 85928 [details] [review]
Revised patch to fix an Evolution problem.

Mike tested out the previous patch and found a problem with an
Evolution message. Hopefully that's fixed now. I also found a
problem where the final part of a gedit file might not have 
been spoken if it didn't end with a sentence break. Hopefully
that problem is also fixed.
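
The "final part not spoken" case can be illustrated with a small sketch. It assumes, purely for illustration, a text object whose sentence query returns an empty string once no terminator is left; `FakeText` and `say_all_chunks` are invented names, and the committed patch may well handle this differently.

```python
import re

_TERMINATOR = re.compile(r"[.!?]")


class FakeText:
    """Stand-in text object; assumed to return "" when no sentence end remains."""

    def __init__(self, content):
        self.content = content
        self.characterCount = len(content)

    def getTextAtOffset(self, offset, boundary):
        match = _TERMINATOR.search(self.content, offset)
        if match is None:
            return ["", offset, offset]
        return [self.content[offset:match.end()], offset, match.end()]

    def getText(self, start, end):
        return self.content[start:end]


def say_all_chunks(text):
    """Collect sentence chunks, then flush any unterminated tail."""
    chunks = []
    offset = 0
    while offset < text.characterCount:
        string, start, end = text.getTextAtOffset(offset, "sentence-end")
        if not string:
            break
        chunks.append(string)
        offset = end
    # Without this step the trailing text would never be spoken.
    if offset < text.characterCount:
        chunks.append(text.getText(offset, text.characterCount))
    return chunks


print(say_all_chunks(FakeText("I'm a monkey. I'm another monkey")))
```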

This new patch is not committed yet. 

Please let me know of any remaining problems. Thanks.

Comment 16 Mike Pedersen 2007-04-06 21:21:43 UTC

Hey Rich, if you are comfortable with this what do you think about checking it in and letting others beat on it.  I think it's good but I'd like to get some more opinions.

Comment 17 Rich Burridge 2007-04-06 22:18:39 UTC

Patch committed.

Comment 18 Joanmarie Diggs (IRC: joanie) 2007-04-07 17:37:10 UTC

Created attachment 85960 [details] [review]
proposal to handle some breakage that occurred as a result of using sentence boundaries

Rich, regarding the issue Hermann pointed out, what about the attached?

Comment 19 Joanmarie Diggs (IRC: joanie) 2007-04-07 19:37:24 UTC

Rich is on vacation, but emailed me to suggest that if the above patch fixed the gnome-terminal/w3m issue reported by Hermann and preserved the sayAll by sentence functionality Rich added for Gedit, OOo Writer, and Evolution, I should go ahead and commit it.  I tested all four scenarios and things look good so the patch has been committed.  Thanks Rich!

The patch just falls back on TEXT_BOUNDARY_LINE_START when TEXT_BOUNDARY_SENTENCE_END fails.  That of course means that in the apps where it fails, sayAll will be by line unless we add special handling.  But by line is better than silence.  :-)  

I'm working on the Firefox sayAll.

Comment 20 Rich Burridge 2007-04-08 19:15:25 UTC

Couple things noted here so I don't forget them.

1/ W.r.t. the last patch, for efficiency, we probably want to adjust it
   to call something like:

    mode = None
    [string, startOffset, endOffset] = text.getTextAtOffset(offset,
                             atspi.Accessibility.TEXT_BOUNDARY_SENTENCE_END)

    if string:
        mode = atspi.Accessibility.TEXT_BOUNDARY_SENTENCE_END
    else:
        mode = atspi.Accessibility.TEXT_BOUNDARY_LINE_START

    just once at the start of the textLines() routine in default.py, and then
    use mode in the text.getTextAtOffset() in the main loop of the routine.

2/ If gnome-terminal can't handle TEXT_BOUNDARY_SENTENCE_END calls, then we
   can probably move the code in gedit.py back into default.py and generalize
   it to handle gedit, OOo, gnome-terminal and others as we come across them.

If nobody beats me to it, I'll have a look at this on 16th April.
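
A self-contained sketch of point 1/ follows: probe once for sentence support, then reuse the chosen mode in the main loop. `LineOnlyText` is a made-up stand-in for a widget (such as a terminal) that only answers line queries, and the two constants are placeholder strings rather than the real `atspi.Accessibility` values.

```python
TEXT_BOUNDARY_SENTENCE_END = "sentence-end"  # placeholder for the AT-SPI value
TEXT_BOUNDARY_LINE_START = "line-start"      # placeholder for the AT-SPI value


class LineOnlyText:
    """Stand-in for a widget that only answers line-boundary queries."""

    def __init__(self, content):
        self.content = content
        self.characterCount = len(content)

    def getTextAtOffset(self, offset, boundary):
        if boundary != TEXT_BOUNDARY_LINE_START:
            return ["", 0, 0]
        newline = self.content.find("\n", offset)
        end = self.characterCount if newline < 0 else newline + 1
        return [self.content[offset:end], offset, end]


def pick_boundary(text, offset=0):
    """Probe once: use sentences if the widget returns any text for them."""
    string, _start, _end = text.getTextAtOffset(
        offset, TEXT_BOUNDARY_SENTENCE_END)
    return TEXT_BOUNDARY_SENTENCE_END if string else TEXT_BOUNDARY_LINE_START


def text_lines(text, offset=0):
    """Yield chunks using the mode chosen up front, not per chunk."""
    mode = pick_boundary(text, offset)
    while offset < text.characterCount:
        string, start, end = text.getTextAtOffset(offset, mode)
        if end <= offset:
            break
        yield string
        offset = end


print(list(text_lines(LineOnlyText("./configure\nmake\n"))))
```

Probing once trades a little robustness (a widget could fail for sentences only at some offsets) for one fewer call per chunk.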

Comment 21 Rich Burridge 2007-04-12 22:35:19 UTC

Even though I'm on vacation this week, I still occasionally think
about what remains.

I have a question for Mike (and possibly Joanie). If an application
doesn't support reading by sentence (like gnome-terminal), should we
still try to force sentence handling heuristics on it, or just let it
return line by line like it currently does?

My worry here is that trying to read something like the output from
configure in a gnome-terminal window "by sentence", is going to produce
some interesting bogus results.

My second thought is that "if it ain't broke, don't fix it". In other
words, what's currently checked in works (modulo a decision to speak
applications like gnome-terminal "by sentence"). Therefore I don't believe
we really need the changes I suggested in comment #20, and time should
be spent on other things. 

Thoughts?

Comment 22 Joanmarie Diggs (IRC: joanie) 2007-04-12 23:22:41 UTC

> My worry here is that trying to read something like the output from
> configure in a gnome-terminal window "by sentence", is going to produce
> some interesting bogus results.

True.... But.... Would those results be any more interestingly bogus than, say, reading an Orca script in Gedit with sayAll? 

There are some things that lend themselves to being read by sentence and some things that lend themselves to being read by line.  You encounter both in gnome-terminal, and in gedit, and on the web, and.... Therefore, I'll leave your "should we" question to Mike and toss out another one:  Regardless of what is decided w.r.t. forcing sentence handling heuristics,  should we make a sayAllBy setting so that the user can decide whether to read the current text by sentence or by line based on personal preference and the task at hand?

Now go do something vacationy. ;-) ;-)

Comment 23 Rich Burridge 2007-04-12 23:36:42 UTC

Then this begs the question why don't we have an Orca preference checkbox
that says something like:

  [X] Say all by sentence

with the default option being checked? If this is a good idea, then I'll
leave you and Mike to suggest a better wording.

Comment 24 Willie Walker 2007-04-12 23:40:50 UTC

(In reply to comment #23)
> Then this begs the question why don't we have an Orca preference checkbox
> that says something like:
> 
>   [X] Say all by sentence
> 
> with the default option being checked? If this is a good idea, then I'll
> leave you and Mike to suggest a better wording.
> 

Neat idea.  To extend it,

SayAll by: [combo box]

Where combobox choices include: line, sentence, paragraph, ?

Comment 25 Rich Burridge 2007-04-12 23:47:34 UTC

Sounds good to me too. Mike/Joanie: need to know the exact
wording, which pane (and where on the pane) it should go and
the combo box choices we want. Maybe initially just line and
sentence as we know how to do those. Unless paragraph is simple
too...

I'll then implement when I return next week.

Comment 26 Joanmarie Diggs (IRC: joanie) 2007-04-13 00:20:24 UTC

> Sounds good to me too. Mike/Joanie: need to know the exact
> wording, which pane (and where on the pane) it should go and
> the combo box choices we want. Maybe initially just line and
> sentence as we know how to do those. Unless paragraph is simple
> too...

*If* the only thing we're ever going to make configurable about SayAll is the unit by which to speak, then I think that Will's suggestion of 

    SayAll by: [combo box]

makes sense.  And I think it would live very happily on the Speech pane towards (if not at) the end.

But I was thinking.... Are there other things we might want to configure about sayAll?  For instance, some people I know work at a punctuation level of Most or All with the speaking of blank lines enabled because they want to know exactly how a document/email/whathaveyou is formatted.  But when doing a sayAll to sit back and read a document purely for its content, those settings are not ideal.   And maybe some people want to have the announcement when they enter or leave a table in a Writer document during a sayAll, but others do not.  I bet if I gave it some more thought, I could come up with other examples of settings that a user would always want to apply in a sayAll that they would not want to apply when not in a sayAll.

If we decide that we do want to add such options specifically to SayAll, then perhaps instead we want a special SayAll pane in the Preferences dialog?

On another note, I just noticed that sayAll in Writer is chopping off the last character as it reads.  I hadn't noticed that before because my documents normally have punctuation terminating sentences. :-)

Comment 27 Rich Burridge 2007-04-13 01:15:22 UTC

> If we decide that we do want to add such options specifically to SayAll, then
> perhaps instead we want a special SayAll pane in the Preferences dialog?

We might want to do this in stages. Just implement the combo box
for now, and file rfe(s) on the other things.

> On another note, I just noticed that sayAll in Writer is chopping off the last
> character as it reads.  I hadn't noticed that before because my documents
> normally have punctuation terminating sentences. :-)

Yup. If you look at the end of the textLines() routine in the gedit.py
script, you'll see that I had to add in a chunk'o'code for that scenario.
It needs to be added to the default.py textLines() code, or better yet,
I think I should cave in and push the gedit code back to default.py, add back
in the hacks that Will had to put in for bogus behaviour, and just fixup the
one routine.

More next week when we've decided on the sayAll combo box placement/wording.

Mike, time for you to chime in... ;-)

Comment 28 Mike Pedersen 2007-04-13 16:56:05 UTC

I think the "sayall by" option is a good idea, as the user will be able to decide per application how they want sayall to work.  Because we already have the ability to set punctuation and blank line reading per application I don't really want to add a complex sayall tab to the config UI at this point.  Let's just put a "sayall by" combo near the end of the speech tab, which will contain line and sentence at this point.

Comment 29 Rich Burridge 2007-04-13 17:27:40 UTC

Thanks. I'll work on this on Monday.

Comment 30 Rich Burridge 2007-04-16 18:22:36 UTC

Created attachment 86446 [details] [review]
Patch to do the following.

Adds a "Say All By" combo box to the speech pane of the Orca
Preferences dialog. Current valid choices are "Line" and
"Sentence". Adjusts the existing textLines() routines to use it.

I have *NOT* committed this patch yet. I'd like a bit of feedback from
the Orca team on the patch before I inflict it on our users.
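
The patch is attached rather than quoted, so as a purely illustrative sketch, the combo box choice could be mapped to a text boundary along these lines. The enum, the mapping and `boundary_for` are hypothetical names, and defaulting to sentence mode follows the suggestion in comment #23 rather than anything confirmed about the patch.

```python
from enum import Enum


class SayAllStyle(Enum):
    """Hypothetical values matching the two combo box labels."""
    LINE = "Line"
    SENTENCE = "Sentence"


# Boundary names are written as strings here; a real script would use the
# corresponding AT-SPI constants instead.
BOUNDARY_FOR_STYLE = {
    SayAllStyle.LINE: "TEXT_BOUNDARY_LINE_START",
    SayAllStyle.SENTENCE: "TEXT_BOUNDARY_SENTENCE_END",
}


def boundary_for(label, default=SayAllStyle.SENTENCE):
    """Map a combo box label to a boundary, falling back to the default."""
    try:
        style = SayAllStyle(label)
    except ValueError:
        style = default  # unknown label, e.g. a choice not implemented yet
    return BOUNDARY_FOR_STYLE[style]


print(boundary_for("Line"))
print(boundary_for("Paragraph"))
```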

I also looked into the OOo Writer not speaking the last character when
doing a say all by sentence. It's YAOOOB. I tried it on a simple
OOo Writer document containing:

The quick
brown
fox jumps over
the lazy dog

The first call to:

[string, startOffset, endOffset] = text.getTextAtOffset(
                offset, atspi.Accessibility.TEXT_BOUNDARY_SENTENCE_END)

returns a string of

"The quic"

I'll file another OOo bug and open a new Orca bug to track it (and
update the OOo meta-bug).

After lunch.

Comment 31 Rich Burridge 2007-04-16 18:44:40 UTC

The OOo problem is OOo issue #76420.
http://www.openoffice.org/issues/show_bug.cgi?id=76420

This is being tracked with bug #430402
http://bugzilla.gnome.org/show_bug.cgi?id=430402

Comment 32 Willie Walker 2007-04-16 21:15:33 UTC

> [string, startOffset, endOffset] = text.getTextAtOffset(
>                 offset, atspi.Accessibility.TEXT_BOUNDARY_SENTENCE_END)
> 
> returns a string of
> 
> "The quic"

The implementation of getTextAtOffset for BOUNDARY_SENTENCE values is one of the more dicey things I've come across in the various AT-SPI implementations we work with.  I'm not sure anyone does it consistently, and perhaps only GTK+ gets it right.  As a result, there may need to be some other algorithm implemented for OOo (and Firefox) to get sentences.  

For OOo, maybe you could just pass entire paragraphs as the text to be spoken?

Comment 33 Rich Burridge 2007-04-16 21:45:58 UTC

Consider the following OOo Writer input:

“Hello. My name is Inigo Montoya. You killed my father. Prepare to die!" 
Hello
My name is Inigo Montoya
You killed my father
Prepare to die

Each of those lines is an object with role "PARAGRAPH". 
If Orca is speaking by sentence, it does a great job of handling 
the first line/paragraph, nicely breaking it into sentences. For 
the other four lines, it loses the last character because of the
problem I outline in comment #31.

<shrug>

I guess I could try to work around this in the StarOffice.py script.
Personally I'd prefer to see the bug fixed in OOo, and we spend the
time trying to implement a new feature or fix one of our own bugs.

But it's your call. Please let me know how you'd like me to proceed.

Thanks.

Comment 34 Joanmarie Diggs (IRC: joanie) 2007-04-16 22:25:51 UTC

Hey Rich.

In Evolution and Gedit, SayAll seems to be by sentence regardless of what setting I choose; in OOo Writer, it's working as expected.  

Minor request:  Could we have a mnemonic for SayAll By?  'y' perhaps?

Comment 35 Willie Walker 2007-04-16 22:33:24 UTC

> Each of those lines is an object with role "PARAGRAPH". 
> If Orca is speaking by sentence, it does a great job of handling 
> the first line/paragraph, nicely breaking it into sentences. For 
> the other four lines, it looses the last character because of the
> problem I outline in comment #31.

I definitely agree the problem should be fixed at the source (OOo).

> I guess I could try to work around this in the StarOffice.py script.
> Personally I'd prefer to see the bug fixed in OOo, and we spend the
> time trying to implement a new feature or fix one of our own bugs.

Let's see how the OOo folks respond.  If it is unlikely that they will fix it any time soon, then we should shoot for a workaround in Orca.

Comment 36 Rich Burridge 2007-04-16 23:24:27 UTC

> In Evolution and Gedit, SayAll seems to be by sentence regardless of what
> setting I choose; in OOo Writer, it's working as expected.

Hmm. Did you try it against the 4th and 1st attachments? Do those
fail for you too? Either way, I'll look at it again tomorrow.

> Minor request:  Could we have a mnemonic for SayAll By?  'y' perhaps?

That should be easy to do (assuming y isn't already in use).

Thanks for trying the patch.

Comment 37 Joanmarie Diggs (IRC: joanie) 2007-04-16 23:53:10 UTC

> Hmm. Did you try it against the 4th and 1st attachments? Do those
> fail for you too? Either way, I'll look at it again tomorrow.

For the Gedit and OOo tests I used your attachments.  (I'm going to have to read _The_Princess_Bride_ after hearing that quote spoken again and again. :-))

For Evolution I just read an existing message.  A bugzilla bug comment notification as I recall.

Comment 38 Rich Burridge 2007-04-17 14:53:39 UTC

Created attachment 86501 [details]
Orca debug output from doing a say all by sentence with the sample.txt

I've added some debug statements to the gedit.py script.
To me, this shows Orca speaking the sample.txt file by 
sentence.

Comment 39 Rich Burridge 2007-04-17 14:55:10 UTC

Created attachment 86503 [details]
Debug output from Orca when doing a say all by sentence in Evolution on the sample test message.

I added some debug statements to the Orca Evolution.py script.
To me, this shows that Orca is doing a say all by sentence in
Evolution when reading the sample test message attached to this bug.

Comment 40 Rich Burridge 2007-04-17 14:59:38 UTC

Created attachment 86504 [details] [review]
New version of the say all patch.

This patch is not committed yet. I'd like to get to the bottom
of why it works for me in sentence mode with gedit and Evolution
but not for Joanie.

To that end, I've added debug statements to the gedit.py and
Evolution.py scripts. Joanie, could you please try them with the two
samples that are attached to this bug and tell me if they are still
failing for you?

If you have other samples that fail, please attach them to this bug.

Thanks.

Comment 41 Joanmarie Diggs (IRC: joanie) 2007-04-17 15:05:01 UTC

You bet!  But perhaps I wasn't clear in comment 34.  SayAll by sentence works like a charm.  The issue I'm having is that when I choose SayAll by line in Gedit and Evolution, Orca doesn't pause at line boundaries like it used to.

Comment 42 Rich Burridge 2007-04-17 15:15:11 UTC

You're right. 

You weren't clear. ;-)

Okay, I'll look at that now. 

Thanks for clarifying.

Comment 43 Rich Burridge 2007-04-17 15:34:45 UTC

Created attachment 86508 [details] [review]
Revised version of patch.

This now fixes the problem of say all by line not pausing between
each line. I've also added in the Alt-y accelerator for the "Say
All By" combo box.

I've committed this patch. It'll help Joanie adjust the sayAll in
Firefox/Gecko to use the new setting. It'll make it easier for Mike
to try out.

We must be getting close now. Please let me know of any other problems you
find, or tweaks you'd like made.

Comment 44 Rich Burridge 2007-04-27 13:29:04 UTC

Created attachment 87142 [details]
Test file from Halim, where he says it doesn't stop in the right place.

Note that this might be because of problems in the eSpeak speech drivers
(both with and without speech dispatcher).

See:

http://mail.gnome.org/archives/orca-list/2007-April/msg00219.html
http://mail.gnome.org/archives/orca-list/2007-April/msg00223.html

Comment 45 Rich Burridge 2007-04-27 15:04:50 UTC

Halim tells me he's using:

"viavoice with gnomespeech (viavoice-synthesis-driver)"

Comment 46 Mike Pedersen 2007-05-03 15:42:28 UTC

This seems to be working well in the areas I've tested: Gedit, Evolution, Firefox.  The combo for changing to sayall by line also seems to work well.

Comment 47 Rich Burridge 2007-05-03 18:14:12 UTC

I don't have ViaVoice but I tested Halim's test document in gedit 
with both sayAll by LINE and sayAll by SENTENCE using the swift driver
and had no problems with the text caret when I interrupted the sayAll
operation.

I too think this bug is ready for closing.

Comment 48 Joanmarie Diggs (IRC: joanie) 2007-05-08 21:19:36 UTC

I just tried Halim's document.  Orca's general sayAll behavior is to stop no more than a word or so away -- if not right on the word that was last spoken.  With this document, at least within the table of contents portion, it seems to be a line or two away.  I tried by Sentence using Cepstral Swift.

On another note, in Evolution, try the following:

1. Compose a new message
2. Put your name (or something that does not end with punctuation) as the last line of the message
3. Review the message with sayAll set to sentences.  It doesn't read the name.  (It will if you put a period at the end of the name and/or do sayAll by line)

Comment 49 Rich Burridge 2007-05-10 14:00:56 UTC

I just tried to reproduce the Evolution problem in comment #48.
I was using the following in my Evolution Compose window:

----

"Hello. My name is Inigo Montoya. You killed my father. Prepare to die!"

The saying, "Less is more" rings true in the case of exclamation marks! One will suffice for almost any occasion, and forming a small army of exclamation marks to attack your reader with excruciating force is entirely unnecessary. Another appropriate analogy would be the boy who cried exclamation mark.

Something at the end
which doesn't end in a sentence
break

----

Orca spoke it all.

Joanie, can you attach the Evolution text that's failing for you please?

As you are able to reproduce this, could you also try adding in the 
following line at the very end of the textLines() routine in Evolution.py 
(at about line 465):

        print "Evolution: textLines: at end: length of string: ", len(string)

I'm curious to see if it's a non-zero value.

Thanks.

Comment 50 Joanmarie Diggs (IRC: joanie) 2007-05-10 14:21:13 UTC

I took that text, pasted it into Evolution, deleted the line breaks in the paragraph beginning with "The saying" and gave it a shot.

1. It stopped speaking after "exclamation mark."
2. It didn't update the caret -- with my previous example* it stopped just before the last line.
3. The answer is 56.

This is with the message composition window in plain text and with sayAll done by sentence.

In terms of a sample:
---------------
This is a test.

Thanks.
jd
---------------
It stops at the period after "Thanks." and the answer is "2"

Comment 51 Rich Burridge 2007-05-10 16:54:55 UTC

Created attachment 87973 [details] [review]
Fix for the Evolution problem reported in comment #48.

This patch is committed. In testing this, Joanie found another
problem:

"The other thing I noticed just now with this message is that sayAll from
the very top doesn't work.  When I start it from the blank line at the
top, Orca is silent.  If I move to the quoted line, it's fine.  If I
delete the blank line so that the quoted line is the first line, I
cannot start sayAll successfully from the quote.  This seems to be the
culprit:

vvvvv PROCESS KEY PRESS EVENT + vvvvv
evolution.sayAll.

Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/orca/input_event.py", line 178, in processInputEvent
    consumed = self._function(script, inputEvent)
  File "/usr/lib/python2.5/site-packages/orca/scripts/Evolution.py", line 518, in sayAll
    self.__sayAllProgressCallback)
  File "/usr/lib/python2.5/site-packages/orca/speech.py", line 124, in sayAll
    _speechserver.sayAll(utteranceIterator, progressCallback)
  File "/usr/lib/python2.5/site-packages/orca/gnomespeechfactory.py", line 843, in sayAll
    [context, acss] = utteranceIterator.next()
  File "/usr/lib/python2.5/site-packages/orca/scripts/Evolution.py", line 422, in textLines
    accTextObj = panel.accessible.getChildAtIndex(0)
AttributeError: 'NoneType' object has no attribute 'accessible'

"

I'll investigate this problem after I've looked at the other couple 
I'm currently working on.

Comment 52 Rich Burridge 2007-05-11 15:04:20 UTC

Created attachment 88018 [details] [review]
Hopefully a fix for latest Evolution sayAll traceback.

Not committed yet. 

Joanie, as I can't reproduce this traceback, could you try out the patch
please? It also fixes another problem where there needed to be spaces
inserted between the concatenation of lines in SENTENCE mode.

Thanks.
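
The spacing problem mentioned above (lines concatenated into one sentence without a separator) can be illustrated with a small helper. This is not the committed patch, just a sketch with an invented function name.

```python
def join_lines_for_sentence(lines):
    """Join line fragments into one utterance, separated by single spaces."""
    parts = [line.strip() for line in lines]
    # Skip empty fragments so blank lines do not produce double spaces.
    return " ".join(part for part in parts if part)


print(join_lines_for_sentence([
    "Something at the end",
    "which doesn't end in a sentence",
    "break",
]))
```

Plain concatenation of the same fragments would run "end" and "which" together, which is the kind of result a synthesizer would mispronounce.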

Comment 53 Joanmarie Diggs (IRC: joanie) 2007-05-11 16:18:30 UTC

Yup, that fixes it.  Thanks!

Comment 54 Joanmarie Diggs (IRC: joanie) 2007-05-11 16:23:12 UTC

Found another issue with sayAll in Evolution (sorry!).  If you start it from a blank line it doesn't start.  There are no errors.

Try composing a new message with the following structure:

<blank>
This is a test.
<blank>
So is this.

If you start the sayAll from either of the blank lines, it doesn't work.  If you start from either of the lines with text, it works as expected.

Comment 55 Rich Burridge 2007-05-11 16:58:26 UTC

Patch committed. I'll look at your latest Evolution sayAll problem next.
Thanks!

Comment 56 Rich Burridge 2007-05-11 17:55:42 UTC

Created attachment 88030
Patch to fix the problem reported in comment #54

Patch committed.

Comment 57 Rich Burridge 2007-05-14 15:55:52 UTC

Created attachment 88161
Various Orca print debug statements to help track down the interrupt problem.

Comment 58 Rich Burridge 2007-05-14 15:57:26 UTC

Created attachment 88162
Orca run time output when debugging the interrupt problem.

Commentary to follow.

Comment 59 Rich Burridge 2007-05-14 16:23:48 UTC

I've been investigating why doing a "say all" by sentence on the
file that Halim provided:
http://bugzilla.gnome.org/attachment.cgi?id=87142
and then interrupting it, causes the caret to be in the wrong place.

I've attached two files:

1/ Various print debug statements added to Orca in SVN trunk/HEAD, to
   help track down the interrupt problem.
   http://bugzilla.gnome.org/attachment.cgi?id=88161&action=edit

2/ Orca run time output from these statements, when debugging the 
   interrupt problem.
   http://bugzilla.gnome.org/attachment.cgi?id=88162

Steps to reproduce:

1/ Start Orca. Make sure that we are doing "say all" by sentence.
2/ Start gedit showing the file in attachment
   http://bugzilla.gnome.org/attachment.cgi?id=87142
3/ Position the cursor at the beginning of the text file.
4/ Press numpad-Plus key to start the "say all"
5/ Use the Control key to interrupt the "say all".

When doing "say all" by sentence the second "sentence" is:

             Introduction
          + [2]1.1 Features
     * [3]2.

I hit the Control key just after the "2" in the following line 
had been spoken:

          + [2]1.1 Features

After the interrupt, the text caret was positioned at the very end of the
same line instead of after the 2.

The text object that we are doing the "say all" on, is the whole of
the file (74891 characters).

The line in question:

          + [2]1.1 Features

starts at offset 204 and ends at offset 232

In the __speak() routine in gnomespeechfactory.py, there are calls like:

        text = self.__addVerbalizedPunctuation(text)
        if orca_state.activeScript:
            text = orca_state.activeScript.adjustForPronunciation(text)

so what starts out as:

             Introduction
          + [2]1.1 Features
     * [3]2.

ends up as the following text that is sent to the TTS:

` Introduction          plus left bracket 2 right bracket 1 dot 1 Features
     star left bracket 3 right bracket 2.`

The __idleHandler() routine in gnomespeechfactory.py is being called
to provide progress feedback as the text is being spoken. This is a
gidle routine. In other words, it's called on the gidle thread when there
is nothing else to do in any of the other Orca threads.

What we are seeing from the generated debug output is that there are
several calls to the __idleHandler() routine, the last of them being:

gsf: __idleHandler called. 
_iH: context.startOffset:  191
_iH: offset:  41
_iH: calling progress callback: offset:  232

Each of these in turn calls the __sayAllProgressCallback() routine in
the gedit.py script. Therefore, the last offset given is 232 (the end 
of the

          + [2]1.1 Features

line).

Then the gedit __sayAllProgressCallback() routine is called again,
this time because the "say all" operation has been interrupted.
The last offset was 232, so the caret is positioned at that (wrong) 
point.

The __idleHandler() routine in gnomespeechfactory.py is calling:

        (id, type, offset) = self.__eventQueue.get()

each time to get the latest "say all" text caret offset.
It looks like these events are put on the event queue in the notify() 
routine in the SpeechServer class in gnomespeechfactory.py:

            self.__eventQueue.put((id, type, offset))

This in turn is called from the notify() routine in the _Speaker class
in gnomespeechfactory.py. This routine is called by GNOME Speech when 
the GNOME Speech driver generates a callback.
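
The flow described above can be sketched roughly as follows. This is a simplified model, not the gnomespeechfactory.py code: the class name SpeechServerSketch, the Context tuple and the "progress" event type are assumptions made for illustration, and it uses the current Python queue module rather than the Python 2.5 one shown in the traceback paths.

```python
import queue
from collections import namedtuple

# Hypothetical stand-in for the "say all" context: the text being spoken
# and the offset in the text object at which that text starts.
Context = namedtuple("Context", ["utterance", "startOffset"])


class SpeechServerSketch:
    """Simplified model of the notify() / idle handler hand-off."""

    def __init__(self, context, progress_callback):
        self._event_queue = queue.Queue()
        self._context = context
        self._progress_callback = progress_callback

    def notify(self, event_id, event_type, offset):
        # Called when the speech driver generates a callback: only
        # enqueue the event, do no real work here.
        self._event_queue.put((event_id, event_type, offset))

    def idle_handler(self):
        # Called when nothing else is pending: drain the queue and report
        # each offset relative to the start of the whole text object.
        while True:
            try:
                _id, _type, offset = self._event_queue.get_nowait()
            except queue.Empty:
                break
            self._progress_callback(self._context.startOffset + offset)
```

With a context starting at offset 191 and a queued event carrying offset 41, the callback receives 191 + 41 = 232, which matches the debug output above.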

So, in short, I think we are losing synchronization between GNOME Speech
and Orca here.

Will, as you are our GNOME Speech expert, what should be done next to 
find out what's going wrong here? 

Thanks.

Comment 60 Willie Walker 2007-05-14 19:16:13 UTC

> In the __speak() routine in gnomespeechfactory.py, there are calls like:
> 
>         text = self.__addVerbalizedPunctuation(text)
>         if orca_state.activeScript:
>             text = orca_state.activeScript.adjustForPronunciation(text)
> 
> so what starts out as:
> 
>              Introduction
>           + [2]1.1 Features
>      * [3]2.
> 
> ends up as the following text that is sent to the TTS:
> 
> ` Introduction          plus left bracket 2 right bracket 1 dot 1 Features
>      star left bracket 3 right bracket 2.`

This is probably the problem.  The index in the gnome-speech progress callback is an index into the string that was passed to gnome-speech.  

When the __idleHandler of gnomespeechfactory.py looks at the offset being passed to it, it is looking at the offset into the string that was passed to gnome-speech.  It then mistakenly maps this to the original string, which we see from the above are two different strings:

original: 
  Introduction + [2]1.1 Features * [3]2.

passed to gnome-speech: 
  Introduction plus left bracket 2 right bracket...

So...it seems as though there might need to be some sort of reverse mapping that takes a character index in the string passed to gnome-speech and maps it back to the corresponding character index in the original text.  Not quite sure of a way to do this -- hope you can come up with a good idea.
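
One possible shape for such a reverse mapping, as a rough sketch only: record, for every character of the expanded string, the index of the original character that produced it. The VERBALIZED table and both function names below are hypothetical and do not reflect how __addVerbalizedPunctuation() or adjustForPronunciation() actually work.

```python
# Hypothetical punctuation table for illustration; Orca's real table and
# pronunciation adjustments are not reproduced here.
VERBALIZED = {
    "+": " plus ",
    "[": " left bracket ",
    "]": " right bracket ",
    "*": " star ",
}


def verbalize_with_map(text):
    """Return (spoken, index_map), where index_map[i] is the index in
    `text` of the character that produced spoken[i]."""
    spoken_parts = []
    index_map = []
    for i, ch in enumerate(text):
        replacement = VERBALIZED.get(ch, ch)
        spoken_parts.append(replacement)
        # Every character of the replacement points back at position i.
        index_map.extend([i] * len(replacement))
    return "".join(spoken_parts), index_map


def original_offset(index_map, spoken_offset):
    """Map an offset into the spoken string back to the original text."""
    if not index_map:
        return 0
    if spoken_offset >= len(index_map):
        # Past the end of the spoken string: the end of the original.
        return index_map[-1] + 1
    return index_map[max(spoken_offset, 0)]
```

The progress callback would then be given context.startOffset plus original_offset(index_map, offset) rather than the raw offset. Any later rewriting of the string (for example pronunciation adjustments) would need to maintain the same kind of map, which this sketch does not attempt.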

Comment 61 Rich Burridge 2007-05-14 20:12:03 UTC

Thanks. Ack! This isn't simple. I'm not coming up with anything
at the moment. I'll keep thinking about it.

Comment 62 Rich Burridge 2007-05-17 16:16:13 UTC

After talking with Will, I've opened bug #439191 on the 
"sayAll by sentence can position the text cursor in the 
wrong place when interrupted" problem reported by Halim.

I believe the rest of the sayAll function is now working
and I'm closing this bug as FIXED. If other problems occur, 
then we can open up new individual problems for them.