Bug 562060 – speech should speak multi case strings as words

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 562060 - speech should speak multi case strings as words


Summary:	speech should speak multi case strings as words


Status:	RESOLVED FIXED

Product:	orca
Classification:	Applications
Component:	speech
Version:	2.24.x
Hardware:	Other All

Importance:	Normal enhancement
Target Milestone:	2.26.0
Assigned To:	Willie Walker
QA Contact:	Orca Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2008-11-23 22:10 UTC by Mesar Hameed
Modified:	2009-03-10 14:25 UTC

See Also:
GNOME target:	---
GNOME version:	2.23/2.24

Attachments
revision 1 (5.07 KB, patch) 2008-11-23 22:14 UTC, Mesar Hameed	reviewed	Details \| Review
revision 2 (5.09 KB, patch) 2008-11-26 22:50 UTC, Mesar Hameed	reviewed	Details \| Review
revision 3 (5.29 KB, patch) 2009-01-17 14:25 UTC, Mesar Hameed	none	Details \| Review
revision 4 (6.61 KB, patch) 2009-01-17 15:25 UTC, Mesar Hameed	none	Details \| Review
revision 5 (6.81 KB, patch) 2009-01-17 17:44 UTC, Mesar Hameed	needs-work	Details \| Review
Adjusted patch -- fixes bug and also modifies the string presented in the UI (6.96 KB, patch) 2009-01-19 15:42 UTC, Willie Walker	committed	Details \| Review
Patch to fix the speakCharacter pronunciation problem (1.22 KB, patch) 2009-01-22 00:36 UTC, Willie Walker	committed	Details \| Review
Proposed patch to resolve the NVDAHelper.py case (494 bytes, patch) 2009-01-22 16:12 UTC, Rui Batista	none	Details \| Review
rev 8, includes Rui's regexp, and rewrites the first 2 regexps. (959 bytes, patch) 2009-01-27 12:06 UTC, Mesar Hameed	reviewed	Details \| Review
Patch to resolve speakKeyName pronunciation issue from comment #21 (766 bytes, patch) 2009-01-27 16:41 UTC, Willie Walker	committed	Details \| Review
Patch to address the pidgin problem (I hope) (1.30 KB, patch) 2009-01-27 21:47 UTC, Willie Walker	committed	Details \| Review

Description Mesar Hameed 2008-11-23 22:10:30 UTC

Please describe the problem:

In programming, (and possibly other places), we might see LongLineOfText or myNAMEisJoBlogs

speech should try to break up the string into words, for a better audibility experience.

patch coming up

Steps to reproduce:


Actual results:


Expected results:


Does this happen every time?


Other information:

Comment 1 Mesar Hameed 2008-11-23 22:14:51 UTC

Created attachment 123286 [details] [review]
revision 1

a patch that adds the functionality, and a checkbox to the speech tab, to speak multiCaseStringsAsWords.

Please pylint and test.

Thank you

Comment 2 Mike Pedersen 2008-11-26 13:52:27 UTC

I agree with Jon that there are times when this functionality could be really useful.

Comment 3 Joanmarie Diggs (IRC: joanie) 2008-11-26 17:11:02 UTC

I took a quick look at this. In terms of pylinting, there's a bit of "lint" in speech.py:

~~~~~~~~~~~~~~~~~~~
C0301:366: Line too long (91/80)
W0622:258:speakUtterances: Redefining built-in 'len'
C0324:367:_processMultiCaseString: Comma not followed by a space
    string = multiCaseReg1.sub('\\1 \\2',string)
                                        ^^
C0324:368:_processMultiCaseString: Comma not followed by a space
    string = multiCaseReg2.sub('\\1 \\2',string)
                                        ^^
~~~~~~~~~~~~~~~~~~~

In terms of functionality, this change seems to cause all strings which begin with a capital letter to be preceded by a space (Welcome to Orca, Terminal frame, etc.). While this won't sound different, it will totally throw off our regression tests.

Also, splitting words up in this fashion can cause the pronunciation dictionary to not be used when expected and (though admittedly less likely) to be used when unexpected.  Without your patch, if I make an entry so that "OOo" is spoken as "Open Office", I hear "Open Office" for the following:

1. OOo
2. OOO
3. ooo

With your patch, I hear "Open Office" for the latter two, but not the mixed case version because it was split up into two words before hitting the pronunciation dictionary. On the flip side, If I wrote out "oh" in a long, drawn-out fashion in order to stress slowing coming to a realization, "OOOhhhh" becomes "OpenOffice 4 h's". Yes, this latter issue is a lame stretch. :-) The former issue is worth thinking about, I think.

On a more significant note, this seems to assume the mixed-case string is only made up of characters A through Z. What about other alphabets and accented letters? For instance, 'JALAPEÑOS' becomes two words (' JALAPEÑ OS') if written in all caps.

Assigning this RFE to you, btw. :-)

Comment 4 Mesar Hameed 2008-11-26 22:50:45 UTC

Created attachment 123502 [details] [review]
revision 2

Thank you Joanie, Mike, for the feedback

1. unicode support:
according to: 
http://www.regular-expressions.info/python.html

"Python's re module does not support any 
Unicode regular expression tokens" :-(

and looking at:
http://www.python.org/doc/2.5.2/lib/re-syntax.html

Sadly there doesnt seem to be a [[:upper:]] class, which could be based on locale to determine what is currently considered as an uppercase.

It allows for unicode words, but does not give us the power to distinguish between upper and lower unicodes. (from the half an hour i googled, i might be wrong). Might continue searching for a while tomorrow.

We can add in particular (unicode characters) letters that we consider to be upper case, (as users complain) but this seems like a long path to walk. Ideally we would ask the regexp module for all unicode chars that are considered as uppercase, then build our rules from that. But python regular expression powers are limited. perl and .net regexp engens seem to be more capable in the unicode department.

I guess we could warn the user by giving a health warning with the checkbox, saying that it is known to be broken with non a-z A-Z alphabets. :( and may sometimes alter the pronunciation of words.

Here is another example: "ThisIsALongLine" which would split up into "This Is AL ong Line"
Should we consider two upper case letters followed by lower case as two words?
upper, upper lower *



2. with regard to the pronunciation dictionary problems, would it be better to do the word splitting after we apply the dictionary? (I am currently in favour of doing this, since this eliminates most of the pronunciation problems Joanie pointed out. From a quick grep only the gnome-speech backend seems to do the adjustForPronunciation, is this intentional?
Should the call to adjustForPronunciation be done in the speech.speak() or simular, because this should be independant of speech output? the idea:

def speech.speak(text):
    text = adjustForPronunciation(text)
    text = _processMultiCaseString(text)
    speakIt(text)

I havent done anything in the attached patch with regard to the above, just putting my thoughts down.


3. the attached patch hopefully is better in the pylinting department, and fixes the regressions, Thank you Joanie for pointing this out.

Comment 5 Willie Walker 2009-01-15 21:20:51 UTC

(In reply to comment #4)
> "Python's re module does not support any 
> Unicode regular expression tokens" :-(
> 
> and looking at:
> http://www.python.org/doc/2.5.2/lib/re-syntax.html
> 
> Sadly there doesnt seem to be a [[:upper:]] class, which could be based on
> locale to determine what is currently considered as an uppercase.

:-(

I'm OK with walking the long path for now and getting things working for a-z and A-Z.  Otherwise, we might suffer from paralysis by analysis.

> Here is another example: "ThisIsALongLine" which would split up into "This Is
> AL ong Line"
> Should we consider two upper case letters followed by lower case as two words?
> upper, upper lower *

This sounds fair, though I wonder what to do when then dealing with words that begin with two capital letters.  SOmething like IMade a MIstake when TYping.

I also looked at the parsing of "ThisIsAnMultiCaseString" (I put the "An" in there just as a means to avoid the two capital letters in a row problem).  It doesn't seem to split things completely and I ended up with this: This IsAn MultiCase String.

I also wonder if we need to take care not to split words that weren't meant to be split.  For example, last names such as McLovin.

> 2. with regard to the pronunciation dictionary problems, would it be better to
> do the word splitting after we apply the dictionary? (I am currently in favour
> of doing this, since this eliminates most of the pronunciation problems Joanie
> pointed out. From a quick grep only the gnome-speech backend seems to do the
> adjustForPronunciation, is this intentional?

I wonder if we might need to do it pre and post split?

> Should the call to adjustForPronunciation be done in the speech.speak() or
> simular, because this should be independant of speech output?

Well...the SpeechDispatcher folks have their own ideas of how things should be done.  So, we steered clear of stepping in their space and put things in the gnomespeechfactory.

In any case, this patch seems really close.

Comment 6 Mesar Hameed 2009-01-17 14:19:27 UTC

Thank you will, nesting regexps didnt work in python as i expected.
---

example to show that the regexps in the patch to be attached works as advertized:

def mysub(a):
    multiCaseReg1 = re.compile("([a-z]+)([A-Z][a-z]+)")
    multiCaseReg2 = re.compile("([a-z]+)([A-Z]+)")
    multiCaseReg3 = re.compile("([A-Z]{2}[A-Z]+)([a-z]+)")
    multiCaseReg4 = re.compile("([A-Z])([A-Z][a-z]+)")
    a1 = multiCaseReg1.sub('\\1 \\2',a)
    a2 = multiCaseReg2.sub('\\1 \\2',a1)    
    a3 = multiCaseReg3.sub('\\1 \\2',a2)    
    a4 = multiCaseReg4.sub('\\1 \\2',a3)
    print("a=%s\na1=%s\na2=%s\na3=%s\na4=%s\n" % (a,a1,a2,a3,a4))



In [10]: a = "TESTout ThisIsALongMultiCaseString"

In [11]: mysub(a)
a=TESTout ThisIsALongMultiCaseString
a1=TESTout This IsALong MultiCase String
a2=TESTout This Is ALong Multi Case String
a3=TEST out This Is ALong Multi Case String
a4=TEST out This Is A Long Multi Case String

Comment 7 Mesar Hameed 2009-01-17 14:25:03 UTC

Created attachment 126644 [details] [review]
revision 3

promised patch, pylinted to 10 :) thanks Will and opensolaris

Comment 8 Mesar Hameed 2009-01-17 15:25:49 UTC

Created attachment 126647 [details] [review]
revision 4

(In reply to comment #5)
> > Here is another example: "ThisIsALongLine" which would split up into "This Is
> > AL ong Line"
> > Should we consider two upper case letters followed by lower case as two words?
> > upper, upper lower *
> This sounds fair, though I wonder what to do when then dealing with words that
> begin with two capital letters.  SOmething like IMade a MIstake when TYping.
A word starting with two capital letters is wrong, so we do expect it to sound diffrently, this will also help the user to detect mistakes. (or at least it helps me much)


> I also looked at the parsing of "ThisIsAnMultiCaseString" (I put the "An" in
> there just as a means to avoid the two capital letters in a row problem).  It
> doesn't seem to split things completely and I ended up with this: This IsAn
> MultiCase String.

yes appologies, this is now fixed in the above patch.

> I also wonder if we need to take care not to split words that weren't meant to
> be split.  For example, last names such as McLovin.

McLovin and Mc Lovin sound the same, so its ok if it is split.

> > 2. with regard to the pronunciation dictionary problems, would it be better to
> > do the word splitting after we apply the dictionary? (I am currently in favour
> > of doing this, since this eliminates most of the pronunciation problems Joanie
> > pointed out. From a quick grep only the gnome-speech backend seems to do the
> > adjustForPronunciation, is this intentional?
> I wonder if we might need to do it pre and post split?
> > Should the call to adjustForPronunciation be done in the speech.speak() or
> > simular, because this should be independant of speech output?
> Well...the SpeechDispatcher folks have their own ideas of how things should be
> done.  So, we steered clear of stepping in their space and put things in the
> gnomespeechfactory.

aah, since i wrote that comment, we already have adjustForPronunciation in speech-dispatcher, due to bug #562877.
It has now been unified for gnome-speech and speech-dispatcher.

As Will suggests, We adjust for multiCaseStrings pre and post adjustForPronunciation.

sorry for the spam

Comment 9 Mesar Hameed 2009-01-17 17:44:08 UTC

Created attachment 126653 [details] [review]
revision 5

fix, i forgot to include the multi case string check when we are speaking a utterances list.

Comment 10 Willie Walker 2009-01-19 15:06:16 UTC

(In reply to comment #9)
> Created an attachment (id=126653) [edit]
> revision 5
> 
> fix, i forgot to include the multi case string check when we are speaking a
> utterances list.

Something seems wrong with this section of code (it causes a traceback):

+    i = 0
+    length = len(utterances)
+    while ( i < length ):
+        utterance = utterances[i]
+        if settings.speakMultiCaseStringsAsWords:
+            text = _processMultiCaseString(text)
+        if settings.speakMultiCaseStringsAsWords:
+            utterances[i] = _processMultiCaseString(utterance)
+            utterance = utterances[i] 
+        if settings.speakMultiCaseStringsAsWords:
+            text = _processMultiCaseString(text)
+        i = i + 1
+        

The problem is that the 'text' field is used before it is assigned.  I'm wondering if this is a cut/paste issue and if 'utterance' should be used instead of 'text' in this chunk of code?

Comment 11 Willie Walker 2009-01-19 15:42:54 UTC

Created attachment 126767 [details] [review]
Adjusted patch -- fixes bug and also modifies the string presented in the UI

This patch fixes the bug and also modifies the string presented in the UI.

Comment 12 Mesar Hameed 2009-01-21 09:53:16 UTC

Sorry Will 
*looks sheepish* imumble mumble mumble
:)
Thank you for saving me

Comment 13 Willie Walker 2009-01-21 15:15:51 UTC

No problem, Jon!  Many thanks for your work and magic with regular expressions here.

Just a followup - also added comments for translators to the glade file based upon feedback from Attila.

Comment 14 Willie Walker 2009-01-22 00:21:19 UTC

I believe this bug is causing an issue where the speakCharacter functionality of gnomespeechfactory.py is no longer using pronunciations.  The problem is that the pronunciation handling code was moved out of gnomespeechfactory.py and into speech.py, which is fine, except for individual characters, which go through a different code path (i.e., speech.speakCharacter and not speech.speak).

In addition, this is one of those spaces where we cross ideologies with different speech solutions.  gnomespeechfactory.py thinks it's OK for the client to change the pronunciation of characters.  speechdispatcher.py thinks the speech system should be in control and the client should never second guess.

So, I think a patch might need to be made to gnomespeechfactory.py:speakCharacter.  I'll work up a patch.

Comment 15 Willie Walker 2009-01-22 00:36:43 UTC

Created attachment 126959 [details] [review]
Patch to fix the speakCharacter pronunciation problem

This patch resolves the issue fine for me.  I checked it into trunk and gnome-2-24 for folks not savvy with the patch command.

Comment 16 Rui Batista 2009-01-22 11:20:11 UTC

(In reply to comment #15)
> Created an attachment (id=126959) [edit]
> Patch to fix the speakCharacter pronunciation problem
> 
> This patch resolves the issue fine for me.  I checked it into trunk and
> gnome-2-24 for folks not savvy with the patch command.
> 

Seems to fix at least the problems with portuguese voxin synth, reported in orca-list@gnome.org by Victor Oliveira.

Regards,
Rui Batista

Comment 17 Willie Walker 2009-01-22 12:55:50 UTC

(In reply to comment #16)
> Seems to fix at least the problems with portuguese voxin synth, reported in
> orca-list@gnome.org by Victor Oliveira.

Thanks!  Closing as FIXED.

Comment 18 Rui Batista 2009-01-22 16:09:54 UTC

Hi,

There is at least one muticase word case not covered by this patch:
for example, file names like NVDAhelper.py (from nvda source lol) should be broken as
"NVDA Helper.py"
Not 
"NVDAH elper.py"
The oposite case
"megaTTS"
Are handled correctly.
Atached proposed patch, please see if this doens't break anything.

Hi,

There is at least one muticase word case not covered by this patch:
for example, file names like NVDAhelper.py (from nvda source lol) should be broken as
"NVDA Helper.py"
Not 
"NVDAH elper.py"
The oposite case
"megaTTS"
Are handled correctly.
Atached proposed patch, please see if this doens't break anything.

Comment 19 Rui Batista 2009-01-22 16:12:50 UTC

Created attachment 127011 [details] [review]
Proposed patch to resolve the NVDAHelper.py case

Sorry, forgot to put this in the last comment

Comment 20 Willie Walker 2009-01-22 17:01:42 UTC

(In reply to comment #19)
> Created an attachment (id=127011) [edit]
> Proposed patch to resolve the NVDAHelper.py case
> 
> Sorry, forgot to put this in the last comment

Thanks!  I'm going to defer to Jon on the review for this one.  Jon, can you please take a look at this?

Comment 21 Jose Vilmar Estacio de Souza 2009-01-26 22:18:35 UTC

(In reply to comment #15)
> Created an attachment (id=126959) [edit]
> Patch to fix the speakCharacter pronunciation problem
> 
> This patch resolves the issue fine for me.  I checked it into trunk and
> gnome-2-24 for folks not savvy with the patch command.
> 

Hi, I found two situations in which the problem persists.

If I create in the pronunciation page b to be spoken as bingo, orca doesn't read bingo  when I am typing. However if I am reviewing the text, orca reads bingo as expected. 
If I create msn to be spoken as messenger orca reads msn, but only when I am navigating in pidging.

Comment 22 Mesar Hameed 2009-01-27 12:06:21 UTC

Created attachment 127319 [details] [review]
rev 8, includes Rui's regexp, and rewrites the first 2 regexps.

Hi Rui, Will,

Sorry for late reply

Yes, replacing regexp3 does solve the NVDAHelper.py problem, as shown by the output below.

its based on huristics, so we probably cant get it right, conflicting programming naming convensions etc.
#define GREEN_Colour  (0, 255, 0)
rather than:
#define GREENColour (0, 255, 0)


Anyway I think jaws also split it where you are suggesting:
TES Tout and TESTout sound the same

This is what we get with Rui's modification, and combining the old first two regexpes into 1. (Rui's regexp is now regexp2).


def mysub(a):
    multiCaseReg1 = re.compile("([a-z]+)([A-Z])")
    multiCaseReg2 = re.compile("([A-Z][A-Z]+)([A-Z][a-z]+)")
    multiCaseReg3 = re.compile("([A-Z])([A-Z][a-z]+)")
    a1 = multiCaseReg1.sub('\\1 \\2',a)
    a2 = multiCaseReg2.sub('\\1 \\2',a1)    
    a3 = multiCaseReg3.sub('\\1 \\2',a2)    
    print("a=%s\na1=%s\na2=%s\na3=%s\n" % (a,a1,a2,a3))


a = "TESTout ThisIsALongMultiCaseString"


In [17]: mysub(a)
a=TESTout ThisIsALongMultiCaseString
a1=TESTout This Is ALong Multi Case String
a2=TES Tout This Is ALong Multi Case String
a3=TES Tout This Is A Long Multi Case String

In [18]:

Comment 23 Willie Walker 2009-01-27 16:41:51 UTC

Created attachment 127335 [details] [review]
Patch to resolve speakKeyName pronunciation issue from comment #21

Comment 24 Willie Walker 2009-01-27 16:44:07 UTC

(In reply to comment #22)
> Created an attachment (id=127319) [edit]
> rev 8, includes Rui's regexp, and rewrites the first 2 regexps.

Rui - please test.  I have a feeling this is going to be one of those special-case-will-never-address-everyones-desires-bugs-from-hell.  But, if we can get close, it will be better than not at all.

Comment 25 Jose Vilmar Estacio de Souza 2009-01-27 19:09:04 UTC

(In reply to comment #23)
> Created an attachment (id=127335) [edit]
> Patch to resolve speakKeyName pronunciation issue from comment #21
> 

The first problem reported in comment 21 is fixed, but the problem related to pidgin is not fixed. 
Should I attach a debug file and/or provide more information?

Comment 26 Willie Walker 2009-01-27 19:15:49 UTC

(In reply to comment #25)
> (In reply to comment #23)
> > Created an attachment (id=127335) [edit]
> > Patch to resolve speakKeyName pronunciation issue from comment #21
> > 
> 
> The first problem reported in comment 21 is fixed, but the problem related to
> pidgin is not fixed. 
> Should I attach a debug file and/or provide more information?

Redefining msn to messenger works fine for me in pidgin, gedit, terminal, etc.  Please provide an exact description of the steps you're taking, what keys you are typing, what output you're getting, and what you expect the output to be.

Comment 27 Jose Vilmar Estacio de Souza 2009-01-27 21:17:07 UTC

(In reply to comment #26)
> (In reply to comment #25)
> > (In reply to comment #23)
> > > Created an attachment (id=127335) [edit]
> > > Patch to resolve speakKeyName pronunciation issue from comment #21
> > > 
> > 
> > The first problem reported in comment 21 is fixed, but the problem related to
> > pidgin is not fixed. 
> > Should I attach a debug file and/or provide more information?
> 
> Redefining msn to messenger works fine for me in pidgin, gedit, terminal, etc. 
> Please provide an exact description of the steps you're taking, what keys you
> are typing, what output you're getting, and what you expect the output to be.
> 

After redefine msn to messenger try the following steps:
1. Go to pidgin.
2. Create a group called msn.
3. Make sure that pidgin is configured to display empty groups.
4. scroll in the buddy list using up and Down key.
Instead of read messenger orca reads msn.

Comment 28 Willie Walker 2009-01-27 21:47:57 UTC

Created attachment 127362 [details] [review]
Patch to address the pidgin problem (I hope)

(In reply to comment #27)
> After redefine msn to messenger try the following steps:
> 1. Go to pidgin.
> 2. Create a group called msn.
> 3. Make sure that pidgin is configured to display empty groups.
> 4. scroll in the buddy list using up and Down key.
> Instead of read messenger orca reads msn.

Many thanks for your testing and these succinct and exact instructions.  Clear test cases definitely help a WHOLE LOT.  I just committed this patch, which touches yet another part of the code.  What we're experience here is that we bubbled the pronunciation handling up a level, so we missed the touch points you are discovering.

Comment 29 Jose Vilmar Estacio de Souza 2009-01-27 22:10:37 UTC

(In reply to comment #28)
> Created an attachment (id=127362) [edit]
> Patch to address the pidgin problem (I hope)
> 
> (In reply to comment #27)
> > After redefine msn to messenger try the following steps:
> > 1. Go to pidgin.
> > 2. Create a group called msn.
> > 3. Make sure that pidgin is configured to display empty groups.
> > 4. scroll in the buddy list using up and Down key.
> > Instead of read messenger orca reads msn.
> 
> Many thanks for your testing and these succinct and exact instructions.  Clear
> test cases definitely help a WHOLE LOT.  I just committed this patch, which
> touches yet another part of the code.  What we're experience here is that we
> bubbled the pronunciation handling up a level, so we missed the touch points
> you are discovering.
> 

Great job!!!
Works fine now.
Many thanks.

Comment 30 Rui Batista 2009-01-28 00:38:16 UTC

(In reply to comment #24)
> (In reply to comment #22)
> > Created an attachment (id=127319) [edit]
> > rev 8, includes Rui's regexp, and rewrites the first 2 regexps.
> 
> Rui - please test.  I have a feeling this is going to be one of those
> special-case-will-never-address-everyones-desires-bugs-from-hell.  But, if we
> can get close, it will be better than not at all.
> 

Works good for me. In most common cases I think these heuristics sufix, doing better requires more then simple regular expression processing I think... 
This is really a very nice feature when programming, thanks John for the idea!

Comment 31 Willie Walker 2009-01-28 18:04:26 UTC

Based upon the comments, I think we're good with this one and can now close it out as fixed.   Phew!  Thanks, and nice work everyone!