Bug 618338 - If the "Break speech into chunks between pauses" checkbox is checked, I sometimes hear an unneeded dot character spoken between pieces of information
Status: RESOLVED DUPLICATE of bug 591709
Product: orca
Classification: Applications
Component: speech
Version: 2.30.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Orca Maintainers
QA Contact: Orca Maintainers
Depends on:
Blocks:
Reported: 2010-05-11 06:03 UTC by Hammer Attila
Modified: 2010-06-07 14:33 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Hammer Attila 2010-05-11 06:03:31 UTC
Dear Developers,

This problem may be specific to the Hungarian Espeak speech synthesizer, but I am not fully sure. If the "Break speech into chunks between pauses" check box is unchecked, I sometimes hear an unneeded dot character spoken between pieces of information. A trivial example in the Hungarian locale is a menu item whose name ends with the three-dot (…) character and which has a mnemonic key. If this check box is unchecked, I hear four dot characters spoken between the menu item name and the mnemonic key.
Compare the two speech generation results. First, the result with the "Break speech into chunks between pauses" check box checked:
           obj             = Leállítás…
           role            = menu item
           alreadyFocused  = False
           utterances:
               (Aktiválásához nyomja meg a szóköz billentyűt.)
GENERATOR: getTutorial
           obj             = Leállítás…
           role            = menu item
           alreadyFocused  = False
           utterances:
               (Aktiválásához nyomja meg a szóköz billentyűt.)
tutorial=['Aktiv\xc3\xa1l\xc3\xa1s\xc3\xa1hoz nyomja meg a sz\xc3\xb3k\xc3\xb6z billenty\xc5\xb1t.']
generate speech results:
  Leállítás…
  <orca.speech_generator.Pause instance at 0x154bab8>
  l
  <orca.speech_generator.LineBreak instance at 0x154bb48>
  <orca.speech_generator.Pause instance at 0x154bab8>
  8 per 8.
  <orca.speech_generator.Pause instance at 0x154bab8>
  Aktiválásához nyomja meg a szóköz billentyűt.
SPEECH OUTPUT: 'Leállítás….'
SPEECH OUTPUT: 'l'
SPEECH OUTPUT: '8 per 8.'
SPEECH OUTPUT: 'Aktiválásához nyomja meg a szóköz billentyűt.'

Now the speech generation result with the "Break speech into chunks between pauses" check box unchecked:
           obj             = Leállítás…
           role            = menu item
           alreadyFocused  = False
           utterances:
               (Aktiválásához nyomja meg a szóköz billentyűt.)
GENERATOR: getTutorial
           obj             = Leállítás…
           role            = menu item
           alreadyFocused  = False
           utterances:
               (Aktiválásához nyomja meg a szóköz billentyűt.)
tutorial=['Aktiv\xc3\xa1l\xc3\xa1s\xc3\xa1hoz nyomja meg a sz\xc3\xb3k\xc3\xb6z billenty\xc5\xb1t.']
generate speech results:
  Leállítás…
  <orca.speech_generator.Pause instance at 0x1281ab8>
  l
  <orca.speech_generator.LineBreak instance at 0x1281b48>
  <orca.speech_generator.Pause instance at 0x1281ab8>
  8 per 8.
  <orca.speech_generator.Pause instance at 0x1281ab8>
  Aktiválásához nyomja meg a szóköz billentyűt.
SPEECH OUTPUT: 'Leállítás…. l'
SPEECH OUTPUT: '8 per 8. Aktiválásához nyomja meg a szóköz billentyűt.'

When the Espeak speech synthesizer sees a dot followed by a lowercase letter, it speaks four dot characters, because it cannot determine whether the text is a single word or not; for example, in the string test.test the dot character needs to be spoken. It may be necessary to insert a line break between the object name and the next piece of information in the src/orca/formatting.py file in this situation, or to remove the unneeded dot character after the … character. I do not see this problem with the English locale, only with the Hungarian locale.
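
The two logs above are consistent with a simple model of how the generated items are turned into speech output. The sketch below is inferred only from those logs and is not Orca's actual code: the `Pause` and `LineBreak` classes and the `render` function are hypothetical stand-ins. It assumes a pause appends "." to the pending text unless it already ends in one, a line break always flushes, and a pause flushes only when pause breaks are enabled.

```python
class Pause:
    """Hypothetical stand-in for orca.speech_generator.Pause."""


class LineBreak:
    """Hypothetical stand-in for orca.speech_generator.LineBreak."""


def render(items, enable_pause_breaks):
    """Turn generated items into the strings sent to the synthesizer."""
    outputs = []
    pending = []

    def flush():
        # Send everything collected so far as one utterance.
        if pending:
            outputs.append(" ".join(pending))
            pending.clear()

    for item in items:
        if isinstance(item, LineBreak):
            flush()
        elif isinstance(item, Pause):
            # A pause is marked by a trailing dot on the previous chunk.
            if pending and not pending[-1].endswith("."):
                pending[-1] += "."
            if enable_pause_breaks:
                flush()
        else:
            pending.append(item)
    flush()
    return outputs


MENU_ITEM = [
    "Leállítás…", Pause(), "l", LineBreak(), Pause(),
    "8 per 8.", Pause(),
    "Aktiválásához nyomja meg a szóköz billentyűt.",
]
```

Under these assumptions, `render(MENU_ITEM, True)` gives four separate utterances, and `render(MENU_ITEM, False)` joins the name and the mnemonic into the single string 'Leállítás…. l', which is the input that triggers the four spoken dots.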

I see this problem in both Orca 2.30.1 and Orca 2.31.1-pre, with both the current stable Espeak version and the 1.43.23 test version.

Attila
Comment 1 Hammer Attila 2010-05-11 06:26:13 UTC
I wrote the following letter to Jonathan Duddington, the Espeak developer, and I am waiting for his answer. Before that I sent him the link to this bug report. The most important part:
"When the check box described in the bug report is unchecked in Orca, Orca doesn't use line breaks between the more important pieces of information. For example, the following string produces this buggy behavior in Espeak when the language is Hungarian:
Leállítás…. l

If I replace the … character with three normal dot characters, the string is spoken correctly:
Leállítás.... l
So there are four dot characters before the letter l, but if the string contains four . characters rather than the … character, I correctly hear three dot characters spoken between the "leállítás" string and the letter "l".
Could you fix this? Or could I fix it in the hu_list file? The goal is that only three dot characters are spoken when Espeak sees a …. string, regardless of how many dot characters come before the following lowercase part."
I sent Jonathan the Espeak-specific phoneme debug information; now we need to wait.
Comment 2 Hammer Attila 2010-05-11 06:47:05 UTC
I think we need to separate the ... issue from another problem: a missing line break between the more important pieces of information in the src/orca/formatting.py code when the check box in question is checked.
I think the first issue, the .... one, is specific to the Espeak Hungarian language rules, but the following second problem is not:
For example, if I want to file a new bug in Bugzilla with Orca and I jump to the component list, Orca speaks something that sounds like this:
"component.braille"
I will look at what speech output is sent in this example and copy the generated speech result into the next comment.
Comment 3 Hammer Attila 2010-05-11 07:04:18 UTC
For the Bugzilla example described previously, the final speech output sent is the following:
SPEECH OUTPUT: 'Component:. braille. List with 5 items. Use up and down to select an item.'
Why does Orca send the "Component:." part of the string?

I will try to analyze the parts of the src/orca/formatting.py file that hold the formatting code used when "Break speech into chunks between pauses" is unchecked. If it is a good solution, I will try inserting line breaks at the important places, provided this does not cause wrong changes for users of other speech synthesizers. I only use Espeak with Speech-dispatcher and am not sure this change is good for users of other speech synthesizers, so please give hints for a proper fix. The final goal is to drop the unneeded dot characters where speaking them is not necessary, as in this Bugzilla list example. The "Component:" part is the label of the list, and "braille" is the list item that can be chosen as the component in Bugzilla. I think this second problem is language independent: if I use the Hungarian language in Espeak and send this string format, Espeak speaks "componentpont braille", and if I test the same string with the Espeak English language, I hear "component colondot braille". So we need to find an Orca-side fix for this second problem, if that is possible.

Attila
Comment 4 Joanmarie Diggs (IRC: joanie) 2010-05-11 07:06:39 UTC
Component is the label for the combo box.
Comment 5 Mesar Hameed 2010-05-12 10:51:02 UTC
Attila,

I am sorry, I am very confused.

Do you mean we should not have ". " to separate chunks?

Or do you mean we should have it, but when there are 3 already we shouldn't add an extra one?

Thanks

-Jon
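
The second reading of the question above (keep the ". " separator, but skip it when the previous chunk already ends in punctuation) could look roughly like the sketch below. This is only an illustration, not Orca code: `TERMINAL_PUNCTUATION` and `join_chunks` are hypothetical names, and the set of characters treated as terminal is an assumption.

```python
# Characters after which no extra "." is added (an assumed, illustrative set).
TERMINAL_PUNCTUATION = (".", ":", "…", "!", "?", ";", ",")


def join_chunks(chunks):
    """Join speech chunks with '. ' unless the text so far already ends in punctuation."""
    result = ""
    for chunk in chunks:
        chunk = chunk.strip()
        if not chunk:
            continue
        if not result:
            result = chunk
        elif result.endswith(TERMINAL_PUNCTUATION):
            # The previous chunk brings its own punctuation: add only a space.
            result += " " + chunk
        else:
            result += ". " + chunk
    return result
```

With this rule the list example would be sent as "Component: braille. List with 5 items. ..." and the menu item as "Leállítás… l". Whether a given synthesizer still pauses after ":" or "…" is not established in this report, so this would need testing with each speech backend.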
Comment 6 Hammer Attila 2010-05-12 11:46:16 UTC
Jon, I don't know which part of the Orca code puts these unneeded dot characters into the speech output, for example between the Bugzilla component list label and the braille value.
I think the visible line is the following, but I am not sure:
Component: braille
The line Orca generates for speech is the following:
Component:. braille

If I understand correctly, when the "Break speech into chunks between pauses" checkbox is checked, the following formatting values from src/orca/formatting.py are used:
    formatting['speech'][pyatspi.ROLE_LIST_ITEM]['unfocused'] = \
        'labelAndName + allTextSelection + pause + expandableState + pause + availability + positionInList'
    formatting['speech'][pyatspi.ROLE_LIST_ITEM]['basicWhereAmI'] = \
        'label + roleName + pause + name + pause + positionInList + pause + expandableState + (nodeLevel or nestingLevel) + pause'

If the check box is unchecked, the following formatting values are used:
        pyatspi.ROLE_LIST_ITEM: {
            'focused': 'expandableState + availability',
            'unfocused': 'labelAndName + allTextSelection + expandableState + availability + positionInList',
            'basicWhereAmI': 'label + roleName + name + positionInList + expandableState + (nodeLevel or nestingLevel)'
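
A quick way to compare the two sets of formatting strings quoted above is to drop every `pause` token from the checked variants and see what remains. The helper `strip_pauses` below is a hypothetical name used only for this comparison; the strings are copied from the quoted fragments.

```python
def strip_pauses(fmt):
    """Remove every 'pause' token from a '+'-separated formatting string."""
    tokens = [token.strip() for token in fmt.split("+")]
    return " + ".join(token for token in tokens if token != "pause")


# Strings quoted above for the checked case.
CHECKED_UNFOCUSED = (
    "labelAndName + allTextSelection + pause + expandableState + pause"
    " + availability + positionInList"
)
CHECKED_WHERE_AM_I = (
    "label + roleName + pause + name + pause + positionInList + pause"
    " + expandableState + (nodeLevel or nestingLevel) + pause"
)

# Strings quoted above for the unchecked case.
UNCHECKED_UNFOCUSED = (
    "labelAndName + allTextSelection + expandableState + availability + positionInList"
)
UNCHECKED_WHERE_AM_I = (
    "label + roleName + name + positionInList + expandableState + (nodeLevel or nestingLevel)"
)
```

For these two list item entries, the stripped checked strings are identical to the unchecked ones, so the quoted variants differ only in where `pause` is inserted.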
Comment 7 Hammer Attila 2010-05-12 11:50:09 UTC
The following settings in src/orca/settings.py toggle this pause-related behavior:
# This is for bug #585417 - Allow pauses to be inserted into speech
# output. We're keeping it separate for now until we get the pauses
# sorted out just right.
#
useExperimentalSpeechProsody = True

# If True, whenever a 'pause' keyword is found in a speech formatting
# string, any string being created will be sent to the speech synthesis
# system immediately.  This is for bug #585417 and allows for some
# adaptation to how different systems handle queued speech.  For example,
# some introduce unnaturally long pauses between requests to speak.
#
enablePauseBreaks = True

It has been a long time since I worked with the src/orca/formatting.py file, but I will try to find a way to cut out these unneeded dot characters when the "Break speech into chunks between pauses" checkbox is unchecked.

Attila
Comment 8 Hammer Attila 2010-05-14 10:43:36 UTC
I have good news.
In the Espeak 1.43.26 test version, Jonathan Duddington fixed the problem of four dots being heard in the Hungarian language for menu items that open a dialog.
The problem happens because in Hungarian we use the following kind of string to mark such menu items:
Shutdown… u
In the Espeak rules the … character is translated to the string pontpontpont (similar to the English string dotdotdot for three . characters).
Orca sends the following speech output to Espeak, and previous Espeak versions did not handle this unwanted situation:
Shutdown…. u
With the Hungarian rules, previous Espeak versions spoke the string dotdotdotdot in this situation instead of only three dots, regardless of whether more dots followed the three. This is now fixed.

But there is another interesting problem whose cause on the Orca side I cannot detect, and I have not found a way to fix it in the src/orca/formatting.py file; possibly I am searching in the wrong place:
For example, if I navigate to the component list on the Bugzilla bug report page, Orca first sends the following output to Espeak when the "Break speech into chunks between pauses" checkbox is checked:
"Component:. braille"
With the Hungarian Espeak language, I hear this string as follows:
Componentpont braille
Why does Orca send a dot character after the colon punctuation character in the generated speech output in this situation?
If I send this wrong string directly to Espeak without the --punct Espeak option, I hear a similarly buggy string. I am not sure whether this problem needs to be fixed in Espeak or at the Orca level.
If I test this wrong string with the Espeak English language, without the --punct switch, I hear a similar string:
Component colondot braille
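
The direct test described above can be repeated outside Orca. The sketch below is a minimal helper, assuming the `espeak` command-line program is installed and that the `hu` and `en` voice names are available; `espeak_command` and `phonemes` are hypothetical helper names. It uses the `-q` (no audio) and `-x` (print phoneme mnemonics) options so that the two languages can be compared as text.

```python
import shutil
import subprocess


def espeak_command(text, voice):
    """Build an espeak argument list that prints phoneme mnemonics without audio."""
    return ["espeak", "-q", "-x", "-v", voice, text]


def phonemes(text, voice):
    """Return espeak's phoneme output, or None if espeak is not installed."""
    if shutil.which("espeak") is None:
        return None
    result = subprocess.run(
        espeak_command(text, voice),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()
```

Calling `phonemes("Component:. braille", "hu")` and `phonemes("Component:. braille", "en")` should show whether the dot after the colon is pronounced in each language; the exact output depends on the installed Espeak version.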

Attila
Comment 9 Mesar Hameed 2010-05-14 11:16:21 UTC
Hi Attila,

The additional punctuation was added due to one of your RFEs:

#585417

Where we add ". "

Are you saying we should undo this?

Thanks
Comment 10 Hammer Attila 2010-05-14 12:42:42 UTC
If possible, no, because the pauses are sometimes good for users of other speech synthesis drivers, for example the Gnome-speech speech system with the Espeak gnome-speech driver, although I think the gnome-speech driver is being dropped with GNOME 3.0.
Since I use Orca with Speech-dispatcher and Espeak, I need to uncheck the "Break speech into chunks between pauses" check box, because I see some stability problems when it is checked, but those belong to other bugs.
We only need to find out how to handle the string containing :. that speech generation sends when this checkbox is unchecked. I tried inserting a LineBreak in src/orca/formatting.py at the following place, but it did not help:
In pyatspi.ROLE_LIST_ITEM: {
The original line is the following:
            'unfocused': 'labelAndName + allTextSelection + expandableState + availability + positionInList',
I tried these variations, but nothing changed:
1.          'unfocused': 'labelAndName + lineBreak + allTextSelection + expandableState + availability + positionInList',
2.          'unfocused': 'label + Name + lineBreak + allTextSelection + expandableState + availability + positionInList',

I think sending the speech output in the following form would be good: for example, if I open the Gedit text editor and separate the two parts of the output with a line break, Espeak does not produce this unneeded dot problem:
Component:.
braille
If Orca sent the first speech output of the list in this format when the caret jumps to the list, it would not disturb Espeak; I do not hear the unneeded dot character when I read these two lines in Gedit, for example with the say all function.

I will test what happens in this situation if I temporarily check this checkbox, and look at what speech output is generated and sent to Espeak when it is checked.

Attila
Comment 11 Hammer Attila 2010-05-14 13:13:21 UTC
Here is the generated speech result when the "Break speech into chunks between pauses" check box is checked. It fully confirms the good test string format I described previously with Gedit; the result I hear is fine:
generate speech results:
  Component:
  <orca.speech_generator.Pause instance at 0x2b6dc20>
  braille
  <orca.speech_generator.Pause instance at 0x2b6dc20>
  5 elemű lista
  <orca.speech_generator.Pause instance at 0x2b6dc20>
  <orca.speech_generator.Pause instance at 0x2b6dc20>
  Egy elem kijelöléséhez használja a fel- és le nyíl billentyűket.
SPEECH OUTPUT: 'Component:.'
SPEECH OUTPUT: 'braille.'
SPEECH OUTPUT: '5 elemű lista.'
SPEECH OUTPUT: 'Egy elem kijelöléséhez használja a fel- és le nyíl billentyűket.'
So the :. characters are present, but they do not disturb the Espeak speech synthesizer, because when this checkbox is checked Orca sends the output as separate speech lines at the critical places, where a pause keyword is inserted for the proper objects in the src/orca/formatting.py file (already done since version 2.28).
If I remember correctly, this is the formatting code in src/orca/formatting.py that is used for lists when "Break speech into chunks between pauses" is enabled:
    formatting['speech'][pyatspi.ROLE_LIST_ITEM]['unfocused'] = \
        'labelAndName + allTextSelection + pause + expandableState + pause + availability + positionInList'
    formatting['speech'][pyatspi.ROLE_LIST_ITEM]['basicWhereAmI'] = \
        'label + roleName + pause + name + pause + positionInList + pause + expandableState + (nodeLevel or nestingLevel) + pause'
I think Orca uses these formatting directives if the following settings are enabled:
# This is for bug #585417 - Allow pauses to be inserted into speech
# output. We're keeping it separate for now until we get the pauses
# sorted out just right.
#
useExperimentalSpeechProsody = True

# If True, whenever a 'pause' keyword is found in a speech formatting
# string, any string being created will be sent to the speech synthesis
# system immediately.  This is for bug #585417 and allows for some
# adaptation to how different systems handle queued speech.  For example,
# some introduce unnaturally long pauses between requests to speak.
#
enablePauseBreaks = True

So there is no need to undo this pause toggle, because it results in very good speech with Espeak and breaks the speech a little at the necessary places between important pieces of information.
Only the final question remains: how can the output sent for, say, a list view be split into separate lines after the :. characters when the pause-related check box has to be unchecked? Or is this impossible because Orca does not use the pause directive when these settings are not enabled? I do not remember now what the current rule is.
If this can be fixed so that whenever Orca sends :. for any object it also inserts a line break, I think that solves this bug on the Orca side for the unchecked case; nothing else needs fixing, because the output sent in the checked case works correctly. The problem is present, for example, with radio buttons whose label contains a colon character, combo boxes whose label contains a colon character, and so on. I think this is a minimal fix that does not disturb other users, because it does not add a lot of pauses for users who do not want them. Is it possible to do this?
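
The minimal fix asked for here, breaking the output after every ":." sequence, can be sketched as a post-processing step on the joined utterance. This is only an illustration of the request and not Orca code; `split_after_colon_dot` is a hypothetical name.

```python
import re


def split_after_colon_dot(utterance):
    """Split one utterance into several, breaking after each ':.' sequence."""
    # The lookbehind keeps ':.' attached to the label part; the whitespace
    # that follows it is consumed by the split.
    parts = re.split(r"(?<=:\.)\s+", utterance)
    return [part for part in parts if part]
```

Applied to the list example from comment 3, this yields 'Component:.' as its own utterance followed by the rest of the text, which matches the two-line form tested in Gedit above. It does not change how the remaining ". " separators are spoken.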

Attila
Comment 12 Mesar Hameed 2010-06-07 14:33:25 UTC
Hi Attila,

I am marking this as a duplicate of bug #591709.
The issue is that bug #585417 added a full stop between the segments of information that we build up in Orca.
That was a hack to give us extra pauses between the bits of information.
I don't believe adding newline characters will help, because it will just move the problem somewhere else.
We might end up speaking "new line" where there isn't one.

I have emailed Trev (who is working on opentts) to see if he can find a solution for this on his end; there should be some markup that we can include that tells opentts/sd to give us a pause.

Thanks for your hard work.

-Jon

*** This bug has been marked as a duplicate of bug 591709 ***