Bug 746379 - Orca master and gnome-3-16 branch versions do not correctly speak some already localized symbols that come from the src/orca/chnames.py file
Status: RESOLVED FIXED
Product: orca
Classification: Applications
Component: speech
Version: 3.15.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Orca Maintainers
QA Contact: Orca Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-03-18 06:11 UTC by Hammer Attila
Modified: 2015-04-20 14:38 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Debug file with possible showing why happening this issue (110.54 KB, text/plain)
2015-03-18 06:11 UTC, Hammer Attila
Debug file with "black medium small square" text related (217.24 KB, text/plain)
2015-03-24 05:50 UTC, Hammer Attila
Example simbols file (178 bytes, text/plain)
2015-04-16 12:51 UTC, Hammer Attila
Testing example HTML document (689 bytes, text/html)
2015-04-17 05:04 UTC, Hammer Attila
Propose patch (5.47 KB, patch)
2015-04-17 08:09 UTC, Hammer Attila
Status: none
Corrected example geometry simbols collection document (178 bytes, text/plain)
2015-04-17 08:11 UTC, Hammer Attila
Shorter patch with marking translation only the u+25fe unicode character (840 bytes, patch)
2015-04-17 12:18 UTC, Hammer Attila
Status: committed

Description Hammer Attila 2015-03-18 06:11:17 UTC
Created attachment 299682
Debug file with possible showing why happening this issue

Dear Joanie,

The master and gnome-3-16 branch versions of Orca speak some strings in English even though they are already localized into Hungarian. These strings come from the src/orca/chnames.py file.
Usually these strings are the names of various bullet types (for example, the white bullet).
Short test case:
1. Launch Gedit.
2. Press CTRL+SHIFT+U, type 25e6, and press the space key.
3. Press ENTER, then press the up arrow key.
Expected result:
When Orca speaks this symbol, it should speak the translated message if a translation is available.
For example, in Hungarian this symbol is translated as "üres listajel".
Orca 3.14.3 behaves correctly and produces the expected result.
Actual result:
I hear the English message for this symbol. Other symbols may be affected too (black square, white square, etc.).

I am attaching a debug.out file.

Attila
Comment 1 Joanmarie Diggs (IRC: joanie) 2015-03-18 15:07:42 UTC
I think the problem is due to the fact that math symbols were added to Orca and a few characters fall into the category of being both bullets and math symbols. Have you done the translation of the math symbols yet?
Comment 2 Hammer Attila 2015-03-18 15:29:12 UTC
Unfortunately, the translation is not fully ready yet. I still need to review 98 fuzzy translations and translate 38 untranslated messages.

Attila
Comment 3 Hammer Attila 2015-03-18 17:11:07 UTC
Hi Joanie,

There is one thing I don't understand about the white bullet string (Unicode character u+25e6):
In Orca's Hungarian translation file only one string contains this symbol. Here are some parts of the hu.po file:
#. Translators: this is the spoken word for the character '○' (U+25cb)
#. It can be used as a bullet in a list.
#.
#: ../src/orca/chnames.py:695
msgid "white circle"
msgstr "üres kör"

#. Translators: this is the spoken word for the character '●' (U+25cf)
#. It can be used as a bullet in a list.
#.
#: ../src/orca/chnames.py:700
msgid "black circle"
msgstr "fekete kör"

#. Translators: this is the spoken word for the character '◦' (U+25e6)
#.
#: ../src/orca/chnames.py:704
msgid "white bullet"
msgstr "üres listajel"

The math symbols did not add a new translatable message for the white bullet; only this one message contains the string.
When Orca finds this symbol on a web page, it sends the English message to the speech synthesizer. Does Orca use the new math symbol code in Firefox?

Attila
Comment 4 Joanmarie Diggs (IRC: joanie) 2015-03-18 17:20:49 UTC
Aha. Ok, I'm in a meeting now, but I think I know what the problem might be. I'll look soon. Thanks!
Comment 5 Joanmarie Diggs (IRC: joanie) 2015-03-23 18:50:21 UTC
I just pushed a change to master that should solve this. Please verify and if it's all good, I'll commit it to the stable branch.

Thanks! (And sorry!!)
Comment 6 Hammer Attila 2015-03-24 05:48:48 UTC
Hi Joanie,

If I type the 25e1 symbol in Gedit, I correctly hear the Hungarian translation for this list type.
But in Firefox I found a web page where Orca announces the list bullet as "black medium small square" (possibly this is a different bullet type).
Test case:
1. Launch Firefox.
2. Open the following link:
https://wiki.archlinux.org/index.php/locale
3. Go to the navigation heading level, then press the I key a few times, or move between lines with the arrow keys.
In both layout mode and non-layout mode I hear the "black medium small square" text with the Hungarian locale.
The Hungarian translation for Orca 3.16 has only 19 untranslated messages left, and I do not see this symbol's text in the translation file. The translation has not been committed to the master branch yet, because I still need to work out translations for the remaining messages.
Where does this symbol name come from?

Attila
Comment 7 Hammer Attila 2015-03-24 05:50:16 UTC
Created attachment 300178
Debug file with "black medium small square" text related

I am sending a debug.out file that contains the speech output for this symbol.

Attila
Comment 8 Hammer Attila 2015-04-16 07:38:40 UTC
Hi Joanie,

Do you have any idea about the remaining issues (black medium small square and other symbols of this kind)? Orca says the English symbol name if, for example, I paste the 25fe Unicode character into Gedit.
Does this symbol's text need to be marked for translation, or can this kind of issue be fixed in both the master and gnome-3-16 branch versions of Orca?

Attila
Comment 9 Joanmarie Diggs (IRC: joanie) 2015-04-16 11:15:53 UTC
Is the symbol spoken in Hungarian if you use Orca 3.14?
Comment 10 Hammer Attila 2015-04-16 12:34:10 UTC
No, Orca 3.14 does not speak symbols of this kind in the 25f1 to 25ff Unicode range.
Some example symbols from the 25fx range that the master and GNOME 3.16 versions speak in English and that Orca 3.14 does not speak at all:
25f1: Unicode character is ◱, Orca says "white square with lower left quadrant"
25f2: like the previous character, except that the name ends with "lower right quadrant"
25f3: Unicode character is ◳, Orca says "white square with upper right quadrant"
25f4: Unicode character is ◴, Orca says "white circle with upper left quadrant"
...
25fa: Unicode character is ◺, Orca says "lower left triangle"
25fb: Unicode character is ◻, Orca says "white medium square"
25fc: Unicode character is ◼, Orca says "black medium square"
25fd: Unicode character is ◽, Orca says "white medium small square"
25fe: Unicode character is ◾, Orca says "black medium small square", for example in Firefox; see my wiki page test case
25ff: Unicode character is ◿, Orca says "lower right triangle"
Are these symbols useful, for example in education or math?
I think these symbols are not marked for translation in the gnome-3-16 and master branch versions of Orca.

Attila
Comment 11 Hammer Attila 2015-04-16 12:51:13 UTC
Created attachment 301728
Example simbols file

Joanie, this file contains the Unicode characters that Orca 3.16 and the master branch version do not speak localized.
If I remember correctly, these symbols are not marked for translation. If speaking these symbols is not useful for education or math, please disable speaking of this character range.
If these symbols need to be marked for translation, unfortunately that is only possible in the Orca master branch.

Has your fix for the bullet symbol already been committed to the gnome-3-16 branch?
I think the following commit resolves the issue with the 25e6 character:
commit 517bcccbe9d5f3a4d05f5ba7b985589954594821
Author: Joanmarie Diggs <jdiggs@igalia.com>
Date:   Mon Mar 23 14:38:57 2015 -0400

    Move some technical-content symbols to mathsymbols.py and remove duplicates
    
    Because there are far more symbols than we can ask translators to translate,
    we have fallback code so that we at least speak something for the symbols
    which we think are unlikely to be encountered, and likely to not be presented
    by espeak. This range-based code was stomping on a few symbols which were
    actually localized. Moving them to mathsymbols.py will stop the stomping.
    Also removed some duplicates.

Attila
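
The "stomping" described in the commit message above can be sketched in a few lines of Python. This is not Orca's code: the table, the range, and both function names are hypothetical, and the real fix moved entries between modules rather than reordering a lookup. The sketch only shows how a range-based fallback that runs before the localized lookup hides an existing translation.

```python
import unicodedata

# Hypothetical stand-in for the localized names table (value taken from hu.po).
LOCALIZED = {"\u25e6": "üres listajel"}

# Hypothetical fallback range: the Unicode "Geometric Shapes" block.
FALLBACK_RANGE = range(0x25A0, 0x2600)


def name_range_first(ch):
    """Problematic order: the range fallback wins over the localized table."""
    if ord(ch) in FALLBACK_RANGE:
        return unicodedata.name(ch, "").lower()
    return LOCALIZED.get(ch, "")


def name_localized_first(ch):
    """Localized entries are consulted first; the range is only a fallback."""
    if ch in LOCALIZED:
        return LOCALIZED[ch]
    if ord(ch) in FALLBACK_RANGE:
        return unicodedata.name(ch, "").lower()
    return ""


print(name_range_first("\u25e6"))      # English Unicode name hides the translation
print(name_localized_first("\u25e6"))  # Hungarian translation is used
print(name_localized_first("\u25fe"))  # no translation, so the English fallback
```

With the second ordering, u+25e6 is spoken in Hungarian again, while u+25fe still falls back to its English Unicode name because nothing is localized for it.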
Comment 12 Joanmarie Diggs (IRC: joanie) 2015-04-16 13:04:30 UTC
Thanks for the information!

If in 3.14 Orca doesn't speak them at all, and in 3.16 and later Orca speaks them in English, then there is not a regression in 3.16. Which is a good thing. As for what to do....

As you know, we cannot add new strings to the stable branch without breaking string freeze. I really, really, really do not want to ask for string freeze break. So the choices are to leave 3.16 as-is which means your black medium small square will be spoken in English, or to remove the code falling back on those names which means your black medium small square will not be spoken at all. Which do you think is better? I'm honestly not sure. But I'm leaning towards leaving them in, for two reasons:

1) Knowing there's something there is better than having no idea it's there -- especially if you don't use braille.

2) Keep in mind the reason you're experiencing this in the first place: There are tons and tons of symbols. A large number of them users will never, ever come across. So asking our translators to translate them all "just in case" seems like a bad idea. So how are we to know which ones users will come across in normal use? Well, from what you've reported here, we apparently can find out when users complain that a symbol is being spoken in English, rather than their language.

You reported the problem with bullets. So I think we should mark the bullets you encountered for translation. You listed other symbols based on testing a range, which I appreciate. But have you encountered those characters in documents, or strictly from testing? If the former, then I say let's mark them too; if the latter, I'm less certain. As both a user and a translator, what do you think we should do?
Comment 13 Joanmarie Diggs (IRC: joanie) 2015-04-16 13:05:27 UTC
> Has your fix for the bullet symbol already been committed to the gnome-3-16
> branch?
> I think the following commit resolves the issue with the 25e6 character:
> commit 517bcccbe9d5f3a4d05f5ba7b985589954594821
> Author: Joanmarie Diggs <jdiggs@igalia.com>
> Date:   Mon Mar 23 14:38:57 2015 -0400

Yes, that was committed and included in the 3.16.1 release I did yesterday.
Comment 14 Hammer Attila 2015-04-16 13:44:29 UTC
What you wrote is a difficult question.
How often do Orca users see and want to hear this symbol range (25e1 to 25ff) in everyday work or education? I never saw symbols of this kind before Orca began speaking math symbols. Once I had found the Unicode value of the black medium small square, I simply tried the preceding characters in the range to determine whether more symbols are affected by this issue.
For example, in the 25eX range I think only the 25e6 symbol (the bullet symbol) is useful, and Orca already speaks it correctly translated, as the previous Orca version did.
I cannot see visually which characters or mathematical shapes the symbols I collected in the attached document represent.

Regarding the black square and black medium small square symbols:
If these symbols are left in English, they will be heard very often, because in normal web browsing with Firefox bulleted lists are frequently marked with the black medium small square symbol.
In a non-English environment, hearing this English symbol name is very disturbing when reading ordinary articles.

I fully agree with you that we should not ask for a string freeze break on the 3.16 branch for these symbols. If in your opinion this symbol range is not useful in education or everyday tasks, please disable speaking of this range in the gnome-3-16 and master branch versions of Orca, if you would rather not mark these symbols for translation in the master branch.
Of course the 25e6 character is an exception: I think this character is the empty bullet symbol, and the ability to translate it needs to be kept.

Attila
Comment 15 Joanmarie Diggs (IRC: joanie) 2015-04-16 14:31:55 UTC
(In reply to Hammer Attila from comment #14)
 
> Regarding the black square and black medium small square symbols:
> If these symbols are left in English, they will be heard very often, because
> in normal web browsing with Firefox bulleted lists are frequently marked with
> the black medium small square symbol.
> In a non-English environment, hearing this English symbol name is very
> disturbing when reading ordinary articles.

Very disturbing is bad. Blek. Can you give me a list of the ones you are seeing constantly and which are being spoken in English because they are not marked for translation?

I'm not 100% opposed to asking for a freeze break; just 97.5%. ;) 

If you can provide me with a short list, we can mark them for translation in master and then ask the i18n team what they think wrt the 3.16 branch. If you agree, mind providing me with a patch for master for just the disturbing ones that you find in Firefox?
Comment 16 Hammer Attila 2015-04-17 05:01:29 UTC
Hi Joanie,

In Firefox I very often see the black medium small square symbol in bulleted lists. The other symbols that I collected yesterday for testing purposes I never see.
I looked at and tested a few different list type examples from the following page:
http://www.w3schools.com/html/html_lists.asp
All the tested lists work correctly with the Orca master and gnome-3-16 branch versions, except lists with the following style:
 <ul style="list-style-type:square">
  <li>Coffee</li>
  <li>Tea</li>
  <li>Milk</li>
</ul>
With this list style I hear the "black medium small square" symbol text before each list item.

Should I mark the whole character range I sent yesterday for translation in the proper Orca files on the master branch, or add only the black medium small square symbol?

Attila
Comment 17 Hammer Attila 2015-04-17 05:04:12 UTC
Created attachment 301774
Testing example HTML document

This HTML file contains examples of various list types. Only for the square style lists does Orca fail to speak the bullet symbol localized.

Attila
Comment 18 Hammer Attila 2015-04-17 08:09:36 UTC
Created attachment 301795
Propose patch

Joanie, please review this patch. It works well for me with both the Orca master and gnome-3-16 branch versions.
To be on the safe side, I added the missing math symbols in the u+25e0 to u+25ff range.
I will attach an example document that collects these symbols. When I use my locally updated Hungarian translation, which contains translations for these new symbols, I hear all of them correctly in Hungarian.
The previously attached HTML list example document also works well in Firefox.
If not all of the symbols I added in this patch are needed, feel free to cut the code that adds the unneeded ones.
If I have identified this symbol range correctly, it represents geometric shapes in math, so it may be important. I do not know what these characters look like on screen.
There is one thing I don't understand:
How does Orca determine the original English name of these symbols if they have not been added to the src/orca/mathsymbols.py file?
For example, how does Orca know to send the text "white square with upper left quadrant" for the 25f0 Unicode character if that text is in neither src/orca/chnames.py nor src/orca/mathsymbols.py?
I found a very good wiki page that describes a lot of math symbols:
http://en.wikipedia.org/wiki/Mathematical_operators_and_symbols_in_Unicode
This wiki page may list other useful symbols that need to be marked for translation. For example, geometric shapes are important in school education.
For now I have handled only the 25ex to 25fx range.
Did you already know this wiki page?

Attila
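
On the question of where the English names come from: every character has an official name in the Unicode character database, and Python exposes it through the standard `unicodedata` module. It is an assumption here, suggested by the fallbackOnUnicodeData setting named later in this thread, that Orca's fallback obtains the names this way. The helper name `english_fallback_name` is hypothetical.

```python
import unicodedata


def english_fallback_name(ch):
    """Return the lowercased official Unicode name, or "" if there is none."""
    return unicodedata.name(ch, "").lower()


# Names for the 25f0 to 25ff part of the Geometric Shapes block.
names = {
    f"U+{cp:04X}": english_fallback_name(chr(cp))
    for cp in range(0x25F0, 0x2600)
}

for code, name in names.items():
    print(code, name)
```

Because these names ship with Python's Unicode data, no entry in chnames.py or mathsymbols.py is needed to produce them, and for the same reason they are never passed through gettext.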
Comment 19 Hammer Attila 2015-04-17 08:11:25 UTC
Created attachment 301796
Corrected example geometry simbols collection document
Comment 20 Joanmarie Diggs (IRC: joanie) 2015-04-17 11:29:50 UTC
Hi Attila.

Yes, I am aware of that page. And because I want to try to minimize the amount of work we ask our translators to do, I deliberately did not mark all of those symbols for translation. What I did instead is add the core mathematical symbols along with the programmatic solution we have been talking about.

Because I don't want to mark everything in that list for translation, and only mark the things that actual users will actually encounter, I suggested the following in comment 15:

> Very disturbing is bad. Blek. Can you give me a list of the ones you are
> seeing constantly and which are being spoken in English because they are not
> marked for translation?

Are you seeing all the symbols in your patch constantly and being disturbed by them?

I also said:

> If you can provide me with a short list, we can mark them for translation in
> master and then ask the i18n team what they think wrt the 3.16 branch. If
> you agree, mind providing me with a patch for master for just the disturbing
> ones that you find in Firefox?

Even if you think that all of the symbols in your patch must be marked for translation, I am not going to request freeze break for those and you will still be disturbed by square bullets in Firefox. If you would like me to request freeze break so that you are not disturbed, I need a patch with just those disturbing symbols. <smiles>

Make sense?
Comment 21 Hammer Attila 2015-04-17 12:06:07 UTC
Hi Joanie,

In Firefox I found only the black medium small square symbol (u+25fe). It disturbs normal web page reading only in a non-English environment, when the page author uses the following list style markup:
<ul style="list-style-type:square">
<li>Coffee</li>
<li>milk</li>
</ul>

So the gnome-3-16 branch only needs the following translation markup code added to the src/orca/mathsymbols.py file:
# Translators: this is the spoken representation for the character '◾' (U+25fe)
_shapes['\u25fe'] = _("black medium small square")
Won't the i18n team be angry if we ask for a string freeze break for a single translation message? :-(:-(

Please wait a little; I am making a short patch that contains only this symbol.
Also, do the geometry symbols that my patch from this morning marked for translation not need to be added to the master branch? I don't know how often these symbols occur in, for example, a secondary school or university mathematics book, but if there is a good chance of that, they should be marked for translation in the master branch too.

Attila
Comment 22 Hammer Attila 2015-04-17 12:18:27 UTC
Created attachment 301822
Shorter patch with marking translation only the u+25fe unicode character

Joanie, I am attaching the shorter patch.
This symbol needs to be added to the gnome-3-16 branch and the master branch, if the other geometry symbols I collected this morning do not need to be added to the translatable messages.

Attila
Comment 23 Joanmarie Diggs (IRC: joanie) 2015-04-17 12:43:27 UTC
(In reply to Hammer Attila from comment #21)
> Won't the i18n team be angry if we ask for a string freeze break for a
> single translation message? :-(:-(

Why would they be angry? Asking them to translate one extra string -- and do so between now and 3.16.2 -- is better than asking for them to translate a bunch of strings between now and 3.16.2. More strings is more work. And string freeze means "no more strings". One string is closer to no strings than a bunch of strings is. <grins>
Comment 24 Joanmarie Diggs (IRC: joanie) 2015-04-17 13:23:12 UTC
Comment on attachment 301822
Shorter patch with marking translation only the u+25fe unicode character

Thanks!! I have committed this to master and requested freeze break:
https://mail.gnome.org/archives/gnome-i18n/2015-April/msg00056.html
Comment 25 Hammer Attila 2015-04-19 10:06:17 UTC
Hi Joanie,

I think you will not be happy when you read this comment:
Unfortunately, I have now found another symbol that is not marked for translation, while searching for a YouTube video with the built-in Google search engine in Firefox.
Test case:
1. Launch Firefox.
2. Press CTRL+K and type the search term dog+video+youtube. After typing the search term, press ENTER.
Every video search result contains the following symbol, which the Orca master and gnome-3-16 branch versions currently speak in English:
▶ (Unicode symbol u+25b6)
Usually this symbol is presented before the video duration in a search result. An example string from a search result:
"▶ 3:56"

What do we do now? Do we add only this symbol to master and the gnome-3-16 branch once the string freeze break is approved, or do we add the popular geometry symbols that are currently missing?
I don't know how many symbols needing translation we might find during normal web browsing, but hunting for these symbols is very difficult. :-(:-(
If I had not needed to search for a video on YouTube just now, we would not have found this test case for the u+25b6 symbol.

Attila
Comment 26 Joanmarie Diggs (IRC: joanie) 2015-04-19 17:00:28 UTC
I have just committed a change to both master and the gnome-3-16 branch changing the default value of fallbackOnUnicodeData to false. This should stop Orca from presenting symbols in English, but will still allow users who would prefer to hear English than absolutely nothing to accomplish this via their orca-customizations.py file. Problem hopefully solved. Please confirm that you are no longer having this problem. Thanks.
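
The effect of turning the default off can be sketched as follows. This is a self-contained illustration, not Orca's implementation: `Settings`, `settings`, `LOCALIZED`, and `spoken_name` are hypothetical names, and the flag only mirrors the fallbackOnUnicodeData setting described above.

```python
import unicodedata


class Settings:
    # Hypothetical mirror of fallbackOnUnicodeData; off by default.
    fallback_on_unicode_data = False


settings = Settings()

# Hypothetical localized table (value taken from the hu.po excerpt earlier).
LOCALIZED = {"\u25e6": "üres listajel"}


def spoken_name(ch):
    """Return the localized name, the English fallback if enabled, else ""."""
    if ch in LOCALIZED:
        return LOCALIZED[ch]
    if settings.fallback_on_unicode_data:
        return unicodedata.name(ch, "").lower()
    return ""  # nothing is spoken for this symbol


print(repr(spoken_name("\u25b6")))  # silent by default
settings.fallback_on_unicode_data = True
print(repr(spoken_name("\u25b6")))  # English Unicode name once enabled
settings.fallback_on_unicode_data = False
```

In a real setup the switch would be flipped from orca-customizations.py, presumably by assigning to the fallbackOnUnicodeData setting; the exact module path is not given in this thread.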
Comment 27 Hammer Attila 2015-04-20 05:20:49 UTC
Joanie, this issue is completely solved. I repeated the YouTube video search test case after recompiling, reinstalling, and restarting the Orca master branch version.
Now I no longer hear the u+25b6 symbol before the video duration, so this issue is hidden from average users in the master and gnome-3-16 branch versions of Orca.
What is the next step? Do we close this bug as resolved, fixed and open a new report about popular geometry symbols, concentrating only on the master branch?
The advantage is that the master branch has no string freeze now, and we have more time to collect these symbols.
If in your opinion any work is needed on the popular missing math geometry symbols, we need to add the common ones that are not currently marked for translation.

Attila
Comment 28 Joanmarie Diggs (IRC: joanie) 2015-04-20 14:33:40 UTC
> Now I no longer hear the u+25b6 symbol before the video duration, so this
> issue is hidden from average users in the master and gnome-3-16 branch
> versions of Orca.

Yay! Thanks for testing.

> What is the next step? Do we close this bug as resolved, fixed and open a
> new report about popular geometry symbols, concentrating only on the master
> branch?

To be honest, the way I'm feeling currently is that we're done with this issue for now. The current situation is that your experience on web pages is, I hope, no longer being disturbed. There is already code in place whereby someone reading technical documents can enable the fallback code with a single line in their orca-customizations.py. So let's let users who read technical content read that content. If they say "Orca is silent for this symbol," they can enable the fallback code as a temporary fix, and we can mark those strings for translation.

If you agree that this is a reasonable way forward, please close this bug as fixed. And thank you for your feedback and reports and ideas on this matter!
Comment 29 Hammer Attila 2015-04-20 14:38:55 UTC
Hi Joanie,

This is an absolutely good way of handling issues of this type.
I closed this bug as resolved, fixed.

Attila