GNOME Bugzilla – Bug 746379
Orca master and the gnome-3-16 branch do not correctly speak some already localized symbols that come from the src/orca/chnames.py file
Last modified: 2015-04-20 14:38:55 UTC
Created attachment 299682
Debug file that may show why this issue is happening
Dear Joanie, With Orca master and the gnome-3-16 branch, Orca speaks some strings in English even though they are already localized in Hungarian. These strings come from the src/orca/chnames.py file. They are usually various bullet types (for example, the white bullet).
Short testcase:
1. Launch Gedit.
2. Press CTRL+SHIFT+U, type 25e6 and press the space key.
3. Press ENTER, then press the up arrow key.
Expected result: When Orca needs to speak this symbol, it should speak the translated message if a translation is available. For example, in Hungarian this symbol is translated as "üres listajel". This works correctly with Orca 3.14.3, which produces the expected result.
Actual result: I hear the English message for this symbol. Other symbols may be affected too (black square, white square, etc.).
I am attaching a debug.out file. Attila
I think the problem is due to the fact that math symbols were added to Orca and a few characters fall into the category of being both bullets and math symbols. Have you done the translation of the math symbols yet?
Unfortunately, the translation is not fully ready yet. I still need to review 98 fuzzy translations and 38 untranslated messages. Attila
Hi Joanie, There is one thing I don't understand, for example with the white bullet string (U+25e6). The Orca Hungarian translation file contains only one string for this symbol. Here are the relevant parts of the hu.po file:
#. Translators: this is the spoken word for the character '○' (U+25cb)
#. It can be used as a bullet in a list.
#.
#: ../src/orca/chnames.py:695
msgid "white circle"
msgstr "üres kör"
#. Translators: this is the spoken word for the character '●' (U+25cf)
#. It can be used as a bullet in a list.
#.
#: ../src/orca/chnames.py:700
msgid "black circle"
msgstr "fekete kör"
#. Translators: this is the spoken word for the character '◦' (U+25e6)
#.
#: ../src/orca/chnames.py:704
msgid "white bullet"
msgstr "üres listajel"
The math symbols work did not add a new translatable message for the white bullet; only this one message contains the string. Yet when Orca finds this symbol on a web page, it sends the English message to the speech synthesizer. Does Orca use the new math symbols code in Firefox? Attila
Aha. Ok, I'm in a meeting now, but I think I know what the problem might be. I'll look soon. Thanks!
I just pushed a change to master that should solve this. Please verify and if it's all good, I'll commit it to the stable branch. Thanks! (And sorry!!)
Hi Joanie, If I type the 25e1 symbol in Gedit, I correctly hear the Hungarian translation for this list type. However, in Firefox I found a web page where Orca announces the bulleted list symbol as "black medium small square" (possibly this is a different bullet type).
Testcase:
1. Launch Firefox.
2. Open the following link: https://wiki.archlinux.org/index.php/locale
3. Go to the navigation heading, then press the I key a few times, or move between lines with the arrow keys.
In both layout mode and non-layout mode I hear the "black medium small square" text with the Hungarian locale. The Hungarian translation for Orca 3.16 has only 19 untranslated messages left, and I do not see this symbol's text in the translation file. The translation is not yet committed to the master branch, because I still need to work out translations for the remaining messages. Where does this symbol name come from? Attila
Created attachment 300178
Debug file related to the "black medium small square" text
I am sending a debug.out file that contains the speaking of this symbol text. Attila
Hi Joanie, Do you have an idea about the remaining issues (black medium small square and other symbols of this kind)? Orca speaks the English symbol name if, for example, I paste the 25fe Unicode character into Gedit. Does this symbol text need to be marked for translation, or is it possible to fix this type of issue in both the master and gnome-3-16 branches? Attila
Is the symbol spoken in Hungarian if you use Orca 3.14?
No, Orca 3.14 does not speak symbols in the 25f1 to 25ff Unicode range at all. Some example symbols in the 25fx range that the master and GNOME 3.16 versions speak in English and that Orca 3.14 does not speak:
25f1: the character is ◱, Orca says "white square with lower left quadrant"
25f2: same as the previous character, only the end of the text changes to "lower right quadrant"
25f3: the character is ◳, Orca says "white square with upper right quadrant"
25f4: the character is ◴, Orca says "white circle with upper right quadrant"
...
25fa: the character is ◺, Orca says "lower left triangle"
25fb: the character is ◻, Orca says "white medium square"
25fc: the character is ◼, Orca says "black medium square"
25fd: the character is ◽, Orca says "white medium small square"
25fe: the character is ◾, Orca says "black medium small square", for example in Firefox; see my wiki page testcase
25ff: the character is ◿, Orca says "lower right triangle"
Are these symbols useful, for example in education or math? I think these symbols are not marked for translation in the gnome-3-16 and master branches. Attila
Created attachment 301728
Example symbols file
Joanie, this file contains Unicode characters that the Orca 3.16 and master branch versions do not speak localized. If I remember correctly, these symbols are not marked for translation. If speaking these symbols is not useful for education or math, please disable speaking of this character range. If these symbols need to be marked for translation, unfortunately that is only possible in the master branch. Have you already committed your fix for the bulleted list symbol to the gnome-3-16 branch? I think the following commit resolves the 25e6 character issue:
commit 517bcccbe9d5f3a4d05f5ba7b985589954594821
Author: Joanmarie Diggs <jdiggs@igalia.com>
Date: Mon Mar 23 14:38:57 2015 -0400
Move some technical-content symbols to mathsymbols.py and remove duplicates
Because there are far more symbols than we can ask translators to translate, we have fallback code so that we at least speak something for the symbols which we think are unlikely to be encountered, and likely to not be presented by espeak. This range-based code was stomping on a few symbols which were actually localized. Moving them to mathsymbols.py will stop the stomping. Also removed some duplicates.
Attila
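The "stomping" described in the commit message can be illustrated with a small sketch. It is not Orca's code: it assumes the English fallback names come from Python's standard unicodedata module and that the fallback covers the Geometric Shapes block (U+25A0 to U+25FF). The LOCALIZED table, FALLBACK_RANGE, and all function names are hypothetical. The actual fix moved symbols into mathsymbols.py; the second function below only shows why consulting localized names first avoids the problem.

```python
import unicodedata

# Hypothetical stand-in for the localized names built from chnames.py.
# The Hungarian string is the one quoted in this report for U+25E6.
LOCALIZED = {"\u25e6": "üres listajel"}

# Assumed fallback range: the Unicode Geometric Shapes block.
FALLBACK_RANGE = range(0x25A0, 0x2600)


def unicode_name(char):
    """Return the lower-cased Unicode name, or '' if the character has none."""
    return unicodedata.name(char, "").lower()


def name_range_first(char):
    """Problematic order: the range fallback hides the localized entry."""
    if ord(char) in FALLBACK_RANGE:
        return unicode_name(char)
    return LOCALIZED.get(char, "")


def name_table_first(char):
    """Localized names win; the range fallback is used only when none exists."""
    if char in LOCALIZED:
        return LOCALIZED[char]
    if ord(char) in FALLBACK_RANGE:
        return unicode_name(char)
    return ""


print(name_range_first("\u25e6"))  # English name, translation is stomped
print(name_table_first("\u25e6"))  # Hungarian translation
```

With the first ordering the white bullet is spoken in English even though a translation exists, which matches the behavior reported at the top of this bug.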
Thanks for the information! If in 3.14 Orca doesn't speak them at all, and in 3.16 and later Orca speaks them in English, then there is not a regression in 3.16. Which is a good thing. As for what to do.... As you know, we cannot add new strings to the stable branch without breaking string freeze. I really, really, really do not want to ask for string freeze break. So the choices are to leave 3.16 as-is which means your black medium small square will be spoken in English, or to remove the code falling back on those names which means your black medium small square will not be spoken at all. Which do you think is better? I'm honestly not sure. But I'm leaning towards leaving them in, for two reasons: 1) Knowing there's something there is better than having no idea it's there -- especially if you don't use braille. 2) Keep in mind the reason you're experiencing this in the first place: There are tons and tons of symbols. A large number of them users will never, ever come across. So asking our translators to translate them all "just in case" seems like a bad idea. So how are we to know which ones users will come across in normal use? Well, from what you've reported here, we apparently can find out when users complain that a symbol is being spoken in English, rather than their language. You reported the problem with bullets. So I think we should mark the bullets you encountered for translation. You listed other symbols based on testing a range, which I appreciate. But have you encountered those characters in documents, or strictly from testing? If the former, then I say let's mark them too; if the latter, I'm less certain. As both a user and a translator, what do you think we should do?
> Have you already committed your fix for the bulleted list symbol to the
> gnome-3-16 branch?
> I think the following commit resolves the 25e6 character issue:
> commit 517bcccbe9d5f3a4d05f5ba7b985589954594821
> Author: Joanmarie Diggs <jdiggs@igalia.com>
> Date: Mon Mar 23 14:38:57 2015 -0400
Yes, that was committed and included in the 3.16.1 release I did yesterday.
What you wrote is a difficult question. How often do Orca users see, and want to hear, this symbol range (25e1 to 25ff) in everyday work or education? I never saw symbols of this kind before Orca began speaking math symbols. When I found the Unicode value of the black medium small square symbol, I simply tried the preceding character range to determine whether more symbols are affected by this issue. For example, in the 25eX range I think only the 25e6 symbol is useful (the bulleted list symbol), and Orca already speaks it correctly translated, as the previous Orca version did. I cannot see what characters or mathematical graphics the symbols I collected in the attached document represent. Regarding the black square and black medium small square symbols: if these symbols are left in English, the problem shows up often, because during normal internet browsing with Firefox bulleted lists are very frequently marked with the black medium small square symbol. In a non-English environment, hearing this English symbol message is very disturbing when reading a normal article. I fully agree with you that we should not ask for a string freeze break for these symbols on the 3.16 branch. If, in your opinion, this symbol range is not useful in education or everyday tasks, please disable speaking of this range in the gnome-3-16 and master branches, if you do not want to mark these symbols for translation in master. Of course the 25e6 character is an exception: I think this character marks the empty bullet, and the ability to translate it needs to be kept. Attila
(In reply to Hammer Attila from comment #14)
> Regarding the black square and black medium small square symbols: if these
> symbols are left in English, during normal internet browsing with Firefox
> bulleted lists are very frequently marked with the black medium small
> square symbol.
> In a non-English environment, hearing this English symbol message is very
> disturbing when reading a normal article.
Very disturbing is bad. Blek. Can you give me a list of the ones you are seeing constantly and which are being spoken in English because they are not marked for translation? I'm not 100% opposed to asking for a freeze break; just 97.5%. ;) If you can provide me with a short list, we can mark them for translation in master and then ask the i18n team what they think wrt the 3.16 branch. If you agree, mind providing me with a patch for master for just the disturbing ones that you find in Firefox?
Hi Joanie, I very often see the black medium small square symbol in bulleted lists in Firefox; the other symbols I collected yesterday for testing purposes I never see. I briefly looked at and tested different list type examples from the following page: http://www.w3schools.com/html/html_lists.asp All tested lists work correctly with Orca master and the gnome-3-16 branch, except the following list style:
<ul style="list-style-type:square">
<li>Coffee</li>
<li>Tea</li>
<li>Milk</li>
</ul>
With this list style, I hear the "black medium small square" symbol text before each list item. Should I mark the whole character range I sent yesterday for translation in the proper Orca files on the master branch, or add only the black medium small square symbol? Attila
Created attachment 301774
Testing example HTML document
This HTML file contains various list type examples. Only for the square style lists does Orca not speak the bullet symbol localized. Attila
Created attachment 301795
Proposed patch
Joanie, please review this patch. It works well for me with both Orca master and the gnome-3-16 branch. To be on the safe side, I added the missing math symbols from the U+25e0 to U+25ff range. I will attach an example document containing these collected symbols. If I use my locally updated Hungarian translation, which contains translations for these newest symbols, I hear all symbols correctly with the Hungarian messages. The previously attached HTML list example document also works well in Firefox. If not all of the symbols I added in this patch are needed, feel free to cut the code that adds the unneeded ones. If I have identified this symbol range correctly, it represents geometric shapes in math, so it may be important. I do not know how these characters look on the screen. There is one thing I don't understand: how does Orca determine the original English name of these symbols if they are not in the src/orca/mathsymbols.py file? For example, how does Orca know that for the 25f0 Unicode character it needs to send the "white square with upper left quadrant" text, if this text is in neither src/orca/chnames.py nor src/orca/mathsymbols.py? I found a very good wiki page that describes a lot of math symbols: http://en.wikipedia.org/wiki/Mathematical_operators_and_symbols_in_Unicode There may be other useful symbols on this wiki page that need to be marked for translation; for example, geometric shapes are important for school education. For now I have handled only the 25ex to 25fx range. Did you already know this wiki page? Attila
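On the question of where the English names come from: the flag name fallbackOnUnicodeData, mentioned later in this report, suggests the Unicode character database. A minimal sketch under that assumption, using Python's standard unicodedata module; the helper name english_fallback_name is hypothetical and the lower-casing is a guess at how the name is prepared for speech, not Orca's confirmed behavior.

```python
import unicodedata


def english_fallback_name(char):
    """Return the lower-cased Unicode name of char, or '' if it has none."""
    return unicodedata.name(char, "").lower()


# Code points discussed in this report.
for code_point in (0x25E6, 0x25F0, 0x25FE):
    print(f"U+{code_point:04X}: {english_fallback_name(chr(code_point))}")
```

Because every assigned character already has an English name in the Unicode data, a fallback like this can speak something for symbols that appear in neither chnames.py nor mathsymbols.py, but only in English.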
Created attachment 301796
Corrected example geometry symbols collection document
Hi Attila. Yes, I am aware of that page. And because I want to try to minimize the amount of work we ask our translators to do, I deliberately did not mark all of those symbols for translation. What I did instead is add the core mathematical symbols along with the programmatic solution we have been talking about. Because I don't want to mark everything in that list for translation, and only mark the things that actual users will actually encounter, I suggested the following in comment 15:
> Very disturbing is bad. Blek. Can you give me a list of the ones you are
> seeing constantly and which are being spoken in English because they are not
> marked for translation?
Are you seeing all the symbols in your patch constantly and being disturbed by them? I also said:
> If you can provide me with a short list, we can mark them for translation in
> master and then ask the i18n team what they think wrt the 3.16 branch. If
> you agree, mind providing me with a patch for master for just the disturbing
> ones that you find in Firefox?
Even if you think that all of the symbols in your patch must be marked for translation, I am not going to request freeze break for those and you will still be disturbed by square bullets in Firefox. If you would like me to request freeze break so that you are not disturbed, I need a patch with just those disturbing symbols. <smiles> Make sense?
Hi Joanie, In Firefox I have found only the black medium small square symbol (U+25fe) to be disturbing. In a non-English environment it disrupts normal web page reading whenever the page author uses the following list style markup:
<ul style="list-style-type:square">
<li>Coffee</li>
<li>milk</li>
</ul>
So, for the gnome-3-16 branch only the following translation markup needs to be added to the src/orca/mathsymbols.py file:
# Translators: this is the spoken representation for the character '◾' (U+25fe)
_shapes['\u25fe'] = _("black medium small square")
Will the i18n team not be angry if we ask for a string freeze break for one translation message? :-(:-( Please wait a little; I am preparing a short patch containing only this symbol. Also, do the geometry symbols that my patch from this morning marked for translation not need to be added to the master branch? I don't know how often these symbols occur in, for example, a secondary school or university mathematics book, but if there is a good chance of that, they should be marked for translation in the master branch too. Attila
Created attachment 301822
Shorter patch marking only the U+25fe Unicode character for translation
Joanie, I am attaching the shorter patch. This symbol needs to be added to the gnome-3-16 branch and the master branch, if the other geometry symbols collected this morning do not need to be added to the translatable messages. Attila
(In reply to Hammer Attila from comment #21)
> Will the i18n team not be angry if we ask for a string freeze break for one
> translation message? :-(:-(
Why would they be angry? Asking them to translate one extra string -- and do so between now and 3.16.2 -- is better than asking for them to translate a bunch of strings between now and 3.16.2. More strings is more work. And string freeze means "no more strings". One string is closer to no strings than a bunch of strings is. <grins>
Comment on attachment 301822
Shorter patch marking only the U+25fe Unicode character for translation
Thanks!! I have committed this to master and requested freeze break: https://mail.gnome.org/archives/gnome-i18n/2015-April/msg00056.html
Hi Joanie, I think you will not be happy when you read this comment: unfortunately I have now found another symbol that is not marked for translation. I found it when searching for a YouTube video with Firefox's built-in Google search engine.
Testcase:
1. Launch Firefox.
2. Press CTRL+K, type the search term dog+video+youtube, and press ENTER.
The video search results contain the following symbol, which the Orca master and gnome-3-16 branch versions now speak in English: ▶ (U+25b6)
For me this symbol usually appears before the video duration in a search result. An example string from a search result: "▶ 3:56"
What do we do now? Do we add only this symbol to master and the gnome-3-16 branch once the string freeze break is approved, or do we add the popular geometry symbols that are currently missing? I don't know how many symbols needing translation we may find during normal internet browsing, but hunting for them is very difficult. :-(:-( If I had not needed to search for a video on YouTube just now, we would not have found this testcase for the U+25b6 symbol. Attila
I have just committed a change to both master and the gnome-3-16 branch changing the default value of fallbackOnUnicodeData to false. This should stop Orca from presenting symbols in English, but will still allow users who would prefer to hear English than absolutely nothing to accomplish this via their orca-customizations.py file. Problem hopefully solved. Please confirm that you are no longer having this problem. Thanks.
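The effect of the new default can be sketched as follows. This is an illustration only, not Orca's implementation: the TRANSLATED table and the spoken_name function are hypothetical, and using Python's unicodedata module as the fallback source is an assumption. Where the real fallbackOnUnicodeData flag lives inside Orca is not stated in this report, so the sketch passes it as a plain parameter.

```python
import unicodedata

# Hypothetical table of symbols that are marked for translation.
# The Hungarian string is the one quoted in this report for U+25E6.
TRANSLATED = {"\u25e6": "üres listajel"}


def spoken_name(char, fallback_on_unicode_data=False):
    """Return what would be spoken for char under the given flag value."""
    if char in TRANSLATED:
        return TRANSLATED[char]
    if fallback_on_unicode_data:
        # English-only name taken from the Unicode character database.
        return unicodedata.name(char, "").lower()
    # New default: stay silent rather than speak an untranslated name.
    return ""


print(repr(spoken_name("\u25b6")))                                 # default: silent
print(repr(spoken_name("\u25b6", fallback_on_unicode_data=True)))  # English name
print(repr(spoken_name("\u25e6")))                                 # translated either way
```

Translated symbols are unaffected by the flag; only symbols without a translation change from English speech to silence under the default.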
Joanie, this issue is absolutely solved. I repeated the YouTube video search testcase after I recompiled, reinstalled and restarted the Orca master branch version. Now I do not hear the U+25b6 symbol before the video time value, so this issue is hidden from average users in the master and gnome-3-16 branch Orca versions. What is the next step? Do we close this bug as resolved, fixed and open a new report about the popular geometry symbols, concentrating only on the master branch? This has an advantage: the master branch is not in string freeze now, and we have more time to collect these symbols. If, in your opinion, any work is needed on the popular missing math geometry symbols, the common ones that are not yet marked for translation need to be added. Attila
> Now I do not hear the U+25b6 symbol before the video time value, so this
> issue is hidden from average users in the master and gnome-3-16 branch
> Orca versions.
Yay! Thanks for testing.
> What is the next step? Do we close this bug as resolved, fixed and open a
> new report about the popular geometry symbols, concentrating only on the
> master branch?
To be honest, the way I'm feeling currently is that we're done with this issue for now. The current situation is that your experience on web pages is, I hope, no longer being disturbed. There is already code in place whereby someone reading technical documents can enable the fallback code with a single line in their orca-customizations.py. So let's let users who read technical content read that content. If they say "Orca is silent for this symbol," they can enable the fallback code as a temporary fix, and we can mark those strings for translation. If you agree that this is a reasonable way forward, please close this bug as fixed. And thank you for your feedback and reports and ideas on this matter!
Hi Joanie, This way of handling this type of issue is absolutely fine. I have closed this bug as resolved, fixed. Attila