Bug 741473 - Orca does not read correctly some Unicode mathematical characters
Status: RESOLVED FIXED
Product: orca
Classification: Applications
Component: speech
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Joanmarie Diggs (IRC: joanie)
QA Contact: Orca Maintainers
Reported: 2014-12-13 10:13 UTC by Frédéric Wang
Modified: 2015-01-10 15:52 UTC

Description Frédéric Wang 2014-12-13 10:13:23 UTC
I was just testing SeaMonkey's new symbol panel with Orca, and many mathematical characters are not read correctly (the Unicode code point is read instead of a name):

https://bugzilla.mozilla.org/attachment.cgi?id=8522022&action=diff#a/editor/ui/dialogs/content/EdInsertMath.js_sec2

Below are more lists of math characters (that may overlap). If you can't see the glyphs on your system, try installing the STIX fonts.

MathML operator dictionary:
http://www.w3.org/TR/MathML3/appendixc.html#oper-dict.entries-table

Mathematical Alphanumeric Symbols (for mathvariant support):
https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols
https://en.wikipedia.org/wiki/Arabic_Mathematical_Alphabetic_Symbols

Unicode ranges for Scientific Documents:
http://www.w3.org/TR/xml-entity-names/#blocks

(Note: I don't know whether this missing support comes from Orca or espeak.)
Comment 1 Frédéric Wang 2014-12-15 11:03:07 UTC
I just opened a bug for espeak: https://sourceforge.net/p/espeak/bugs/121/
Comment 2 Joanmarie Diggs (IRC: joanie) 2015-01-09 05:50:54 UTC
I agree with Jonathan regarding the thousands of characters and at least the possibility that many of those are pretty obscure. But I also believe we need a MUCH better story wrt supporting math content. So far I've added the Mathematical Operators block to Orca, along with parsing of all the non-Arabic mathematical alphanumeric characters. [1]

With respect to the latter, the solution was to look up the equivalent letter-like symbol (espeak seems to know most of those and we can fall back on speech-dispatcher to present capitalization according to the user's preference) and use string substitution to combine the equivalent symbol with the style. This approach cuts down greatly on the number of strings we're asking GNOME's (almost 100% volunteer!!) translators to translate. And given the large number of mathematical operators I've just added to their plates, I think that's extremely important. Which brings me to:
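
A minimal sketch of that substitution idea, not the code from the commit: it assumes the style can be read off the Unicode character name and the plain letter recovered via NFKD normalization. STYLE_TEMPLATES and describe_math_alphanumeric are hypothetical names used only for illustration.

```python
import re
import unicodedata

# Hypothetical style templates: only these few strings would need translating,
# rather than one string per styled letter.
STYLE_TEMPLATES = {
    "BOLD": "bold %s",
    "ITALIC": "italic %s",
    "DOUBLE-STRUCK": "double-struck %s",
}

# Matches names such as "MATHEMATICAL BOLD SCRIPT CAPITAL A".
_NAME_RE = re.compile(r"^MATHEMATICAL (?P<style>.+?) (?:CAPITAL|SMALL|DIGIT) ")


def describe_math_alphanumeric(char):
    """Return e.g. 'bold A' for U+1D400, or None if char is not a styled letter/digit."""
    match = _NAME_RE.match(unicodedata.name(char, ""))
    if match is None:
        return None
    style = match.group("style")
    # Assumption: NFKD maps the styled character back to its plain equivalent.
    base = unicodedata.normalize("NFKD", char)
    template = STYLE_TEMPLATES.get(style, style.lower() + " %s")
    return template % base
```

The plain letter is then left to the synthesizer, so capitalization is still announced however the user has configured it.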

Even with all of these symbols added, there are still a TON more that are in neither Orca nor eSpeak. How important are ALL of those arrows? How important are the supplemental mathematical operators? What about the Arabic alphabetic symbols? Etc. Given the volunteer nature of the translation teams in GNOME, if you could identify which additional symbols are MUST-haves NOW because users reading mathematical content are likely to encounter them, I'm willing to add those (if Jonathan won't). After that, I suggest we close this bug and add further symbols as Orca users identify specific content they are unable to read.

Thoughts?

[1] https://git.gnome.org/browse/orca/commit/?id=d71fabab
Comment 3 Frédéric Wang 2015-01-09 07:57:01 UTC
The notion of "important" characters really depends on the math knowledge and field of study of the reader... I guess for math-aware people it's a shame to hear only obscure Unicode hexadecimal values for their favorite math symbols, but it would be fine to just have the English Unicode description read (these people can certainly understand English, and similar names are used for the LaTeX commands). This would save translation effort, and the English Unicode descriptions can be extracted automatically from http://www.w3.org/2003/entities/2007xml/unicode.xml
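
As a rough illustration of automatic extraction, the sketch below builds a character-to-description table for a code point range. It swaps in Python's standard unicodedata module for the unicode.xml file mentioned above, which is an assumption made to keep the example self-contained; describe_range is a hypothetical helper name.

```python
import unicodedata


def describe_range(first, last):
    """Map each named character in [first, last] to its lowercased Unicode name."""
    table = {}
    for code_point in range(first, last + 1):
        char = chr(code_point)
        name = unicodedata.name(char, None)  # None for unnamed code points
        if name is not None:
            table[char] = name.lower()
    return table


# The Mathematical Operators block spans U+2200 to U+22FF.
math_operators = describe_range(0x2200, 0x22FF)
print(math_operators["\u2200"])  # for all
```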

Moreover, I believe a first indication of which characters are "important" (and worth translating) is the set of symbols used on Wikipedia:

http://math-preview.wmflabs.org/wiki/Help:Displaying_a_formula

(use a Gecko browser so that the MathML is displayed by default)
Comment 4 Joanmarie Diggs (IRC: joanie) 2015-01-10 04:28:42 UTC
Ok, thanks for the feedback. Based on it, I have just committed [1] the following:

* Add ability to fall back on the name from Unicode data
* Add fallbacks for arrows, shapes, and additional operators
* Add ability for the user to add and override symbol names

I've not YET added any additional strings marked for translation. There are quite a few "important" characters there that were not included in the large number of "important" characters I added yesterday for localization. Between the fallback and the ability to add/override symbols, end users who read technical content should be able to identify what strings need to be added and report them to us (using orca-customizations.py as an interim fix in the meantime).
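
The lookup order this implies can be sketched as follows. This is an illustration under assumptions, not Orca's API: BUILTIN_NAMES, set_symbol_name, and symbol_name are hypothetical names, and the actual calls available in orca-customizations.py are defined by the commit referenced below.

```python
import unicodedata

# Hypothetical stand-in for the localized names shipped with the screen reader.
BUILTIN_NAMES = {"\u2200": "for all"}

# Hypothetical per-user table, e.g. filled in from a customization file.
user_names = {}


def set_symbol_name(char, name):
    """Add or override the spoken name for a single character."""
    user_names[char] = name


def symbol_name(char):
    """Resolve a name: user override, then built-in, then Unicode data."""
    if char in user_names:
        return user_names[char]
    if char in BUILTIN_NAMES:
        return BUILTIN_NAMES[char]
    name = unicodedata.name(char, None)
    return name.lower() if name is not None else None


print(symbol_name("\u2192"))  # rightwards arrow (Unicode data fallback)
set_symbol_name("\u2192", "maps to")
print(symbol_name("\u2192"))  # maps to (user override wins)
```

Checking the user table first means an override takes effect even for characters that already have a built-in or fallback name.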

Anything else you feel I should do to address the specific needs reported in this bug?

[1] https://git.gnome.org/browse/orca/commit/?id=5962a7ef
Comment 5 Frédéric Wang 2015-01-10 15:52:21 UTC
OK, that sounds good to me. Thanks.