GNOME Bugzilla – Bug 347357
DECtalk driver should convert strings to "code page 850"
Last modified: 2006-07-13 17:31:53 UTC
Traversing a line in the attached StarOffice document that contains a non-breaking space (the UTF-8 byte sequence "\302\240") generates COMM_FAILURE errors and multiple tracebacks (see attached output).
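For reference, the octal escape quoted above can be checked with a short Python snippet. This is only an illustration of the byte sequence, not code taken from Orca or gnome-speech; the variable names are made up for the example.

```python
import unicodedata

# "\302\240" is an octal escape for the two bytes 0xC2 0xA0.
raw = b"\302\240"

# Decoded as UTF-8, those two bytes form a single character.
text = raw.decode("utf-8")

# Look up the Unicode name of that character.
char_name = unicodedata.name(text)

print(len(raw), len(text), hex(ord(text)), char_name)
```

So the text handed to the speech driver holds one character (U+00A0) that occupies two bytes in UTF-8.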
Created attachment 68839 [details] StarOffice test document.
Created attachment 68840 [details] Orca debug output generated when traversing StarOffice document.
This may be related to Tomas Cerha's comment that Orca does all of its string management in UTF-8, when it should probably use Unicode strings internally and translate to the character set expected by each external component (e.g., gnome-speech and BrlTTY). Alternatively, the gnome-speech driver for DECtalk may need to do some sort of conversion itself. The driver appears to pass UTF-8 strings to the DECtalk engine, which may not be what the engine expects. I'll contact the DECtalk folks to see what's legal.
After discussion with the DECtalk engineer at Fonix, we've determined that DECtalk expects strings encoded in "code page 850". I'm reassigning this to gnome-speech, whose DECtalk driver should do the appropriate string conversion from UTF-8 to cp850. I've also checked that we're not losing the other portion of this bug (handling the COMM_FAILURE for speech). It is already logged as bug 319531 (http://bugzilla.gnome.org/show_bug.cgi?id=319531).
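To illustrate why the target encoding matters, here is a small Python sketch that encodes the non-breaking space under the candidate encodings using Python's built-in codecs. It assumes nothing about the DECtalk API; it only shows that the same character maps to different bytes, and that UTF-8 bytes read as a single-byte code page turn into two unrelated characters.

```python
NBSP = "\u00a0"  # the non-breaking space from the test document

# The same character under three encodings.
utf8_bytes = NBSP.encode("utf-8")         # two bytes
cp850_bytes = NBSP.encode("cp850")        # one byte
latin1_bytes = NBSP.encode("iso8859-1")   # one byte, different value

# What a cp850 consumer would see if handed the raw UTF-8 bytes:
# each byte is interpreted as its own character.
misread = utf8_bytes.decode("cp850")

print(utf8_bytes, cp850_bytes, latin1_bytes, len(misread))
```

An engine expecting a single-byte encoding therefore receives two unexpected characters for every non-breaking space unless the driver converts first.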
Created attachment 68880 [details] [review] Patch to convert text to ISO8859-1 before sending to DECtalk engine
After testing and more discussion with Fonix, we're now more certain that the encoding expected by DECtalk is ISO8859-1 and not code page 850. This patch provides the conversion to ISO8859-1.
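The kind of conversion described here can be sketched in Python as follows. This is not the attached patch; it is a minimal illustration, and the function name `utf8_to_latin1` and the choice to substitute "?" for characters that ISO8859-1 cannot represent are assumptions made for the example.

```python
def utf8_to_latin1(data: bytes, errors: str = "replace") -> bytes:
    """Re-encode UTF-8 bytes as ISO8859-1 (hypothetical helper).

    Characters with no ISO8859-1 equivalent are handled according to
    `errors`; the default "replace" substitutes "?" rather than raising.
    """
    text = data.decode("utf-8")
    return text.encode("iso8859-1", errors=errors)


# The non-breaking space from the test document becomes the single byte 0xA0.
converted = utf8_to_latin1(b"a\302\240b")

# The euro sign is outside ISO8859-1, so it is replaced.
fallback = utf8_to_latin1("\u20ac".encode("utf-8"))

print(converted, fallback)
```

Replacing unrepresentable characters keeps the speech call from failing on arbitrary document text, at the cost of dropping those characters; a real driver might prefer a different policy.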
Fixed in the development version. The fix will be available in the next major release. Thank you for your bug report.