Bug 321216 – gnome-speech driver for festival does not handle UTF-8 text




Summary:	gnome-speech driver for festival does not handle UTF-8 text


Status:	RESOLVED DUPLICATE of bug 141516

Product:	gnome-speech
Classification:	Deprecated
Component:	drivers
Version:	0.3.x
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Willie Walker
QA Contact:	GNOME Speech Maintainer(s)

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2005-11-11 11:30 UTC by Chaitanya Kamisetty
Modified:	2006-07-01 16:25 UTC

See Also:
GNOME target:	---
GNOME version:	2.9/2.10

Attachments
Patch created against CVS that fixes the bug (787 bytes, patch) 2005-11-11 11:33 UTC, Chaitanya Kamisetty	committed

Description Chaitanya Kamisetty 2005-11-11 11:30:50 UTC

Please describe the problem:
The festival synthesis driver in gnome-speech-0.3.8 explicitly sets the channel
encoding to ISO-8859-1. This causes a problem when the driver is used with
UTF-8 encoded text. I am trying to use the driver for Indian language text, which
is UTF-8 encoded, and am running into this problem.

Steps to reproduce:
I am using my own build of festival TTS which can speak out Telugu language text
represented in UTF-8. When I tried to use this festival with gnopernicus screen
reader and opened an application with the locale set to Telugu, gnopernicus
could not read out the tool tips etc. This turned out to be because of the
festival synthesis driver which was not passing the text messages correctly to
the festival speech synthesis server.

Actual results:
Telugu UTF-8 text is clipped off.

Expected results:
Text in encodings other than ISO-8859-1 should be passed to the festival server
correctly.
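
Editorial illustration (not part of the original report): a minimal sketch of why Telugu text cannot pass through a channel fixed to ISO-8859-1. ISO-8859-1 covers only code points U+0000 to U+00FF, while the Telugu block starts at U+0C00. The function names `utf8_decode` and `fits_latin1` are hypothetical helpers written for this sketch, not code from gnome-speech, and the decoder is simplified (it does not reject overlong forms or surrogates).

```c
#include <assert.h>
#include <stddef.h>

/* Decode one UTF-8 sequence starting at s and store the code point in *cp.
 * Returns the number of bytes consumed, or 0 on malformed input.
 * Simplified for illustration: overlong forms and surrogates are not rejected. */
size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    size_t len, i;

    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        *cp = s[0] & 0x1F;
        len = 2;
    } else if ((s[0] & 0xF0) == 0xE0) {
        *cp = s[0] & 0x0F;
        len = 3;
    } else if ((s[0] & 0xF8) == 0xF0) {
        *cp = s[0] & 0x07;
        len = 4;
    } else {
        return 0;
    }

    for (i = 1; i < len; i++) {
        /* A continuation byte must look like 10xxxxxx; this also stops
         * at a premature terminating NUL. */
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return len;
}

/* Returns 1 if every code point in the NUL-terminated UTF-8 string fits in
 * ISO-8859-1 (U+0000..U+00FF), 0 if some code point does not fit,
 * and -1 if the input is not well-formed UTF-8. */
int fits_latin1(const char *text)
{
    const unsigned char *p = (const unsigned char *)text;

    assert(text != NULL);
    while (*p != '\0') {
        unsigned long cp;
        size_t n = utf8_decode(p, &cp);

        if (n == 0)
            return -1;
        if (cp > 0xFF)
            return 0;
        p += n;
    }
    return 1;
}
```

Under these assumptions, a Latin-script string such as "café" fits in ISO-8859-1, whereas any string containing Telugu letters does not, so a conversion to ISO-8859-1 has no valid output for it.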

Does this happen every time?
Yes

Other information:

Comment 1 Chaitanya Kamisetty 2005-11-11 11:33:32 UTC

Created attachment 54628
Patch created against CVS that fixes the bug

The patch sets the channel encoding to UTF-8

Comment 2 Willie Walker 2006-05-14 14:49:36 UTC

Fixed in the development version. The fix will be available in the next major release. Thank you for your bug report.

Comment 3 bill.haneman 2006-06-28 18:03:17 UTC

This patch seems to have problems - the channel was being set explicitly to ISO-8859-1 because most Festival non-English voices seem to use that encoding.  The new patch regresses support for those voices.

Comment 4 Sunil Mohan Adapa 2006-06-29 06:20:38 UTC

Now, what about Telugu (festival-te.sf.net) and other Indian languages?

UTF-8, not ISO-8859-1, should be treated as the neutral default. Non-English Festival voices that assume otherwise should be treated as broken. If fixing those voices is not an easy task, then perhaps exceptions for the broken non-English voices, keyed on the selected voice, should be added as hacks to gnome-speech.

Would a patch along these lines be accepted?

Comment 5 bill.haneman 2006-06-29 09:20:21 UTC

I don't agree that UTF-8 should be treated as "neutral".  The TTS engine we are using, festival, was written before UTF-8 was commonplace, and its expectations and voices must determine what we do here.

ISO-8859-1 is the most common encoding for festival voices.  You may think this is old-fashioned, but it is not something we control, and thus ISO-8859-1 makes a reasonable default.  We do of course need to provide some ability to use the appropriate encoding for a given voice.  I don't think that festival allows us to determine that, so we will have to provide some configuration table.

Bill
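
Editorial illustration (not part of the original thread): a minimal sketch of the kind of per-voice configuration table discussed above, with ISO-8859-1 as the default. The struct, the function `encoding_for_voice`, and the voice name `example_telugu_voice` are all hypothetical and invented for this sketch; they are not taken from gnome-speech or from any Festival voice package.

```c
#include <stddef.h>
#include <string.h>

/* Default used for any voice that has no entry in the table. */
#define DEFAULT_ENCODING "ISO-8859-1"

/* One row of the hypothetical configuration table. */
struct voice_encoding {
    const char *voice;     /* Festival voice name (illustrative) */
    const char *encoding;  /* charset the voice expects */
};

/* Exceptions to the default. The entry below is a made-up placeholder. */
static const struct voice_encoding voice_encodings[] = {
    { "example_telugu_voice", "UTF-8" },
};

/* Look up the encoding for a voice, falling back to the default when the
 * voice is NULL or not listed. */
const char *encoding_for_voice(const char *voice)
{
    size_t i;
    size_t count = sizeof voice_encodings / sizeof voice_encodings[0];

    if (voice == NULL)
        return DEFAULT_ENCODING;

    for (i = 0; i < count; i++) {
        if (strcmp(voice, voice_encodings[i].voice) == 0)
            return voice_encodings[i].encoding;
    }
    return DEFAULT_ENCODING;
}
```

The returned string could then be handed to whatever call sets the channel encoding when a voice is selected. Keeping the default at ISO-8859-1 preserves the behavior for existing voices, while listed voices opt in to UTF-8.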

Comment 6 Willie Walker 2006-07-01 16:25:26 UTC

Historical discussion related to this bug has taken place in bug 141516, which was really two separate bugs.  But it does cover encoding issues as well as autodetection of voices.  So... I'm going to mark this bug as a duplicate of 141516, reopen 141516, and move the discussion there.

*** This bug has been marked as a duplicate of 141516 ***