GNOME Bugzilla – Bug 113764
Support a non-ASCII dictionary
Last modified: 2004-12-22 21:47:04 UTC
I recently spoke with Gunnar Schmidt and Olaf Jan Schmidt of KDE accessibility, and they told me that they are using the GOK dictionary file for word completion within KDE. However, they did raise one issue: currently the dictionary file does not specify its encoding. It would probably be good to give the dictionary file format and internationalization some thought. Which encoding are we currently (implicitly) assuming the file is in? How should we handle different character sets? Should we support only UTF-8, or extend the dictionary file to include a declaration of its encoding?
I think we should use UTF-8; but there'd be no harm in putting in a header of some sort to make this explicit.
I meant to say: we are currently assuming UTF-8, I think; if not, I think we should, since it's ASCII-compatible.
In KMouth, we also use UTF-8, so if you use that as well, the format is identical.
I think that all that's required here is to make sure that we are using UTF-8-compatible functions for reading and writing the GOK dictionaries. glib provides plenty of them, so this should be an easy fix; we just need to revise gok/word-complete.c to use UTF-8 routines for parsing the dictionary, and to use appropriate editors for revising it.
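In glib code the natural fix is simply to call g_utf8_validate() on each dictionary line after reading it. As a hedged illustration of what that check actually does, here is a minimal standalone validator with no glib dependency (the real GOK code should call the glib routine, not hand-roll this):

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal standalone UTF-8 validity check, illustrating the kind of test
 * that glib's g_utf8_validate() performs. Rejects truncated sequences,
 * overlong encodings, UTF-16 surrogates, and code points above U+10FFFF.
 * Sketch only; GOK itself should use the glib routine. */
static bool utf8_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t n;        /* number of continuation bytes expected */
        unsigned int cp; /* decoded code point */

        if (c < 0x80)      { i++; continue; }          /* plain ASCII */
        else if (c < 0xC2) return false;               /* stray continuation or overlong lead */
        else if (c < 0xE0) { n = 1; cp = c & 0x1F; }   /* 2-byte sequence */
        else if (c < 0xF0) { n = 2; cp = c & 0x0F; }   /* 3-byte sequence */
        else if (c < 0xF5) { n = 3; cp = c & 0x07; }   /* 4-byte sequence */
        else               return false;               /* would exceed U+10FFFF */

        if (i + n >= len) return false;                /* truncated at end of buffer */
        for (size_t k = 1; k <= n; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80) return false;     /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (n == 2 && cp < 0x800) return false;                    /* overlong 3-byte */
        if (n == 3 && (cp < 0x10000 || cp > 0x10FFFF)) return false; /* overlong / out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;            /* UTF-16 surrogate */
        i += n + 1;
    }
    return true;
}
```

A dictionary line that is raw ISO-8859-1 (e.g. a bare 0xE9 byte for "é") fails this check, which is exactly how a loader could detect that a legacy dictionary needs conversion before use.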
This will require some revision (and testing!) of gok/word-complete.c. I think gok/gok-word-complete.c is already UTF-8-ready, but I'm not 100% sure. Let's try to do this before the 2.4 string freeze, so that the translators might be able to help us here.
Bumping up severity, since this blocks the usefulness of localization.
adding accessibility keyword
A fix for this is included in the work on bug 107200. We can now import UTF-8 dictionaries (as well as ISO-8859-1 dictionaries) in the gok_i18n branch.
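The ISO-8859-1 import path is mechanical, because every Latin-1 byte maps to exactly one Unicode code point below U+0100. In GOK this would more likely be done with glib's g_convert(); the standalone sketch below only illustrates the transformation itself:

```c
#include <stdlib.h>
#include <string.h>

/* Convert a NUL-terminated ISO-8859-1 string to a freshly allocated UTF-8
 * string. Bytes below 0x80 pass through; bytes 0x80..0xFF expand to a
 * two-byte UTF-8 sequence. Caller frees the result. Illustrative sketch;
 * real import code would use g_convert() and proper error reporting. */
static char *latin1_to_utf8(const char *in)
{
    size_t len = strlen(in);
    char *out = malloc(2 * len + 1); /* worst case: every byte doubles */
    char *p = out;

    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (c < 0x80) {
            *p++ = (char)c;
        } else {
            *p++ = (char)(0xC0 | (c >> 6));   /* lead byte, top 2 bits */
            *p++ = (char)(0x80 | (c & 0x3F)); /* continuation, low 6 bits */
        }
    }
    *p = '\0';
    return out;
}
```

Running each imported line through a conversion like this before insertion is what makes a legacy Latin-1 dictionary safe to mix with native UTF-8 ones.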
I think it would be advisable to include the encoding in the first line of the GOK dictionary, e.g. "WPDictList UTF-8".
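If that header line were adopted, parsing it would amount to splitting the first line into a tag and an encoding name. The following sketch assumes the "WPDictList <encoding>" shape proposed above, which was a suggestion here rather than a finalized format:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Parse a proposed GOK dictionary header of the form "WPDictList UTF-8".
 * On success, copies the declared encoding name into `encoding` and
 * returns true; returns false if the line is not a valid header.
 * Hypothetical format per this bug's suggestion, not the shipped one. */
static bool parse_dict_header(const char *line, char *encoding, size_t enc_size)
{
    char tag[32];
    char enc[32];

    if (sscanf(line, "%31s %31s", tag, enc) != 2)
        return false;               /* fewer than two fields */
    if (strcmp(tag, "WPDictList") != 0)
        return false;               /* not a dictionary header */
    snprintf(encoding, enc_size, "%s", enc);
    return true;
}
```

A loader could then fall back to assuming UTF-8 when no header is present, keeping old header-less dictionaries readable.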
fixed in HEAD.