GNOME Bugzilla – Bug 156781
Add Lao Support to Thai Module
Last modified: 2004-12-22 21:47:04 UTC
Lao script is so close to Thai that it can be handled with similar logic. So, I think it's reasonable to extend the existing Thai module to cover Lao as well. My proposed plan is to replace the internal 8-bit TIS-620 code with an internal encoding that covers both the Lao and Thai codepages. Then, the WTT compose/input checking table can be enhanced to cover some rare cases that exist in old-style Lao script. Some other details that are currently expressed in Thai-specific terms can also be generalized.
Created attachment 33188: Proposed patch
I have prepared a first patch based on a discussion with Anousak Souphavanh, a native speaker from the Laonux project.
Created attachment 33189: Sample Lao text, with two intentional errors.
Created attachment 33190: Sample text screenshot as rendered in GtkTextView and GtkEntry.
Since Anousak is not registered yet and I thus can't Cc: him, his contact address is anousak@muanglao.com.
Added Anousak to Cc: list, as he's now registered.
Created attachment 33233: Updated patch, with comments and cleanups
Created attachment 33343: Updated patch, with character properties factorized for use with GtkIM module.
I propose to also separate the character property tables from the other parts, as they can be shared with the GtkIM module. (In the patch, they are moved into thai-charprop.[ch].) I'll update the patch for Bug #81031 to use the same tables for supporting Thai/Lao input as well.
Set target milestone to 1.8.0, as the proposed patch is tested.
This is a reply regarding the sample Lao text. The vowels, for example SARA U, UU, E and EE, were NOT centered. This is perhaps NOT a problem in the font rendering engine; rather, it is something font makers have to handle by following the OpenType specs accordingly.
The mark positioning can be adjusted by changing the GPOS data in the font. So, I think it should be OK to commit the patch.

Patch committed to HEAD. Bug resolved as FIXED. Let's reopen if any issue is found.

2004-11-28  Theppitak Karoonboonyanan  <thep@linux.thai.net>

        Add Lao support to Thai module. (#156781)

        * modules/thai/Makefile.am
        modules/thai/thai-shaper.[ch]
        +modules/thai/thai-charprop.[ch]: Split WTT tables into a separate
        source. Extend the tables for Lao. 3 new classes are added (AM for
        SaraAm, AD4 for Nikkhahit, BCON for Lao semivowels). Now the range
        0x00-0x7f in TIS is used to store Lao characters. Rewrite ucs2tis()
        et al macros accordingly.

        * modules/thai/thai-shaper.c (get_next_cluster): Rewrite the
        clusterization code, so it's not specific to Thai-English texts.
        (Note that the special case of SaraAm is now handled by the new WTT
        character class. So, the extra checks are now eliminated.)

        * modules/thai/thai-shaper.c (get_glyphs_list, add_cluster): Add
        glyph calculation for Lao clusters.

        * modules/thai/thai-shaper.c (ThaiShapeTable structs,
        get_adjusted_glyphs_list): Generalize the shaping maps according to
        the new 8-bit internal encoding scheme. Now the character ranges are
        relocatable rather than hard-coded. Add Lao shaping table.

        * modules/thai/thai-shaper.c (get_adjusted_glyphs_list): Add special
        case for Lao, where clusters can be longer than those of Thai.

        * modules/thai/thai-fc.c (get_glyph_index_tis): Add Lao glyphs
        lookup.

        * modules/thai/thai-ot.c (thai_ot_shape, +lao_ot_get_ruleset): Add
        Lao OT rulesets retrieval.

        * modules/thai/thai-fc.c (PangoEngineScriptInfo thai_scripts[]): Add
        Lao script entry.
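
For readers who want a concrete picture of the 8-bit internal encoding mentioned in the ChangeLog (Thai kept in the upper half as in TIS-620, Lao stored in the 0x00-0x7f range), here is a minimal sketch in C. It is not the code from the patch: the function names (is_thai_ucs, is_lao_ucs, ucs_to_internal, internal_to_ucs) are hypothetical, and the exact Lao placement (U+0E80..U+0EDF stored at 0x20..0x7F) is an assumption chosen only to fit the range stated above. The Thai half follows the standard TIS-620 offset, where U+0E01 maps to 0xA1.

```c
#include <stdint.h>

/* Hypothetical helpers, not taken from the committed patch. */

/* Thai block slice that fits in the upper half of an 8-bit code. */
static int is_thai_ucs(uint32_t wc)
{
    return wc >= 0x0E00u && wc <= 0x0E5Fu;
}

/* Lao block slice; the 0x20..0x7F placement below is an assumption. */
static int is_lao_ucs(uint32_t wc)
{
    return wc >= 0x0E80u && wc <= 0x0EDFu;
}

/* Map a Unicode code point to the 8-bit internal code.
 * Thai: U+0E00..U+0E5F -> 0xA0..0xFF (same offset as TIS-620)
 * Lao:  U+0E80..U+0EDF -> 0x20..0x7F (assumed layout)
 * Returns 0 for anything outside both ranges. */
static uint8_t ucs_to_internal(uint32_t wc)
{
    if (is_thai_ucs(wc))
        return (uint8_t)(wc - 0x0E00u + 0xA0u);
    if (is_lao_ucs(wc))
        return (uint8_t)(wc - 0x0E80u + 0x20u);
    return 0;
}

/* Inverse mapping; returns 0 for codes that hold neither script. */
static uint32_t internal_to_ucs(uint8_t c)
{
    if (c >= 0xA0u)
        return (uint32_t)c - 0xA0u + 0x0E00u;
    if (c >= 0x20u && c <= 0x7Fu)
        return (uint32_t)c - 0x20u + 0x0E80u;
    return 0;
}
```

With a layout like this, a single 256-entry character-class table can be indexed by the internal code for both scripts, which is presumably what makes the shared WTT tables practical.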