GNOME Bugzilla – Bug 579398
[te_IN] Incorrect akhand formation for U+0C15,U+0C3E,U+0C37,U+0C47
Last modified: 2011-01-03 03:01:51 UTC
Akhand should be formed only for U+0C15, U+0C4D, U+0C37, [virama or dependent vowel sign, e.g. U+0C3E]. However, for the given sequence U+0C15, U+0C3E, U+0C37, U+0C47, an akhand is formed by interchanging the 2nd and 3rd Unicode characters. కా+ష్ = కా ష్ (without the intervening space), but it is rendered as the akhand క్షా when tested with the Lohit-te and Pothana2000 fonts. These fonts work fine on Windows. The Telugu engine needs to be fixed.
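To make the two sequences easier to compare, here is a minimal Python sketch that lists the code points of the reported text and of the akhand it is wrongly rendered as. It assumes the standard encoding of the క్ష conjunct as KA, VIRAMA (U+0C4D), SSA; the helper names `describe` and `has_ksha_conjunct` are made up for this illustration and are not part of Pango.

```python
import unicodedata

KA, VIRAMA, SSA = "\u0C15", "\u0C4D", "\u0C37"


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def has_ksha_conjunct(text):
    """True only if KA, VIRAMA, SSA occur contiguously in logical order."""
    return KA + VIRAMA + SSA in text


reported = "\u0C15\u0C3E\u0C37\u0C4D"  # KA, AA, SSA, VIRAMA (the text that was typed)
akhand = "\u0C15\u0C4D\u0C37\u0C3E"    # KA, VIRAMA, SSA, AA (what gets displayed)

if __name__ == "__main__":
    for label, text in (("reported", reported), ("akhand", akhand)):
        print(label, text, has_ksha_conjunct(text))
        for code_point, name in describe(text):
            print("   ", code_point, name)
```

In the reported text the vowel sign AA sits between KA and SSA, so the three characters of the conjunct never appear next to each other in the input.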
Added a bug in Fedora lohit-telugu font, as it is not clear where the actual bug is (https://bugzilla.redhat.com/show_bug.cgi?id=494902 )
Studying https://bugzilla.redhat.com/show_bug.cgi?id=223170, it is clear that this bug is in Pango, as it does not process text at the syllable level. Closing the other bug filed against the font.
Verified that this problem does not occur with KDE (Kubuntu 8.10).
Created attachment 135652: diff file for indic-fc.c. Fixes the problem by performing reordering at the syllable level.
The patch in the previous comment was inspired by the patch in this bug: http://bugzilla.gnome.org/show_bug.cgi?id=447035.
Comment on attachment 135652: diff file for indic-fc.c. The patch fixes this problem. Please change the status to fixed.
Upon closer examination, I found that rendering is duplicated and editing is not intuitive. This needs some more changes; reverting to the open state.
The problem is in the rendering of words, not at the syllable level.
Investigated more: the display is fine in editors like gedit, Bluefish and OpenOffice, but there is a problem with cursor positioning when editing. In Firefox, in each sentence or part of a sentence, the first word appears once and the rest of the sentence (up to the next suitable non-space break) is repeated a second time, making it difficult to read.
Firefox behavior could be understood better from this bug https://bugzilla.mozilla.org/show_bug.cgi?id=157967
With pango-1.26.0 on F-12 I got కాషే. So this looks fixed already. Please close this bug.
(In reply to comment #11)
> with pango-1.26.0 on F-12 I got
> కాషే
>
> So this looks fixed already.
>
> Please close this bug.
Can you post a picture (.jpg or .png) so that the result can be verified? On my 1.24 version, it displays totally different glyphs from what is expected, so I can't verify it.
Created attachment 144657: expected result of 0C15+0C3E+0C37+0C4d, otherwise forming akhand combination. The sequence 0C15+0C3E+0C37+0C47 => కాషే (consisting of 2 syllables: కా and షే) logically does not form an akhand. The sequence 0c15+0c3E+0c37+0c4d => కాష్ (after removing the GSUB rule from lohit-telugu-fonts; with the rule in place it previously led to akhand formation).
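The syllable argument above can be sketched in a few lines of Python. This is a deliberately simplified segmenter written for illustration only: the regular expression, the character ranges chosen and the names `split_syllables` and `akhand_sites` are assumptions of this sketch, and it is not the logic used in indic-fc.c or in any of the attached patches.

```python
import re

KA, VIRAMA, SSA = "\u0C15", "\u0C4D", "\u0C37"

# Simplified Telugu character classes (assumption: the core ranges only).
CONS = "[\u0C15-\u0C39]"                               # consonants
VOWEL_SIGN = "[\u0C3E-\u0C44\u0C46-\u0C48\u0C4A-\u0C4C]"  # dependent vowel signs
MOD = "[\u0C01-\u0C03]"                                # candrabindu, anusvara, visarga
INDEP = "[\u0C05-\u0C14]"                              # independent vowels

# A syllable is a consonant cluster joined by viramas, optionally ended by a
# vowel sign or a final virama; anything else falls through one character at a time.
SYLLABLE = re.compile(
    f"{CONS}(?:{VIRAMA}{CONS})*(?:{VOWEL_SIGN}|{VIRAMA})?{MOD}?"
    f"|{INDEP}{MOD}?"
    f"|.",
    re.DOTALL,
)


def split_syllables(text):
    """Split text into (simplified) syllables."""
    return SYLLABLE.findall(text)


def akhand_sites(text):
    """Syllables in which KA, VIRAMA, SSA are contiguous, i.e. where the
    k+sh akhand could legitimately be formed."""
    return [s for s in split_syllables(text) if KA + VIRAMA + SSA in s]


if __name__ == "__main__":
    for sample in ("\u0C15\u0C3E\u0C37\u0C47",   # KA AA SSA EE
                   "\u0C15\u0C3E\u0C37\u0C4D",   # KA AA SSA VIRAMA
                   "\u0C15\u0C4D\u0C37\u0C3E"):  # KA VIRAMA SSA AA
        print(split_syllables(sample), akhand_sites(sample))
```

With this segmentation the first two samples each split into two syllables and neither syllable contains the conjunct, while the third sample stays a single syllable that does. A shaper that applies substitutions one syllable at a time would therefore have no opportunity to form the akhand across the boundary.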
Created attachment 144658: ttf file for Lohit-Telugu handling 0c15+0c3E+0c37+0c4d => కాష్ combination. From Comment #13: although it seems that the correct behaviour of 0c15+0c3E+0c37+0c4d => కాష్ can be obtained at the font level, I would still request Arjuna Rao Chavala to check for any regressions.
ping Arjuna Rao Chavala, for comment 14.
Created attachment 145648: test file for various k+sh akhand combinations (proper, improper). It has worked for the specific dependent vowel. Other dependent vowel combinations also need to be fixed. I enclose a test file that can be opened with gedit to illustrate the problem.
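For anyone who wants to build a similar test file, the following Python sketch generates one line per dependent vowel sign, pairing a proper sequence (KA, VIRAMA, SSA, vowel sign) with the improper one (KA, vowel sign, SSA, VIRAMA). It is not the content of the attachment: the list of vowel signs, the tab-separated layout and the function names are assumptions made for this example.

```python
KA, VIRAMA, SSA = "\u0C15", "\u0C4D", "\u0C37"

# Core Telugu dependent vowel signs: U+0C3E..U+0C44, U+0C46..U+0C48, U+0C4A..U+0C4C.
VOWEL_SIGNS = [
    chr(cp)
    for cp in (*range(0x0C3E, 0x0C45), *range(0x0C46, 0x0C49), *range(0x0C4A, 0x0C4D))
]


def build_cases():
    """Return (proper, improper) pairs, one per dependent vowel sign."""
    cases = []
    for sign in VOWEL_SIGNS:
        proper = KA + VIRAMA + SSA + sign    # akhand expected
        improper = KA + sign + SSA + VIRAMA  # akhand must NOT be formed
        cases.append((proper, improper))
    return cases


def render_test_file(cases):
    """Format the cases as tab-separated lines of plain text."""
    return "".join(f"{proper}\t{improper}\n" for proper, improper in cases)


if __name__ == "__main__":
    # Hypothetical output file name; open the result in gedit or another editor.
    with open("ksha_akhand_cases.txt", "w", encoding="utf-8") as handle:
        handle.write(render_test_file(build_cases()))
```

Opening the resulting file in different applications shows, line by line, whether the right-hand column is wrongly rendered as an akhand.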
Created attachment 145649: bitmap to illustrate the akhand rendering issue with file at https://bugzilla.gnome.org/show_bug.cgi?id=579398#c16. Added a bitmap file illustrating the status of the rendering problem with the ttf patch from comment #14 on pango 1.24/Ubuntu 8.10.
Created attachment 145872: Screenshot showing correct rendering of improper "k+sh" akhand combinations
Created attachment 145874: Lohit-Telugu.ttf handling improper "k+sh" akhand combinations. From Comment 16: please test the required combinations. Let me know of any issues encountered.
The font file contained an incorrect implementation of OpenType features (GSUB), which caused Pango to mishandle the "k+sh" akhand combination. The OpenType features (GSUB) have now been corrected within the font file (see Comment 19). That said, the bug looks fixed and can be closed safely.
Sorry for my oversight with the initial fix. The akhand is totally gone, as is clear from the fix. Because the akhand, in which the second letter (sh) takes a different shape when it is combined with (k), is a popular letter shape in Telugu, I would not like to go ahead with removing it entirely. It looks like syllable-level parsing is still needed.
Created attachment 146636: test file for various k+sh akhand combinations (proper, improper). Updated version fixing a problem line.
Created attachment 146637: Incorrect renderings annotated (gedit+PANGO 2.24). Incorrect renderings identified for Telugu akhand forms.
Created attachment 146638: Correct rendering of Telugu akhand forms on Open office 2.4. Compare with attachment 146637 for the problem with pango 2.24.
Created attachment 146641: Fixed with patch at 135652 shown using pango-view
Fix for GNOME/Pango done. As this breaks Firefox on GNOME, which uses different text layout (word wrapping) code, it is better to wait for Firefox to be fixed as well. I will file a Firefox bug and link it here.
Created attachment 146668: K+sh akhand forms rendered perfectly with Lohit Kannada.
Created attachment 146669: K+sh akhand forms rendered perfectly with Lohit Hindi without any need for change to Pango1.24.
As per comments 27 and 28, it is not certain that the problem is with Pango.
(In reply to comment #27)
> Created an attachment (id=146668)
> K+sh akhand forms rendered perfectly with Lohit Kannada.
On further investigation, I found that the Kedage font worked fine for this example and that the problem existed for Lohit Kannada. However, the Kedage font also fails for another sequence, ka+ka followed by Sha, as mentioned in https://bugzilla.gnome.org/show_bug.cgi?id=604060
The issue remains with 1.28 (Ubuntu 10.04 LTS). I have built Debian packages with the patch and found that it solves both the Telugu and Kannada issues. Firefox 3.6.3 still does not work well with the patch for Telugu. Chrome is now available for Linux and so can be used as a backup until Firefox fixes the issue. I am looking for testers for the other languages of the Indic module to take this further.
Created attachment 167797: Patch for processing GSUB of indic text at syllable level. The patch fixes this bug in the rendering of akhands in Telugu and also the Kannada rendering bugs (604060). It may require further optimization and code cleanup. Tested against the Lohit Telugu font (Ubuntu 10.04) and found to work with gedit and Firefox. The patch was produced with the SVN diff command against version 1.28.0.
Created attachment 167798: Portable test case for indic (this file is Telugu). By opening this text file in any text rendering application, the effect of the font, the OS text application and the text processing library can be observed. Other language versions can be created with the tool at http://girgit.chitthajagat.in/
Created attachment 167800: Screenshot of Telugu Akhand problem. In the image, #3: 1 to 7, 9 to 11 and 13 to 15 are incorrect (the other forms on this line are not relevant for Telugu).
Created attachment 167802: Screenshot of Telugu Akhand problem-resolution. #3: 1 to 7, 9 to 11 and 13 to 15 show that akhands are not formed where they are not relevant (compare with the previous attachment for the problem).
Patch may need code cleanup and optimization
Created attachment 168266: Code cleaned up to comply with glib. Improved patch with code cleanup and changes brought into compliance with glib. A further optimization, not copying the glyph_position info, is possible; it can be implemented after feedback from the maintainer. Tested on Ubuntu 10.04 and found to work well.
I request that this be merged into trunk, as the patch has been used on an Ubuntu system supporting Hindi, Kannada and Telugu without adverse impact for over 6 months. As we do not know when the harfbuzz shaper for Indic will be ready, I request immediate priority for releasing this into trunk, so that Telugu and Kannada users can use their languages without any rendering errors.
Not going to happen...
Behdad, that (comment 39) is not a helpful comment. Please say what practical things you need to make it happen. I have provided the patches to the indic community for testing. If there are people who care about indic rendering, they should have tested and given feedback. Alternatively, why don't you provide a beta so that people can give feedback? Can you at least share the plans for the harfbuzz-ng indic shaper, so that we know it is not far away?
How about you join the HarfBuzz list and write about your plans to develop a harfbuzz-ng Indic shaper? One thing to keep in mind though: no change to the harfbuzz API is allowed.
I have been on Harfbuzz list for quite some time. I have written to Jonathan, as he was mentioned to be leading the effort for indic, but did not get a response. I can certainly take it up, though I will need help to get started. If indic requires some changes, I will of course propose the same and will be happy with any decision based on the merits of the case.