Bug 579398 – [te_IN] Incorrect akhand formation for U+0C15,U+0C3E,U+0C37,U+0C47

Summary:	[te_IN] Incorrect akhand formation for U+0C15,U+0C3E,U+0C37,U+0C47


Status:	RESOLVED FIXED

Product:	pango
Classification:	Platform
Component:	indic
Version:	1.24.x
Hardware:	Other Linux

Importance:	Normal critical
Target Milestone:	---
Assigned To:	Pango Indic
QA Contact:	pango-maint

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2009-04-18 11:16 UTC by arjuna rao chavala
Modified:	2011-01-03 03:01 UTC

See Also:
GNOME target:	---
GNOME version:	2.21/2.22

Attachments
diff file for indic-fc.c (5.10 KB, patch) 2009-05-31 10:54 UTC, arjuna rao chavala
expected result of 0C15+0C3E+0C37+0C4d, otherwise forming akhand combination (192.15 KB, image/png) 2009-10-03 09:25 UTC, sandeep
ttf file for Lohit-Telugu handling 0c15+0c3E+0c37+0c4d => కాష్ combination (159.30 KB, application/x-font-ttf) 2009-10-03 09:35 UTC, sandeep
test file for various k+sh akhand combinations (proper, improper) (1.24 KB, text/plain) 2009-10-17 06:32 UTC, arjuna rao chavala
bitmap to illustrate the akhand rendering issue with file at https://bugzilla.gnome.org/show_bug.cgi?id=579398#c16 (132.75 KB, image/png) 2009-10-17 06:35 UTC, arjuna rao chavala
Screenshot showing correct rendering of improper "k+sh" akhand combinations (219.18 KB, image/png) 2009-10-20 16:09 UTC, sandeep
Lohit-Telugu.ttf handling improper "k+sh" akhand combinations (159.26 KB, application/x-font-ttf) 2009-10-20 16:17 UTC, sandeep
test file for various k+sh akhand combinations (proper, improper) (1.09 KB, text/plain) 2009-10-31 11:35 UTC, arjuna rao chavala
Incorrect renderings annotated (gedit+PANGO 2.24) (117.77 KB, image/png) 2009-10-31 12:44 UTC, arjuna rao chavala
Correct rendering of Telugu akhand forms on Open office 2.4 (103.43 KB, image/png) 2009-10-31 12:46 UTC, arjuna rao chavala
Fixed with patch at 135652 shown using pango-view (67.39 KB, image/png) 2009-10-31 13:48 UTC, arjuna rao chavala
K+sh akhand forms rendered perfectly with Lohit Kannada. (127.12 KB, image/png) 2009-11-01 05:53 UTC, arjuna rao chavala
K+sh akhand forms rendered perfectly with Lohit Hindi without any need for change to Pango1.24. (117.97 KB, image/png) 2009-11-01 05:54 UTC, arjuna rao chavala
Patch for processing GSUB of indic text at syllable level (22.38 KB, patch) 2010-08-13 12:10 UTC, arjuna rao chavala
Portable test case for indic (this file is Telugu) (1.81 KB, text/plain) 2010-08-13 12:24 UTC, arjuna rao chavala
Screenshot of Telugu Akhand problem (149.23 KB, image/png) 2010-08-13 12:29 UTC, arjuna rao chavala
Screenshot of Telugu Akhand problem-resolution (149.55 KB, image/png) 2010-08-13 12:32 UTC, arjuna rao chavala
Code cleaned up to comply with glib (22.53 KB, patch) 2010-08-19 07:07 UTC, arjuna rao chavala

Description arjuna rao chavala 2009-04-18 11:16:11 UTC

An akhand should be formed only for the sequence U+0C15, U+0C47, U+0C37, [virama or dependent vowel sign, e.g. U+0C3E].
However, for the given sequence U+0C15, U+0C3E, U+0C37, U+0C47 an akhand is also formed, as if the 2nd and 3rd Unicode characters had been interchanged.
కా+ష్ should render as కా ష్ (without the intervening space), but it is rendered as the akhand క్షా
when tested with the Lohit-te and Pothana2000 fonts. These fonts work fine on Windows.

Telugu engine needs to be fixed.
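
Editorial illustration (not part of the original report): a minimal Python sketch that lists the Unicode names of the code points discussed here. It assumes only the standard `unicodedata` module; the helper name `describe` is hypothetical. Listing the names makes it easier to tell the virama (U+0C4D) from the vowel sign EE (U+0C47), both of which appear in this thread, and to see that the reported sequence does not contain the contiguous KA + VIRAMA + SSA run that conventionally encodes the క్ష conjunct.

```python
import unicodedata


def describe(text):
    """Return 'U+XXXX NAME' for each code point in text (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# Sequence from the bug title: U+0C15, U+0C3E, U+0C37, U+0C47
REPORTED = "\u0C15\u0C3E\u0C37\u0C47"
# Conventional encoding of the k+sh conjunct: KA + VIRAMA + SSA
AKHAND = "\u0C15\u0C4D\u0C37"

if __name__ == "__main__":
    for line in describe(REPORTED):
        print(line)
    # The conjunct-forming run never occurs contiguously in the reported text.
    print("contains KA+VIRAMA+SSA:", AKHAND in REPORTED)
```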

Comment 1 arjuna rao chavala 2009-04-18 23:44:05 UTC

Filed a bug against the Fedora lohit-telugu font as well, as it is not clear where the actual bug is (https://bugzilla.redhat.com/show_bug.cgi?id=494902).

Comment 2 arjuna rao chavala 2009-05-08 23:47:57 UTC

After studying https://bugzilla.redhat.com/show_bug.cgi?id=223170, it is clear that this bug is in Pango, as it does not perform lookups at the syllable level. Closing the other bug filed against the font.

Comment 3 arjuna rao chavala 2009-05-09 06:53:28 UTC

Verified that this problem does not occur with KDE (Kubuntu 8.10).

Comment 4 arjuna rao chavala 2009-05-31 10:54:43 UTC

Created attachment 135652 [details] [review]
diff file for indic-fc.c

Fixes the problem by doing the reordering at the syllable level.

Comment 5 arjuna rao chavala 2009-05-31 10:58:26 UTC

The patch in the previous comment was inspired by the patch at this bug:
http://bugzilla.gnome.org/show_bug.cgi?id=447035

Comment 6 arjuna rao chavala 2009-05-31 11:06:31 UTC

Comment on attachment 135652 [details] [review]
diff file for indic-fc.c

Patch fixes this problem. Please change the status to fixed.

Comment 7 arjuna rao chavala 2009-05-31 11:40:05 UTC

Upon closer examination, it was found that rendering is duplicated and editing is not intuitive. The patch needs some more changes; reverting to the open state.

Comment 8 arjuna rao chavala 2009-05-31 11:41:27 UTC

The problem is with the rendering of words, not at the syllable level.

Comment 9 arjuna rao chavala 2009-06-07 12:26:03 UTC

Investigated more:
Found that display is fine in editors like gedit, Bluefish and OpenOffice. There is a problem with cursor positioning when editing.
In Firefox, in each sentence or part of a sentence, the first word appears once and the rest of the sentence (up to the appropriate non-space break) is repeated a second time, making it difficult to read.

Comment 10 arjuna rao chavala 2009-06-07 12:38:45 UTC

Firefox behavior could be understood better from this bug https://bugzilla.mozilla.org/show_bug.cgi?id=157967

Comment 11 Parag AN 2009-10-01 11:09:06 UTC

with pango-1.26.0 on F-12 I got
కాషే

So this looks fixed already.

Please close this bug.

Comment 12 arjuna rao chavala 2009-10-01 17:51:46 UTC

(In reply to comment #11)
> with pango-1.26.0 on F-12 I got
> కాషే
> 
> So this looks fixed already.
> 
> Please close this bug.

Can you post a picture (.jpg or .png) to verify the result? On my 1.24 version, it displays totally different glyphs from what is expected, hence I can't verify it.

Comment 13 sandeep 2009-10-03 09:25:28 UTC

Created attachment 144657 [details]
expected result of 0C15+0C3E+0C37+0C4d, otherwise forming akhand combination

Sequence 0C15+0C3E+0C37+0C47 => కాషే  (consists of 2 syllables: కా and షే)
logically doesn't form an akhand formation.


Sequence 0c15+0c3E+0c37+0c4d => కాష్
(after removing the GSUB rule from lohit-telugu-fonts that previously led to akhand formation)
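
Editorial illustration (not part of the original comment): a rough Python sketch of the syllable split described above. The regular expression is a deliberately simplified model of a Telugu syllable (consonant cluster plus an optional vowel sign or virama), not Pango's actual algorithm, and the names `syllables` and `akhand_allowed` are hypothetical. It only shows why a check made inside syllable boundaries would not see a k+sh conjunct in these sequences.

```python
import re

CONSONANT = "[\u0C15-\u0C39]"          # TELUGU LETTER KA .. HA
VIRAMA = "\u0C4D"                      # TELUGU SIGN VIRAMA
VOWEL_SIGN = "[\u0C3E-\u0C4C]"         # dependent vowel signs AA .. AU
MODIFIER = "[\u0C01-\u0C03]"           # candrabindu, anusvara, visarga
INDEPENDENT_VOWEL = "[\u0C05-\u0C14]"  # TELUGU LETTER A .. AU

# Simplified model: (consonant + virama)* consonant [vowel sign | virama] modifiers*
SYLLABLE = re.compile(
    f"(?:{CONSONANT}{VIRAMA})*{CONSONANT}(?:{VOWEL_SIGN}|{VIRAMA})?{MODIFIER}*"
    f"|{INDEPENDENT_VOWEL}{MODIFIER}*"
    f"|.",
    re.DOTALL,
)

AKHAND = "\u0C15\u0C4D\u0C37"  # KA + VIRAMA + SSA


def syllables(text):
    """Split text into syllables under the simplified model above."""
    return SYLLABLE.findall(text)


def akhand_allowed(text):
    """True only if KA+VIRAMA+SSA occurs inside a single syllable."""
    return any(AKHAND in s for s in syllables(text))


if __name__ == "__main__":
    for seq in ("\u0C15\u0C3E\u0C37\u0C47", "\u0C15\u0C3E\u0C37\u0C4D", "\u0C15\u0C4D\u0C37\u0C3E"):
        print([f"{ord(c):04X}" for c in seq], len(syllables(seq)), akhand_allowed(seq))
```

Under this model the first two sequences split into two syllables each and contain no akhand, while KA + VIRAMA + SSA + AA stays a single syllable in which the akhand is expected.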

Comment 14 sandeep 2009-10-03 09:35:30 UTC

Created attachment 144658 [details]
ttf file for Lohit-Telugu handling 0c15+0c3E+0c37+0c4d => కాష్  combination

From Comment #13,

Although it seems that the correct behaviour of 0c15+0c3E+0c37+0c4d => కాష్ can be obtained at the font level, I would still ask Arjuna Rao Chavala to check for any regressions.

Comment 15 sandeep 2009-10-15 04:43:19 UTC

ping Arjuna Rao Chavala, for comment 14.

Comment 16 arjuna rao chavala 2009-10-17 06:32:40 UTC

Created attachment 145648 [details]
test file for various k+sh akhand combinations (proper, improper)

It works for this specific dependent vowel. Other dependent vowel combinations also need to be fixed.
I enclose a test file that can be opened with gedit to illustrate the problem.
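
Editorial illustration (not a reproduction of the attached test file, whose contents are not shown in this thread): a small Python sketch of how such k+sh test lines could be generated for every Telugu dependent vowel sign. The function name `k_sh_test_lines` and the tab-separated layout are assumptions made for this example.

```python
KA, SSA, VIRAMA = "\u0C15", "\u0C37", "\u0C4D"

# Telugu dependent vowel signs U+0C3E..U+0C4C, skipping the
# unassigned code points U+0C45 and U+0C49.
VOWEL_SIGNS = [chr(cp) for cp in range(0x0C3E, 0x0C4D) if cp not in (0x0C45, 0x0C49)]


def k_sh_test_lines():
    """One line per vowel sign: proper akhand form, then the improper sequence."""
    lines = []
    for sign in VOWEL_SIGNS:
        proper = KA + VIRAMA + SSA + sign    # akhand expected
        improper = KA + sign + SSA + VIRAMA  # no akhand expected
        lines.append(f"{proper}\t{improper}")
    return lines


if __name__ == "__main__":
    print("\n".join(k_sh_test_lines()))
```

Saving the output as a UTF-8 text file and opening it in the application under test puts each proper form next to its improper counterpart.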

Comment 17 arjuna rao chavala 2009-10-17 06:35:44 UTC

Created attachment 145649 [details]
bitmap to  illustrate the akhand rendering issue with  file at https://bugzilla.gnome.org/show_bug.cgi?id=579398#c16

Added a bitmap file illustrating the status of the rendering problem with the ttf patch from comment #14 on Pango 1.24/Ubuntu 8.10.

Comment 18 sandeep 2009-10-20 16:09:08 UTC

Created attachment 145872 [details]
Screenshot showing correct rendering of improper "k+sh" akhand combinations

Comment 19 sandeep 2009-10-20 16:17:05 UTC

Created attachment 145874 [details]
Lohit-Telugu.ttf handling improper "k+sh" akhand combinations

From Comment 16,

Please test the required combinations.
Let me know of any issues encountered.

Comment 20 sandeep 2009-10-21 09:16:04 UTC

Pango rendered the text incorrectly because of an incorrect implementation of OpenType features (GSUB) within the font file, which led to improper handling of the "k+sh" akhand combination.

OpenType features (GSUB) have now been corrected within the font file
(see Comment 19). 

That said, the bug looks fixed and can be closed safely.

Comment 21 arjuna rao chavala 2009-10-21 16:26:27 UTC

Sorry for my oversight with the initial fix. The akhand is totally gone, as is clear from the fix. Since the akhand, in which the second letter (sh) is shaped differently when it follows the first (k), is a popular letter shape in Telugu, I would not like to go ahead with total removal of the akhand. It looks like syllable-level parsing is still needed.

Comment 22 arjuna rao chavala 2009-10-31 11:35:13 UTC

Created attachment 146636 [details]
test file for various k+sh akhand combinations (proper, improper)

Updated version fixing a problem line.

Comment 23 arjuna rao chavala 2009-10-31 12:44:30 UTC

Created attachment 146637 [details]
Incorrect renderings annotated (gedit+PANGO 2.24)

Incorrect renderings identified for Telugu akhand forms.

Comment 24 arjuna rao chavala 2009-10-31 12:46:14 UTC

Created attachment 146638 [details]
Correct rendering of Telugu akhand forms on Open office 2.4

Compare with attachment 146637 [details] for the problem with pango 2.24

Comment 25 arjuna rao chavala 2009-10-31 13:48:47 UTC

Created attachment 146641 [details]
Fixed with patch at 135652 shown using pango-view

Comment 26 arjuna rao chavala 2009-10-31 13:51:30 UTC

Fix for GNOME/Pango done. As this breaks Firefox on GNOME, which uses different text layout (word wrapping) code, it is better to wait for Firefox to be fixed as well. I will file a Firefox bug and link it here.

Comment 27 arjuna rao chavala 2009-11-01 05:53:29 UTC

Created attachment 146668 [details]
K+sh akhand  forms rendered perfectly with Lohit Kannada.

Comment 28 arjuna rao chavala 2009-11-01 05:54:36 UTC

Created attachment 146669 [details]
K+sh akhand  forms rendered perfectly with Lohit Hindi without any need for change to Pango1.24.

Comment 29 arjuna rao chavala 2009-11-01 05:56:43 UTC

As per comments 27 and 28, it is not certain that the problem is with Pango.

Comment 30 arjuna rao chavala 2010-01-03 17:15:41 UTC

(In reply to comment #27)
> Created an attachment (id=146668) [details]
> K+sh akhand  forms rendered perfectly with Lohit Kannada.
On further investigation, I found that the Kedage font worked fine for this example and that the problem existed for Lohit Kannada. However, the Kedage font also fails for another sequence, ka+ka followed by Sha, as mentioned in https://bugzilla.gnome.org/show_bug.cgi?id=604060

Comment 31 arjuna rao chavala 2010-07-04 12:15:01 UTC

The issue remains with 1.28 (Ubuntu 10.04 LTS). I have built Debian packages with the patch and found that it solves both the Telugu and Kannada issues. Firefox 3.6.3 still does not work well with the patch for Telugu. Chrome is now available for Linux and so can be used as a backup until Firefox fixes the issue. I am looking for testers for the other languages of the Indic module to take this further.

Comment 32 arjuna rao chavala 2010-08-13 12:10:39 UTC

Created attachment 167797 [details] [review]
Patch for processing GSUB of indic text  at syllable level

Patch fixes this bug in the rendering of akhands in Telugu and also the Kannada rendering bug (604060). It may require further optimization and code cleanup. Tested against the Lohit Telugu font (Ubuntu 10.04) and found to work with gedit and Firefox. The patch was produced with the SVN diff command against version 1.28.0.

Comment 33 arjuna rao chavala 2010-08-13 12:24:20 UTC

Created attachment 167798 [details]
Portable test case for indic (this file is Telugu)

By opening this text file in any text rendering application, the effect of the font, the OS text application and the text processing library can be observed.
Versions for other languages can be created with the tool at http://girgit.chitthajagat.in/

Comment 34 arjuna rao chavala 2010-08-13 12:29:15 UTC

Created attachment 167800 [details]
Screenshot of Telugu Akhand problem

In the image, #3: 1 to 7, 9 to 11 and 13 to 15 are incorrect (the other forms on this line are not relevant for Telugu).

Comment 35 arjuna rao chavala 2010-08-13 12:32:00 UTC

Created attachment 167802 [details]
Screenshot of Telugu Akhand problem-resolution

#3: 1 to 7, 9 to 11 and 13 to 15 show that akhands are not formed where they are not relevant (compare with the previous attachment for the problem).

Comment 36 arjuna rao chavala 2010-08-13 12:33:07 UTC

Patch may need code cleanup and optimization

Comment 37 arjuna rao chavala 2010-08-19 07:07:40 UTC

Created attachment 168266 [details] [review]
Code cleaned up to comply with glib  

Improved patch with code cleanup and changes to comply with glib.

A further optimization, not copying the glyph_position info, is possible. It can be implemented after feedback from the maintainer.
Tested on Ubuntu 10.04 and found to be working well.

Comment 38 arjuna rao chavala 2010-12-29 08:43:40 UTC

Requesting that this be committed to trunk, as the patch has been used on Ubuntu systems supporting Hindi, Kannada and Telugu without adverse impact for over 6 months. As we do not know when the HarfBuzz shaper for Indic will be ready, I request that releasing this into trunk be given immediate priority, so that Telugu and Kannada users can use their languages without any rendering errors.

Comment 39 Behdad Esfahbod 2011-01-03 00:58:20 UTC

Not going to happen...

Comment 40 arjuna rao chavala 2011-01-03 02:48:14 UTC

Behdad,

Not a helpful comment (39). Please say what practical things you need to make it happen. I have provided the patches to the Indic community for testing. If there are people who care about Indic rendering, they should have tested and given feedback. Alternatively, why don't you provide a beta so that people can give feedback? Can you at least share the plans for the harfbuzz-ng Indic shaper, so that we know it is not far away?

Comment 41 Behdad Esfahbod 2011-01-03 02:51:38 UTC

How about you join the HarfBuzz list and write about your plans to develop a harfbuzz-ng Indic shaper?  One thing to keep in mind though: no change to the harfbuzz API is allowed.

Comment 42 arjuna rao chavala 2011-01-03 03:01:51 UTC

I have been on the HarfBuzz list for quite some time. I wrote to Jonathan, as he was mentioned as leading the effort for Indic, but did not get a response. I can certainly take it up, though I will need help to get started. If Indic requires some changes, I will of course propose them and will be happy with any decision based on the merits of the case.