After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 89541 - Iteration by characters, clusters broken
Iteration by characters, clusters broken
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: general
1.0.x
Other Linux
: High normal
: 1.8.2
Assigned To: pango-maint
pango-maint
: 118786 120891 125990 126832 127331 127481 127650 127728 128445 130459 130501 134196 149050 149202 156997 304414 305493 307977 308158 308267 310092 325583 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2002-07-31 15:56 UTC by Owen Taylor
Modified: 2006-01-03 09:41 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Start of a test program (2.28 KB, text/plain)
2002-07-31 15:57 UTC, Owen Taylor
  Details
Simple testcase (1.21 KB, text/plain)
2005-01-22 10:50 UTC, Ilya Konstantinov
  Details
bidi support for PangoLayoutIter (6.06 KB, patch)
2005-02-21 00:07 UTC, Amit Aronovitch
none Details | Review
Patch regenerated with diff -up (9.49 KB, patch)
2005-05-05 19:14 UTC, Owen Taylor
needs-work Details | Review
reimplement char & cluster iteration (11.96 KB, patch)
2005-05-29 22:19 UTC, Amit Aronovitch
none Details | Review
test program for PangoLayoutIter (6.55 KB, text/plain)
2005-06-01 21:25 UTC, Amit Aronovitch
  Details
Patch I'm applying to head (28.52 KB, patch)
2005-06-21 22:35 UTC, Owen Taylor
none Details | Review

Description Owen Taylor 2002-07-31 15:56:32 UTC
Using PangoLayoutIter to iterate by characters or clusters doesn't 
work at all currently, due to a variety of problems, in particular:

 - Confusion over whether indices are 
 - Problems with figuring out what the end index is for a 
   rtl run. (The end index for the ltr run "abc" is 3 - 
   0,1,2 are character indices. But for a rtl run CBA
   the character indices are 2,1,0 and currently, the
   end index is 0 as well, which doesn't work.)
Comment 1 Owen Taylor 2002-07-31 15:57:06 UTC
Created attachment 10153 [details]
Start of a test program
Comment 2 Owen Taylor 2003-07-31 18:23:09 UTC
*** Bug 118786 has been marked as a duplicate of this bug. ***
Comment 3 Owen Taylor 2003-10-31 17:31:10 UTC
*** Bug 120891 has been marked as a duplicate of this bug. ***
Comment 4 Owen Taylor 2003-10-31 17:45:46 UTC
Fixed the test case in 120891.

Fri Oct 31 12:32:38 2003  Owen Taylor  <otaylor@redhat.com>
 
        Fix one problem with iteration by chars (Part of
        #89541, Mariano Suárez-Alvarez)
 
        * Pango/pango-layout.c (cluster_end_index): Fix to
        be item relative, like iter->cluster_index.
 
        * pango/pango-layout.c (pango_layout_iter_next_char):
        Adapt.
Comment 5 Olav Vitters 2003-11-01 16:30:40 UTC
*** Bug 125990 has been marked as a duplicate of this bug. ***
Comment 6 Olav Vitters 2003-11-12 20:04:11 UTC
*** Bug 126832 has been marked as a duplicate of this bug. ***
Comment 7 Mariano Suárez-Alvarez 2003-11-18 21:22:02 UTC
*** Bug 127331 has been marked as a duplicate of this bug. ***
Comment 8 Olav Vitters 2003-11-20 16:12:35 UTC
*** Bug 127481 has been marked as a duplicate of this bug. ***
Comment 9 Olav Vitters 2003-11-22 12:49:15 UTC
*** Bug 127650 has been marked as a duplicate of this bug. ***
Comment 10 Olav Vitters 2003-11-23 13:02:19 UTC
*** Bug 127728 has been marked as a duplicate of this bug. ***
Comment 11 Olav Vitters 2003-12-03 21:58:58 UTC
*** Bug 128445 has been marked as a duplicate of this bug. ***
Comment 12 Heath Harrelson 2003-12-03 22:08:57 UTC
*** Bug 128445 has been marked as a duplicate of this bug. ***
Comment 13 Olav Vitters 2004-01-03 18:10:25 UTC
*** Bug 130459 has been marked as a duplicate of this bug. ***
Comment 14 Olav Vitters 2004-01-04 11:37:16 UTC
*** Bug 130501 has been marked as a duplicate of this bug. ***
Comment 15 Mariano Suárez-Alvarez 2004-02-12 16:54:55 UTC
*** Bug 134196 has been marked as a duplicate of this bug. ***
Comment 16 Owen Taylor 2004-02-21 18:55:31 UTC
Major amount of work. First step is probalby coming up with
test cases.
Comment 17 Owen Taylor 2004-08-10 17:19:15 UTC
*** Bug 149050 has been marked as a duplicate of this bug. ***
Comment 18 Matthew Gatto 2004-11-04 03:49:20 UTC
*** Bug 149202 has been marked as a duplicate of this bug. ***
Comment 19 Matthew Gatto 2004-11-04 03:50:08 UTC
*** Bug 156997 has been marked as a duplicate of this bug. ***
Comment 20 Ilya Konstantinov 2005-01-08 23:57:44 UTC
This still seems to be a crasher for bug 154148, despite bug 120891 being
resolved. The crasher in bug 154148 practically disables using Nautilus to
manage Hebrew-named files.

I'll try to come up with testcases next week.
Comment 21 Yuval Tanny 2005-01-21 14:42:11 UTC
It makes nautilus to crash a lot. see bug 149759 or bug 149202. Because it's a
crasher, the severity should change to critical, and the priority to high.
I think this bug should be solved before 2.10 goes out.

Thanks,
Yuval
Comment 22 Owen Taylor 2005-01-21 16:20:22 UTC
Folks - the bug in general here is a huge amount of ill-defined work.
If you have *particular* crashers, *particular* test cases would be 
extremely useful.
Comment 23 Ilya Konstantinov 2005-01-22 10:50:41 UTC
Created attachment 36370 [details]
Simple testcase

Particular testcase? You've got it. Iterates over a simple string of "<HEBREW
ALEPH><HEBREW BET><HEBREW GIMEL>" (first three letters in the Hebrew alphabet).
Crashes on Pango from CVS.

(BTW, if you shorten the testcase's string to just "<ALEPH><BET>", it'll crash
due to a different assertion. Shorten it to just "<ALEPH>", and it'll crash due
to an even different one.)
Comment 24 Amit Aronovitch 2005-02-21 00:07:51 UTC
Created attachment 37729 [details] [review]
bidi support for PangoLayoutIter

I have made this patch, which seems to solve the probs on my machine.
It should be tested with other forms of text (including Arabic, and also Latin
with combining marks).
Please read detailed documentation in
http://live.gnome.org/PangoLayoutIterBidiSupport
Comment 25 Ilya Konstantinov 2005-04-09 23:38:49 UTC
Minding this is a known crasher with a patch, is there any work being done to
review this (now that GNOME 2.10 release date is over)?
Comment 26 Sven Neumann 2005-04-30 12:08:06 UTC
Raising priority on this bug-report since it seems to be a problem that needs to
be solved and it has a nicely documented patch as well as a test-case attached.
Comment 27 Owen Taylor 2005-05-05 19:14:24 UTC
Created attachment 46068 [details] [review]
Patch regenerated with diff -up

I've regenerated the patch using 'diff -up'. A default format diff
is almost impossible to read, so you should always use 'diff -u'.
Comment 28 Owen Taylor 2005-05-05 20:30:58 UTC
Comment on attachment 46068 [details] [review]
Patch regenerated with diff -up

I've commented extensively on the Wiki page about how
I think the iteration should work, and given some
thoughts about implementation technique.
I'm not going to comment on the current patch in 
detail since I don't think it quite implements
the right thing though.

Let me know if you are going to have time to 
work on implementing what   I sketched out.
Otherwise, I'll try to find some time to do it myself.
Comment 29 Amit Aronovitch 2005-05-09 21:57:06 UTC
Very little time for at least one week. 
You'd probably do a better job, but if your'e busy and willing to wait ~2 weeks,
I'll give it a go (your sketched plan seems good).

Please have a look at my wiki updates, to make sure I understand correctly. I
might decide to have a look at the code during my vacation this weekend.
Comment 30 Amit Aronovitch 2005-05-29 22:19:04 UTC
Created attachment 47014 [details] [review]
reimplement char & cluster iteration

please review.
I also have a test program which I'll upload after some cleanup
Comment 31 Amit Aronovitch 2005-06-01 21:25:32 UTC
Created attachment 47113 [details]
test program for PangoLayoutIter

checks iteration by character. 
verifies x coords and total number of iterations.

todo:
 * add support for \r\n
 * consider alternatives for using gtk
 * other test texts (read from file)
Comment 32 Martin Wehner 2005-06-06 20:46:00 UTC
*** Bug 305493 has been marked as a duplicate of this bug. ***
Comment 33 Islam Amer 2005-06-10 17:00:26 UTC
I don't know if it is needed, but I felt it is important to report that the
patch fixes the nautilus crash with arabic filenames I have been experiencing.
Thanks.
Comment 34 Martin Wehner 2005-06-18 12:00:19 UTC
*** Bug 307977 has been marked as a duplicate of this bug. ***
Comment 35 Martin Wehner 2005-06-18 12:01:11 UTC
*** Bug 308158 has been marked as a duplicate of this bug. ***
Comment 36 Sebastien Bacher 2005-06-19 12:22:21 UTC
*** Bug 308267 has been marked as a duplicate of this bug. ***
Comment 37 Sebastien Bacher 2005-06-19 12:22:56 UTC
*** Bug 304414 has been marked as a duplicate of this bug. ***
Comment 38 Owen Taylor 2005-06-21 22:35:44 UTC
Created attachment 48135 [details] [review]
Patch I'm applying to head

Here's what I'm applying to Pango HEAD. If no problems turn up in the
next few days, I'll backport to the pango-1-8 branch.
Comment 39 Owen Taylor 2005-06-21 22:38:37 UTC
Notes on differences:

Fixed various occurrences of ++i

Fixed various usages of gint, gchar to int, char

Rewrote update_cluster() logic; next_logical_cluster_start_rtl(),
  run_glyph_to_index() were confusing me because they didnt' have a
  meaning that could be described in less than the amount of code
  which was in them.

In update_cluster(), glyphs->num_glyphs() can never be zero, so
  removed check and comment.

Renamed textp to text, pcluster to cluster_text.

In pango_layout_iter_next_char() eliminated inline preincrement.

The need for the tolerance of 1 in checking cluster extents in the
  test case actually was a real problem: consider a 3 character cluster
  with width 10. With was there before, you'd get, for the three clusters
  (0, 3) (3, 6) (6, 9), while the right result is
  (0, 3) (3, 6) (6, 10) [keeping the current truncation behavior]:

  I rewrote pango_layout_iter_get_char_extents to compute the
  left and right boundaries of the cluster.

The need for "extra_eol" in the test case actually was a real problem;
  next_char() and next_cluster() should always go to the next char
  resp. next cluster when there are more chars/clusters, and
  should not include the fake end-of-line runs. Fixing this up
  was quite a bit of work, and I probably broke something...

I reworked the test case a fair bit:
 - Made it use Cairo instead of GTK+ for creating a context
 - Clean up the use of varargs macros which aren't portable
 - Add a simple test for cluster iteration

2005-06-21  Owen Taylor  <otaylor@redhat.com>

        Fix up the operation of PangoLayoutIter, especially for Bidi
        (#89541, based on a patch from Amit Aronovitch)

        * pango/pango-layout.c: Many changes to make iteration
        consistently in visual order.

        * pango/pango-layout.c (pango_layout_iter_next_char): Iterate
        through each character in the layout exactly once. (Including
        a hack to get two iterator positions for \r\n)

        * pango/pango-layout.c (pango_layout_iter_next_cluster): Only
        iterate through real clusters: that is, positions in the
        layout that have glyphs.

        * tests/testiter.c tests/Makefile.am: Add a (somewhat reworked)
        test from Amit for the operation of PangoLayoutIter.

Comment 40 Sebastien Bacher 2005-07-12 09:32:27 UTC
*** Bug 310092 has been marked as a duplicate of this bug. ***
Comment 41 Owen Taylor 2005-07-21 20:03:34 UTC
Committed to pango-1-8 as well.
Comment 42 Sebastien Bacher 2006-01-03 09:41:52 UTC
*** Bug 325583 has been marked as a duplicate of this bug. ***