After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 320638 - patch to speed up gtk_text_iter_backwards_chars
patch to speed up gtk_text_iter_backwards_chars
Status: RESOLVED FIXED
Product: gtk+
Classification: Platform
Component: Widget: GtkTextView
2.8.x
Other All
: Normal minor
: ---
Assigned To: gtk-bugs
gtk-bugs
Depends on:
Blocks:
 
 
Reported: 2005-11-03 17:59 UTC by Larry Ewing
Modified: 2005-12-12 19:17 UTC
See Also:
GNOME target: ---
GNOME version: 2.13/2.14


Attachments
patch to optimize walking backward in long segments (1.19 KB, text/plain)
2005-11-03 18:03 UTC, Larry Ewing
  Details
speed up the backwards walking too (1.89 KB, text/plain)
2005-11-08 19:42 UTC, Larry Ewing
  Details
proof of concept that walking backwards can be fast too (1.23 KB, patch)
2005-11-08 20:24 UTC, Larry Ewing
none Details | Review
proof of concept that walking backwards can be fast (the proper patch) (851 bytes, patch)
2005-11-08 20:27 UTC, Larry Ewing
none Details | Review
use the correctly signed values for offset. (845 bytes, patch)
2005-11-08 20:33 UTC, Larry Ewing
none Details | Review
full patch with stutter stepping (709 bytes, patch)
2005-11-08 20:46 UTC, Larry Ewing
none Details | Review
really the full patch (1.89 KB, patch)
2005-11-08 20:50 UTC, Larry Ewing
none Details | Review
the I am undeniably stupid patch (1.93 KB, patch)
2005-11-08 20:54 UTC, Larry Ewing
none Details | Review
my hacked up pango-language-profile.c to measure utf8 movement (8.00 KB, text/plain)
2005-11-08 20:55 UTC, Matthias Clasen
  Details
benchmark for the stutter step patch (1.33 KB, patch)
2005-11-09 03:37 UTC, Larry Ewing
none Details | Review
stutter better (723 bytes, patch)
2005-11-10 19:28 UTC, Matthias Clasen
none Details | Review
another hacked benchmark (8.25 KB, text/plain)
2005-11-10 20:13 UTC, Matthias Clasen
  Details
unit tests for g_utf8_offset_to_pointer (1.47 KB, text/plain)
2005-11-11 05:18 UTC, Matthias Clasen
  Details
pango patch to use negative pointer_to_offset (1.12 KB, patch)
2005-11-17 13:04 UTC, Behdad Esfahbod
committed Details | Review
small bench to figure out the weight function (1.64 KB, text/plain)
2005-11-17 19:13 UTC, Paolo Borelli
  Details
doc patch (667 bytes, patch)
2005-11-17 19:25 UTC, Paolo Borelli
none Details | Review
updated textiter patch (2.42 KB, patch)
2005-11-17 22:27 UTC, Paolo Borelli
none Details | Review

Description Larry Ewing 2005-11-03 17:59:29 UTC
this patch improves the backwards_char algorithm by walking backwards in the
segment when the distance to walk backwards is shorter than the length forwards.

Other information:
Comment 1 Larry Ewing 2005-11-03 18:03:59 UTC
Created attachment 54288 [details]
patch to optimize walking backward in long segments

right now the weighs the cost of walking forward the same as walking backwards,
this isn't really the case so the if (count < real->segment_char_offset)
condition should probably be weighted to prefer walking forwards (count * some
multiple < real->segement_char_offset).
Comment 2 Larry Ewing 2005-11-03 18:13:49 UTC
It is also important to note that  gtk_text_iter_backward_char calls 
gtk_text_iter_backward_chars with count 1 and is itself called internally in
gtk_text_iter in many places
Comment 3 Federico Mena Quintero 2005-11-03 18:24:45 UTC
We need a benchmark for this.  Maybe it's easy to pull out the search/replace
code from Gedit, which was using this function a lot.
Comment 4 Matthias Clasen 2005-11-07 18:32:22 UTC
The factors I get when testing g_utf8_next/prev_char() on the data in
pango-profiles vary between 13 times faster and 300 times faster for 
moving forwards through a string. 300 seems a little high, but it is 
definitively a lot faster.
Comment 5 Larry Ewing 2005-11-08 17:38:45 UTC
Clearly, but on a 4k paragaph 300 is still a win
Comment 6 Larry Ewing 2005-11-08 19:42:10 UTC
Created attachment 54480 [details]
speed up the backwards walking too

This is a variation on the original approach.  We change offset_to_pointer to
allow negative offsets (before it would just run off into wildly into memory)
and then call offset_to_pointer from text_iter_backwards_chars
Comment 7 Larry Ewing 2005-11-08 20:24:36 UTC
Created attachment 54485 [details] [review]
proof of concept that walking backwards can be fast too

new patch to speed up offset_to_pointer with negative values

we step backwards offset chars then walk to the beginning of the character we
landed in.  next we count how many actual characters we stepped back then
repeat
Comment 8 Larry Ewing 2005-11-08 20:27:08 UTC
Created attachment 54486 [details] [review]
proof of concept that walking backwards can be fast  (the proper patch)

new patch to speed up offset_to_pointer with negative values

we step backwards offset chars then walk to the beginning of the character we
landed in.  next we count how many actual characters we stepped back then
repeat
Comment 9 Larry Ewing 2005-11-08 20:33:43 UTC
Created attachment 54488 [details] [review]
use the correctly signed values for offset.

this is the same as the previous two patches but with using negative offsets
correctly
Comment 10 Larry Ewing 2005-11-08 20:46:43 UTC
Created attachment 54491 [details] [review]
full patch with stutter stepping

this is the full patch, hopefully it compiles ;)
Comment 11 Larry Ewing 2005-11-08 20:50:18 UTC
Created attachment 54492 [details] [review]
really the full patch
Comment 12 Larry Ewing 2005-11-08 20:54:36 UTC
Created attachment 54493 [details] [review]
the I am undeniably stupid patch

real patch take 3
Comment 13 Matthias Clasen 2005-11-08 20:55:08 UTC
Created attachment 54494 [details]
my hacked up pango-language-profile.c to measure utf8 movement
Comment 14 Larry Ewing 2005-11-09 03:37:51 UTC
Created attachment 54517 [details] [review]
benchmark for the stutter step patch

this should benchmark the patch fairly well.
Comment 15 Larry Ewing 2005-11-09 06:54:27 UTC
using the attached build.c but bumping the iterations to 1000000

without the patch I get

real    0m7.364s
user    0m7.332s
sys     0m0.020s

with the patch I get 

real    0m4.708s
user    0m4.668s
sys     0m0.016s
Comment 16 Matthias Clasen 2005-11-10 19:28:34 UTC
Created attachment 54606 [details] [review]
stutter better
Comment 17 Matthias Clasen 2005-11-10 20:13:23 UTC
Created attachment 54608 [details]
another hacked benchmark
Comment 18 Matthias Clasen 2005-11-11 05:18:51 UTC
Created attachment 54614 [details]
unit tests for g_utf8_offset_to_pointer

Here are some unit tests for g_utf8_offset_to_pointer and
g_utf8_pointer_to_offset, which my version passes. 
Not that you will have to add

if (pos < str) return - g_utf8_pointer_to_offset (pos, str);

to g_utf8_pointer_to_offset to make it pass the tests.
Comment 19 Behdad Esfahbod 2005-11-17 13:04:10 UTC
Created attachment 54874 [details] [review]
pango patch to use negative pointer_to_offset

Attaching Pango patch to use negative-enabled g_utf8_pointer_to_offset (as per
Matthias' last comment), to make cursor movement linear instead of quadratic.
Comment 20 Matthias Clasen 2005-11-17 15:54:08 UTC
I have committed stutter stepping to glib HEAD, so we can do the gtktextiter
changes in gtk+ HEAD now.
Comment 21 Paolo Borelli 2005-11-17 19:13:49 UTC
Created attachment 54884 [details]
small bench to  figure out the weight function

I used this small bench to figure out when it is worth to go backward (the
bench uses a string of 64 four-byte chars which should be the worse case): here
I got that going forward is almost 3 times faster, thus going backward is worth
if offset falls in the last fourth of the string.
Comment 22 Paolo Borelli 2005-11-17 19:25:13 UTC
Created attachment 54888 [details] [review]
doc patch

I propose to add this information to the docs so that walking backwards is not
abused. Obviously provided that others see similar numbers.
Comment 23 Behdad Esfahbod 2005-11-17 21:06:51 UTC
I don't like the doc patch that much.  The goal of handling negative offsets is
that it makes many operations constant-time instead of O(N).  That is, when you
want to move locally.  I don't see it used with an if statement that much.
Comment 24 Paolo Borelli 2005-11-17 22:27:11 UTC
Created attachment 54891 [details] [review]
updated textiter patch

updated version of lewing's textiter patch to use the weight function measured
above + same stuff in gtktextbtree

bench of comment #14 with 1000000 iterations goes from 10s (unpatched) to 6s
(textiter patch) to 2s (textiter + btree)
Comment 25 Paolo Borelli 2005-11-17 22:30:58 UTC
(as noted on irc, writing the weight function as '4 *' instead of '/ 4' may be
safer... I assumed compiler was smart enough to figure that out these days. I
doubt it matter anyways)
Comment 26 Matthias Clasen 2005-11-18 20:29:36 UTC
When stepping backwards through the testtext example with the patch applied, it
segfaults for me when hitting the arabic text.
Comment 27 Behdad Esfahbod 2005-11-18 23:06:55 UTC
Pango patch committed after fixing.
Comment 28 Matthias Clasen 2005-12-12 19:17:06 UTC
Ok, couldn't reproduce my earlier problems. Committed to GTK+ HEAD.

2005-12-12  Matthias Clasen  <mclasen@redhat.com>

	* gtk/gtktextbtree.c (_gtk_text_line_char_to_byte_offsets): 
	* gtk/gtktextiter.c (gtk_text_iter_backward_chars): Speed up
	stepping backwards.  (#320638, Larry Ewing, Paolo Borelli)