After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 397439 - Performance enhancement patch series
Performance enhancement patch series
Status: RESOLVED FIXED
Product: vte
Classification: Core
Component: general
unspecified
Other Linux
: Normal enhancement
: ---
Assigned To: VTE Maintainers
VTE Maintainers
Depends on:
Blocks:
 
 
Reported: 2007-01-16 23:48 UTC by Chris Wilson
Modified: 2007-01-17 18:50 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
replace fprintf(stderr) with g_printerr (89.30 KB, patch)
2007-01-16 23:49 UTC, Chris Wilson
none Details | Review
Avoid the run-time checked type casts when possible. (77.05 KB, patch)
2007-01-16 23:50 UTC, Chris Wilson
none Details | Review
Switch to G_DEFINE_TYPE for the simple cases. (27.93 KB, patch)
2007-01-16 23:50 UTC, Chris Wilson
none Details | Review
glib zeroes memory on free if G_DEBUG=gc-friendly, so we dont have to. (21.18 KB, patch)
2007-01-16 23:52 UTC, Chris Wilson
none Details | Review
s/g_array_new(TRUE, TRUE/g_array_new(FALSE, FALSE/ and verify the transformation is safe. (14.39 KB, patch)
2007-01-16 23:53 UTC, Chris Wilson
none Details | Review
We are rate-limited by the display update timeout, so the input processing timer is redundant (12.46 KB, patch)
2007-01-16 23:54 UTC, Chris Wilson
none Details | Review
Modify the cell locally before pushing it back to memory. (9.92 KB, patch)
2007-01-16 23:55 UTC, Chris Wilson
none Details | Review
Introduce a common find_row_data to simplify some code. (11.04 KB, patch)
2007-01-16 23:56 UTC, Chris Wilson
none Details | Review
s/(VteConv)-1/VTE_INVALID_CONV/ (6.20 KB, patch)
2007-01-16 23:56 UTC, Chris Wilson
none Details | Review
Avoid the double glyph lookup (6.48 KB, patch)
2007-01-16 23:57 UTC, Chris Wilson
none Details | Review
Micro-optimise the loops inside vteconv (1.49 KB, patch)
2007-01-16 23:57 UTC, Chris Wilson
none Details | Review
FT2: reuse the transformation of glyph->rgb inside a run (7.47 KB, patch)
2007-01-16 23:58 UTC, Chris Wilson
none Details | Review
Micro-optimise the loops inside vteseq (349 bytes, patch)
2007-01-16 23:58 UTC, Chris Wilson
none Details | Review
Don't load fonts whilst unrealized. (3.78 KB, patch)
2007-01-16 23:59 UTC, Chris Wilson
none Details | Review
Avoid the switch on matcher type by using a function table (and cache the match_func) (7.49 KB, patch)
2007-01-17 00:00 UTC, Chris Wilson
none Details | Review
Make the debugging checks inside the ring macros conditional on VTE_DEBUG (895 bytes, patch)
2007-01-17 00:00 UTC, Chris Wilson
none Details | Review
Make the debugging checks inside the ring macros conditional on VTE_DEBUG (895 bytes, patch)
2007-01-17 00:01 UTC, Chris Wilson
none Details | Review
Some more janitorial work and minor tweaks. (26.48 KB, patch)
2007-01-17 00:02 UTC, Chris Wilson
none Details | Review
Janitorial work and minor tweaks (refreshed). (27.27 KB, patch)
2007-01-17 00:07 UTC, Chris Wilson
none Details | Review
Switch to G_DEFINE_TYPE for the simple cases (updated for missing unref on VteReaper) (29.05 KB, patch)
2007-01-17 00:53 UTC, Chris Wilson
none Details | Review

Description Chris Wilson 2007-01-16 23:48:13 UTC
Attached are a series of patches aimed at fine tuning the implementation. No major surgery, just the result of lots of profiling and some janitorial work.

time hexdump 16M.wvm
Before:
local    0m27.228s
remote   3m25.101s

After:
local    0m23.257s
remote   2m54.964s

The really odd thing is that very rarely the hexdump runs at ~4x the speed on the slow system eg remote 0m56.524s. Of course when that happens I'm neither tracing nor profiling, so I have no idea what's going on.
Comment 1 Chris Wilson 2007-01-16 23:49:00 UTC
Created attachment 80441 [details] [review]
replace fprintf(stderr) with g_printerr
Comment 2 Chris Wilson 2007-01-16 23:50:22 UTC
Created attachment 80442 [details] [review]
Avoid the run-time checked type casts when possible.
Comment 3 Chris Wilson 2007-01-16 23:50:51 UTC
Created attachment 80443 [details] [review]
Switch to G_DEFINE_TYPE for the simple cases.
Comment 4 Chris Wilson 2007-01-16 23:52:11 UTC
Created attachment 80444 [details] [review]
glib zeroes memory on free if G_DEBUG=gc-friendly, so we dont have to.
Comment 5 Chris Wilson 2007-01-16 23:53:10 UTC
Created attachment 80445 [details] [review]
s/g_array_new(TRUE, TRUE/g_array_new(FALSE, FALSE/ and verify the transformation is safe.
Comment 6 Chris Wilson 2007-01-16 23:54:34 UTC
Created attachment 80446 [details] [review]
We are rate-limited by the display update timeout, so the input processing timer is redundant
Comment 7 Chris Wilson 2007-01-16 23:55:36 UTC
Created attachment 80447 [details] [review]
Modify the cell locally before pushing it back to memory.
Comment 8 Chris Wilson 2007-01-16 23:56:32 UTC
Created attachment 80448 [details] [review]
Introduce a common find_row_data to simplify some code.
Comment 9 Chris Wilson 2007-01-16 23:56:55 UTC
Created attachment 80449 [details] [review]
s/(VteConv)-1/VTE_INVALID_CONV/
Comment 10 Chris Wilson 2007-01-16 23:57:19 UTC
Created attachment 80450 [details] [review]
Avoid the double glyph lookup
Comment 11 Chris Wilson 2007-01-16 23:57:50 UTC
Created attachment 80451 [details] [review]
Micro-optimise the loops inside vteconv
Comment 12 Chris Wilson 2007-01-16 23:58:26 UTC
Created attachment 80452 [details] [review]
FT2: reuse the transformation of glyph->rgb inside a run
Comment 13 Chris Wilson 2007-01-16 23:58:49 UTC
Created attachment 80453 [details] [review]
Micro-optimise the loops inside vteseq
Comment 14 Chris Wilson 2007-01-16 23:59:10 UTC
Created attachment 80455 [details] [review]
Don't load fonts whilst unrealized.
Comment 15 Chris Wilson 2007-01-17 00:00:03 UTC
Created attachment 80456 [details] [review]
Avoid the switch on matcher type by using a function table (and cache the match_func)
Comment 16 Chris Wilson 2007-01-17 00:00:40 UTC
Created attachment 80457 [details] [review]
Make the debugging checks inside the ring macros conditional on VTE_DEBUG
Comment 17 Chris Wilson 2007-01-17 00:01:00 UTC
Created attachment 80458 [details] [review]
Make the debugging checks inside the ring macros conditional on VTE_DEBUG
Comment 18 Chris Wilson 2007-01-17 00:02:00 UTC
Created attachment 80459 [details] [review]
Some more janitorial work and minor tweaks.
Comment 19 Chris Wilson 2007-01-17 00:07:14 UTC
Created attachment 80461 [details] [review]
Janitorial work and minor tweaks (refreshed).
Comment 20 Behdad Esfahbod 2007-01-17 00:11:58 UTC
Doh!  That's a heck of a lot of large patches!

Ok, just one thing I spotted.  This doesn't look trivially right to me:

-				      vte_reaper_child_watch_destroyed);
+				      (GDestroyNotify)g_object_unref);

but it probably does to you.

Since I would not have time to go over these bit by bit in this life time, I suggest you commit them all as soon as you feel comfortable doing that, and (after committing them other optimization of yours) get a release out soon and wait for bugs to come in!

I also like to see the VTE_BACKEND patch land.

Good job!
Comment 21 Chris Wilson 2007-01-17 00:52:42 UTC
Darn, if you want to put in a good word with accounts@gnome.org... [2 days and waiting]

s/vte_reaper_child_watch_destroyed/g_object_unref/: in switching over to the exemplar GObject singleton code we acquire a new reference for every vte_reaper_get() [which is better for language bindings and using g_object_new(VTE_REAPER_TYPE) directly] hence the new requirement to unref.
Which of course points out the missed unref in vte.c
Comment 22 Chris Wilson 2007-01-17 00:53:50 UTC
Created attachment 80465 [details] [review]
Switch to G_DEFINE_TYPE for the simple cases (updated for missing unref on VteReaper)
Comment 23 Chris Wilson 2007-01-17 18:12:45 UTC
Commited: r1429-45.
Comment 24 Behdad Esfahbod 2007-01-17 18:29:56 UTC
(In reply to comment #15)
> Created an attachment (id=80456) [edit]
> Avoid the switch on matcher type by using a function table (and cache the
> match_func)

Note that this can be a little tricky.  A couple years ago I profile pango and saw a switch showing in the profiles and replaced it with a function table, but later I found that the reason the switch was showing is because the functions it calls were being inlined, while now it's impossible to avoid the function call overhead, plus function tables introduce more relocations and .data storage.
Comment 25 Chris Wilson 2007-01-17 18:50:34 UTC
Hmm, tricky indeed - many different tradeoffs. In the tests I ran I am reasonably confident (having to judge from time&sysprof&callgrind relative timings) that this represented a performance improvement, and not just from moving inlined code.

I'd say a nice thing to have would be something akin to cairo-perf...