GNOME Bugzilla – Bug 476903
Speed up gtkrc file parsing
Last modified: 2011-09-04 23:49:07 UTC
Parsing of gtkrc files can take quite some time. I've done two patches against gscanner.c and gtkrc.c which speed up parsing quite significantly. Attachments follow.
Created attachment 95587 [details] [review] Patch against gscanner.h
Created attachment 95588 [details] [review] Patch against gscanner.c
Created attachment 95589 [details] [review] Patch against gtkrc.c
Could you cook the gscanner.c attachment (Comment #2) to *just* contain your code changes and not lots of cosmetic whitespace adjustments to other things? Kinda hard to distinguish what you actually are doing here.
Also, a single patch helps review.
Take a look at (bug 415323). I also tried to implement tables there to improve the performance but it changes the API and breaks some apps (namely gimp). Read the comments from Tim Janik.
But this patch checks if the tables change and works fine with gimp :)
It does check for the string location changing but not for the string changing i.e. someone could change the characters in the contents of the cset without changing the pointer. I am in favor of this patch but it still changes the API. What might be an idea is to add a flag when creating the scanner to use the accelerated method while by default allowing users change the csets mid-flow if they so choose. Anyway, when one of the devs pitches in and says what should happen I would like to give a hand.
Created attachment 113101 [details] [review] Small optimization of gtkrc This is only an optimization that is independant from the much more complex gscanner changes. Would be nice of someone could take a quick look and decide if it's good or not. (Updated patch against trunk)
(In reply to comment #9) > Created an attachment (id=113101) [edit] > Small optimization of gtkrc > > This is only an optimization that is independant from the much more complex > gscanner changes. Would be nice of someone could take a quick look and decide > if it's good or not. > > (Updated patch against trunk) This kind of optimization is highly CPU and cache dependent, I'd like to see benchmarks on Intel and AMD CPUs for the particular change before applying it. (Yes, I know that it is a speed up on ARM.)
Created attachment 118521 [details] [review] Implement a perf test (In reply to comment #10) > (In reply to comment #9) > > Created an attachment (id=113101) [edit] > > Small optimization of gtkrc > > > > This is only an optimization that is independant from the much more complex > > gscanner changes. Would be nice of someone could take a quick look and decide > > if it's good or not. > > > > (Updated patch against trunk) > > This kind of optimization is highly CPU and cache dependent, I'd like to see > benchmarks on Intel and AMD CPUs for the particular change before applying it. > (Yes, I know that it is a speed up on ARM.) Here comes a test of three versions of the function, which is 1. the original one, 2. the optimized one and 3. one with G_LIKELY The optimization shows pretty clear improvements in the test. Here are my results from a Intel Core 2 with 2 GHz: [kalikiana@gyokuro gtk+]$ gtk/tests/is_c_identifier -m perf /is_c_identifier/old/all_true : time: 3.293302 OK /is_c_identifier/old/early_false: time: 3.246236 OK /is_c_identifier/old/late_false: time: 3.224693 OK /is_c_identifier/old/all_false: time: 3.228307 OK /is_c_identifier/new/all_true : time: 0.502090 OK /is_c_identifier/new/early_false: time: 0.504351 OK /is_c_identifier/new/late_false: time: 0.503468 OK /is_c_identifier/new/all_false: time: 0.503598 OK /is_c_identifier/likely/all_true : time: 0.490155 OK /is_c_identifier/likely/early_false: time: 0.489873 OK /is_c_identifier/likely/late_false: time: 0.490197 OK /is_c_identifier/likely/all_false: time: 0.490097 OK
As a follow up here are the results of running the same test on an N800. I did not manage to find an AMD machine to test on unfortunately. Nokia-N800-33-4:/home/user# ./is-c-identifier -m perf /is-c-identifier/old/all_true : time: 56.163055 OK /is-c-identifier/old/early_false: time: 55.015076 OK /is-c-identifier/old/late_false: time: 54.210205 OK /is-c-identifier/old/all_false: time: 54.216491 OK /is-c-identifier/new/all_true : time: 4.815491 OK /is-c-identifier/new/early_false: time: 4.821044 OK /is-c-identifier/new/late_false: time: 4.819793 OK /is-c-identifier/new/all_false: time: 4.813813 OK /is-c-identifier/likely/all_true : time: 4.844940 OK /is-c-identifier/likely/early_false: time: 4.842072 OK /is-c-identifier/likely/late_false: time: 4.843078 OK /is-c-identifier/likely/all_false: time: 4.847107 OK Quite astonishing numbers actually.
Created attachment 119426 [details] [review] Improved perf test Improved perf test, now including two additional variants of the is_c_identifier functions that actually use a table of booleans to compare characters to, in order to determine what is valid. The 'table' variant is actually slightly faster than the 'new' variant. Again, test run on an Intel Core Duo: [kalikiana@gyokuro gtk+]$ gtk/tests/is-c-identifier -m perf /is-c-identifier/old/all_true : time: 3.511582 OK /is-c-identifier/old/early_false: time: 0.524435 OK /is-c-identifier/old/late_false: time: 3.810310 OK /is-c-identifier/old/all_false: time: 0.580436 OK /is-c-identifier/new/all_true : time: 0.523331 OK /is-c-identifier/new/early_false: time: 0.068007 OK /is-c-identifier/new/late_false: time: 0.595931 OK /is-c-identifier/new/all_false: time: 0.083682 OK /is-c-identifier/likely/all_true : time: 0.502969 OK /is-c-identifier/likely/early_false: time: 0.086619 OK /is-c-identifier/likely/late_false: time: 0.559528 OK /is-c-identifier/likely/all_false: time: 0.067527 OK /is-c-identifier/table1/all_true : time: 0.291233 OK /is-c-identifier/table1/early_false: time: 0.062675 OK /is-c-identifier/table1/late_false: time: 0.316545 OK /is-c-identifier/table1/all_false: time: 0.119647 OK /is-c-identifier/table2/all_true : time: 0.307514 OK /is-c-identifier/table2/early_false: time: 0.063548 OK /is-c-identifier/table2/late_false: time: 0.327185 OK /is-c-identifier/table2/all_false: time: 0.063641 OK And on the N800: Nokia-N800-33-4:~# /home/user/is-c-identifier -m perf /is-c-identifier/old/all_true : time: 54.230805 OK /is-c-identifier/old/early_false: time: 11.060699 OK /is-c-identifier/old/late_false: time: 57.487946 OK /is-c-identifier/old/all_false: time: 14.223724 OK /is-c-identifier/new/all_true : time: 4.829224 OK /is-c-identifier/new/early_false: time: 0.969329 OK /is-c-identifier/new/late_false: time: 5.250793 OK /is-c-identifier/new/all_false: time: 1.351532 OK /is-c-identifier/likely/all_true : time: 4.852967 OK /is-c-identifier/likely/early_false: time: 0.948761 OK /is-c-identifier/likely/late_false: time: 5.280670 OK /is-c-identifier/likely/all_false: time: 1.352265 OK /is-c-identifier/table1/all_true : time: 4.070862 OK /is-c-identifier/table1/early_false: time: 0.914123 OK /is-c-identifier/table1/late_false: time: 4.360260 OK /is-c-identifier/table1/all_false: time: 1.189453 OK /is-c-identifier/table2/all_true : time: 4.984955 OK /is-c-identifier/table2/early_false: time: 0.991546 OK /is-c-identifier/table2/late_false: time: 5.293762 OK /is-c-identifier/table2/all_false: time: 1.295898 OK Note that the results are slightly different because I made a mistake when running the tests previously. This time all is proper.
With gtk3 not using GScanner anymore, I don't think this is worth keeping open.