GNOME Bugzilla – Bug 105500
gdk_x11_gc_set_values performance
Last modified: 2004-12-22 21:47:04 UTC
I am seeing 7% of gnumeric's drawing time spent in gdk_x11_gc_set_values. Not good. There are X=24606 calls in my sample. (All from gdk_gc_set_values.) 56.8% of the time is spent in 4X calls to g_type_check_instance_cast. 12.6% of the time is spent in 1X calls to g_type_check_instance_is_a 6.6% of the time is spent in 3X calls to _gdk_gc_x11_get_type It seems to me that the same thing is being checked over and over again. Can we cut away 7/8 of those?
The calls to g_type_check_instance_cast and all but one call to _gdk_gc_x11_get_type shouldn't be there on a production build of GTK+. Was your build done with --enable-debug? If so, it won't be representative of what user's see.
My bad. I will re-open if it shows up after --enable-debug=no.
It still shows up as 1X since GDK_IS_GC remains. That's still about 1% of gnumeric's time re-checking what gdk_gc_set_values already did. (Which in turn rechecks gdk_gc_set_clip_origin/gdk_gc_set_ts_origin/ gdk_gc_set_foreground.) ...for a total of 7+% of gnumeric's drawing time spent in g_type_check_instance_is_a. (This was actually with --enable-debug=minimum; I'm not exactly sure what level to check at.)
Hmm, you are right, the check in gdk_x11_gc_set_values() should be removed.. the policy in GTK+ is no type checks in virtual function implementations. --enable-debug=minimum is right. That's the value used for production builds. If you want to start removing type checks to improve that 7% in g_type_check_instance_is_a(), try taking them out of g_object_ref(), g_object_unref(). gdk_gc_set_values() doesn't seem like a useful point to me to look at type checks. The cost of a few type checks is going to be less than the cost of actually changing the GC when we have to push the data o the server. (BTW ... you are looking at the total CPU utilization when drawing, right? It doesn't do much good to try to micro-optimize GTK+ if the X server is using 75% of the CPU, or if you are using a remote display and GTK+ is spending 75% of the wall time waiting on the network.) In general I don't think 7% overhead is awful for the benefits that the g_return_val_if_fail() give us, though removing a few select checks (g_object_ref/unref, say) may make sense.
> (BTW ... you are looking at the total CPU utilization > when drawing, right? It doesn't do much good to > try to micro-optimize GTK+ if the X server is using > 75% of the CPU, or if you are using a remote display > and GTK+ is spending 75% of the wall time waiting > on the network.) I am benchmarking the two functions in gnumeric that deal with rendering (creating formatted strings from values and stuffing them into pango layouts) and drawing (putting those layouts on the screen). Specifically, I am using quantify and tell it to start recording in the beginning of those functions, and to stop at the end. You can feel the difference with my current patch set. A 100Mbps ethernet is not going to be the bottleneck for text drawing anytime soon. (Actually, it might, but so far I use GDK_USE_XFT=0.) Currently, the single most expensive function for the above subset of gnumeric is g_type_check_instance_is_a -- that's a bit worrisome.
Created attachment 14528 [details] [review] Caching type check information in TypeNodes
Could you try the patch I just attached and see how it affects your benchmark? The patch caches typecheck information in TypeNodes (ie., two words per class). My limited tests indicate a hit rate of 85-95%, but I haven't measured if it is actually an improvement, and if so how big an improvement.
Doesn't work ... think thread safety.
I don't understand. How can there be incorrect information in the cache? Even if one thread is reading from the cache while another is writing, it doesn't mean the reading thread will get wrong or obsolete information.
Is this jury still out on whether this works? (I.e., are GType assignments assumed to be atomic? They had better be, or else things like gdk_gc_get_type are not right either.) Anyway, g_type_check_instance_is_a currently is at 8% since other things got faster. On top of that, add 1.2% for gdk_gc_get_type. Why isn't that just a simple variable reference?
Created attachment 14641 [details] [review] Remove one check per comment above
My current opinion is that it works, if and only if writes to GType sized variables are atomic. This is probably a pretty safe assumption. I'm a little hesitant in general in putting caching in on something that is in many cases constant time. (Constant time for inheritance, log(n_interfaces) for interfaces.) Since it's just going to slow things down on the miss route and increase code size a bit. Still, if benchmarking indicates that it makes a noticeable improvement, it probably makes sense. In any case, could you file a separate bug against GObject with your patch, Soeren, and close this one? I've applied Morten's GDK patch to both branches. Wed May 14 16:00:51 2003 Owen Taylor <otaylor@redhat.com> * gdk/x11/gdkgc-x11.c: Remove unecessary g_return_if_fail(). (#105500, Morten Welinder.)
Filed as bug 113116.