Bug 585626 - Setting widget tooltip hammers X11 server on any TCP/IP X11 connection
Status: RESOLVED FIXED
Product: gtk+
Classification: Platform
Component: Widget: Other
Version: 2.16.x
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: gtk-bugs
QA Contact: gtk-bugs
Depends on:
Blocks: 585624
 
 
Reported: 2009-06-13 06:44 UTC by Craig Ringer
Modified: 2009-06-22 14:12 UTC
See Also:
GNOME target: ---
GNOME version: 2.25/2.26


Attachments
Backtraces taken during long startup pause on tcp/ip connection (22.65 KB, text/plain)
2009-06-13 07:02 UTC, Craig Ringer
Connection and startup of gtkhtml as captured by tcpdump on loopback iface (280.53 KB, application/cap)
2009-06-13 07:15 UTC, Craig Ringer
This patch dramatically reduces startup time (661 bytes, patch)
2009-06-13 07:41 UTC, Craig Ringer
gtk+ git patch 0f00d3fdb084eac236072361df19e030d390ea9b, see comments (3.61 KB, patch)
2009-06-22 03:20 UTC, Craig Ringer

Description Craig Ringer 2009-06-13 06:44:09 UTC
Please describe the problem:
The gtkhtml editor control takes a very long time (at least 16 seconds) to appear when it's being displayed over any TCP/IP X11 connection. This happens even on a loopback interface connection to the local X server, but does not happen on a UNIX socket.

I found this bug while tracking down an issue with Evolution's compose/reply window taking a long time to appear on my LTSP thin clients (see bug 585624 and the Launchpad bug referenced there).

I've just found that it occurs with gtkhtml-editor-test from current gtkhtml svn (r9202, 2009-04-10 18:17:33 +0800).

Steps to reproduce:
1. Enable TCP/IP on your X server by editing gdm.conf and setting DisableTCP=false, then restart X.
2. Run "DISPLAY=:0.0 /opt/gnome2/bin/gtkhtml-editor-test". Note that the window appears promptly.
3. Run "DISPLAY=127.0.0.1:0.0 /opt/gnome2/bin/gtkhtml-editor-test". Note that the window takes a LONG time to appear, but then behaves normally.
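
The only difference between steps 2 and 3 is the host part of DISPLAY, which decides the transport. A minimal sketch of that rule, assuming the usual Xlib behaviour (an empty host or "unix" means a local socket, anything else means TCP/IP); the function name display_transport is made up for illustration and the rule ignores protocol prefixes and platform-specific local transports:

```shell
#!/bin/sh
# Classify an X11 DISPLAY string by the transport a client would typically use.
# Simplified illustration only; display_transport is a hypothetical helper.
display_transport() {
    case "$1" in
        :*|unix:*) echo "unix" ;;     # empty host or "unix": local UNIX socket
        *:*)       echo "tcp" ;;      # any other host part: TCP/IP
        *)         echo "invalid" ;;  # no display number separator at all
    esac
}

display_transport ":0.0"            # step 2, prints "unix"
display_transport "127.0.0.1:0.0"   # step 3, prints "tcp"
```

So even though 127.0.0.1 never leaves the machine, step 3 still goes through the TCP/IP stack rather than the UNIX socket.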


Actual results:
The first time, on a UNIX domain socket, the window appears promptly. The second time, over TCP/IP, it takes a long time to appear.

Expected results:
The window should appear very promptly both times.

Does this happen every time?
Yes, on several different hosts and with at least two versions of gtkhtml (svn trunk head, and 2.11.1-2ubuntu1).

Other information:
Makes Evolution unusable on remote X11 connections such as when used on thin clients.
Comment 1 Craig Ringer 2009-06-13 07:02:07 UTC
Stack trace attached. I launched a Xephyr session:
  
  /usr/bin/Xephyr :11

Then I started gtkhtml-editor-test from a terminal in my main X server as:

  DISPLAY=127.0.0.1:11 gdb --args /opt/gnome2/bin/gtkhtml-editor-test

I set up logging and "start"ed the app. Every few seconds I interrupted execution with control-C, ran "thread apply all bt" to get stack traces of all threads, and "cont"inued execution.

The window appeared shortly after the final backtrace was taken.
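
The same sampling can be automated. The following is only a sketch of a variation on the manual procedure: instead of interrupting an interactive gdb session, it attaches gdb to an already running process at fixed intervals. The function name sample_backtraces and the GDB override variable are made up for illustration; -batch, -p and -ex are standard gdb options. Attaching requires permission to ptrace the target process.

```shell
#!/bin/sh
# Take COUNT backtrace samples of process PID, INTERVAL seconds apart.
# Usage: sample_backtraces PID [COUNT] [INTERVAL]
# sample_backtraces and the GDB variable are hypothetical conveniences.
sample_backtraces() {
    pid=$1
    count=${2:-5}
    interval=${3:-2}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i ==="
        # -batch: run the -ex commands, then exit (detaching from the process)
        "${GDB:-gdb}" -batch -p "$pid" -ex "thread apply all bt" 2>/dev/null
        i=$((i + 1))
        # Sleep between samples, but not after the last one
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, "sample_backtraces 12345 8 2 > backtraces.txt" would collect eight samples from PID 12345 (a placeholder). Frames that show up in most samples are where the time is going.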
Comment 2 Craig Ringer 2009-06-13 07:02:45 UTC
Created attachment 136486
Backtraces taken during long startup pause on tcp/ip connection
Comment 3 Craig Ringer 2009-06-13 07:15:39 UTC
Created attachment 136487
Connection and startup of gtkhtml as captured by tcpdump on loopback iface

Note that tcp checksums are incorrect in this trace because the kernel implements TSO on the loopback interface, as shown by:
  sudo ethtool --show-offload  lo | grep 'tcp segmentation offload'
When opening the packet trace in Wireshark, ignore the checksum errors.

Alternatively, you can disable the option "Check the validity of the TCP checksum when possible" in the TCP dissector preferences, so Wireshark won't check them anymore.

Once you've turned off checksum validation, tell Wireshark to analyze the traffic as X11 by right-clicking on the first packet, selecting "Decode as...", and in the dialog that appears selecting "X11" from the list. Select the second packet and do the same thing to see the replies decoded the same way.

You can now see what gtkhtml is chatting to the server about. If you scroll to any random place in the trace you'll now be able to see the same endlessly repeating sequence of X11 operations:

Client -> Server: Request: GrabServer, QueryPointer
Server -> Client: Reply: QueryPointer
Client -> Server: Request: UngrabServer
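
To get a rough idea of how often this sequence repeats, the request lines can be counted. A minimal sketch, assuming the decoded trace has been saved as plain text with one message per line in the same form as the three lines above; that export format and the function name count_pointer_queries are assumptions, not something Wireshark produces by default:

```shell
#!/bin/sh
# Count GrabServer/QueryPointer request lines in a text trace read from stdin.
# Assumes one message per line, formatted like the summary lines in this comment.
count_pointer_queries() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps callers happy
    grep -c 'Request: GrabServer, QueryPointer' || true
}

count_pointer_queries <<'EOF'
Client -> Server: Request: GrabServer, QueryPointer
Server -> Client: Reply: QueryPointer
Client -> Server: Request: UngrabServer
Client -> Server: Request: GrabServer, QueryPointer
Server -> Client: Reply: QueryPointer
Client -> Server: Request: UngrabServer
EOF
```

QueryPointer is a request with a reply, so each counted line means the client sat waiting for one full round trip to the server.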
Comment 4 Craig Ringer 2009-06-13 07:26:37 UTC
One direct culprit is the GtkHTML color combo widget. It looks like color_combo_new_swatch_button(...) calls gtk_widget_set_tooltip_text(...) which for some bizarre reason grabs the server. If I comment out the call to gtk_widget_set_tooltip_text in components/editor/gtkhtml-color-combo.c, startup falls to about 5 seconds - we save *TEN* *SECONDS* off the widget startup time.
Comment 5 Craig Ringer 2009-06-13 07:41:08 UTC
Created attachment 136488
This patch dramatically reduces startup time
Comment 6 Craig Ringer 2009-06-13 07:56:18 UTC
Note that the attached patch is NOT the right fix. It's a test, and confirms where the problem is.
Comment 7 Craig Ringer 2009-06-13 08:07:22 UTC
The remaining calls to gtk_widget_set_tooltip_text(...) seem to be causing the rest of the delay. When all the direct calls are commented out it's still a couple of full seconds slower over TCP than over a UNIX socket, but a couple of seconds is a lot better than 15.

Breaking into startup at random points shows that the remaining delay is almost entirely other calls to gtk_widget_set_tooltip_text(...) within glade, via GObject properties, or the like.
Comment 8 Matthew Barnes 2009-06-19 12:25:36 UTC
Great analysis!  I'm going to move this over to GTK+ to see if they have any ideas on this.  Also adjusting summary.
Comment 9 Matthias Clasen 2009-06-19 14:39:57 UTC
There are a couple of avenues for improvement here:

- We should probably not call gtk_widget_trigger_tooltip_query when setting the tooltip on a widget that is not visible

- It is worth investigating if we can do better than just calling gtk_tooltip_trigger_tooltip_query. We are throwing away some information here, since we have a concrete widget (and thus window) where we want to trigger tooltip changes, and gtk_tooltip_trigger_tooltip_query then goes and recomputes a window from the pointer position...

- Finally, the X implementation of _gdk_windowing_window_at_pointer looks like it could really do with an XFixes request to make it less expensive.
Comment 10 Matthias Clasen 2009-06-19 15:13:19 UTC
Some comments from Owen:

- Can also move the trigger tooltip query to an idle

- Client-side windows may allow a 0-cost implementation of _gdk_windowing_window_at_pointer since it already needs to keep track of pointer windows, presumably.
Comment 11 Matthias Clasen 2009-06-20 20:34:36 UTC
I've committed some of these ideas now. Let me know if it doesn't help.

commit 0f00d3fdb084eac236072361df19e030d390ea9b
Author: Matthias Clasen <mclasen@redhat.com>
Date:   Sat Jun 20 13:54:33 2009 -0400

    Reduce roundtrips
    
    Setting a tooltip on a widget unfortunately triggers several roundtrips
    to the X server. We reduce this overhead by only doing it if the
    widget is visible, and by deferring to an idle. See bug 585626.
Comment 12 Craig Ringer 2009-06-22 03:13:57 UTC
I've reverted to stock libgtkhtml, built gtk master/HEAD, and tested with that. There's a huge improvement, to the point where I'm not sure opening the compose window is even any slower over TCP/IP than over a UNIX socket now.

I'll grab that diff and see if I can adapt it to 2.16.1 and rebuild the ubuntu gtk package with it so the users of the thin client systems I run can benefit.

I really appreciate your taking a look at this. It looks like a huge performance win, and maybe not just in the particular area I first noticed it in.
Comment 13 Craig Ringer 2009-06-22 03:20:03 UTC
Created attachment 137150
gtk+ git patch 0f00d3fdb084eac236072361df19e030d390ea9b, see comments

For Ubuntu/Debian users who may run into this and who might want to fix it with minimal impact to their systems, here's how to rebuild your gtk+ package to include the patch:

mkdir $HOME/gtk-build
cd $HOME/gtk-build
apt-get source libgtk2.0-0
sudo apt-get build-dep libgtk2.0-0
sudo apt-get install fakeroot
cd gtk+2.0-*
patch -p1 < /path/to/mclasen-0f00d3fdb084eac236072361df19e030d390ea9b.diff
fakeroot debian/rules binary

Now install the gtk+ debs created in $HOME/gtk-build.
Comment 14 Craig Ringer 2009-06-22 03:29:24 UTC
BTW, does it still seem like it's worth investigating an XFixes addition for finding the window at the pointer?
Comment 15 Craig Ringer 2009-06-22 14:12:02 UTC
Better rebuild instructions:

mkdir $HOME/gtk-build
cd $HOME/gtk-build
apt-get source libgtk2.0-0
sudo apt-get build-dep libgtk2.0-0
sudo apt-get install fakeroot devscripts
cd gtk+2.0-*
wget -O - 'http://bugzilla.gnome.org/attachment.cgi?id=137150' | patch -p1
debuild -i -j2 -tc


Once the packages have built and you've installed them, you might want to pin them in place in /etc/apt/preferences. I've written a little script to generate pinning entries. Run this in the directory where the .deb files were generated to pin them all to your custom versions.

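for f in *.deb; do
  # Extract this package's control files into ./DEBIAN
  dpkg-deb -e "$f"
  # Keep the Package line, rewrite the Version line into a Pin line,
  # append a Pin-Priority line after it, and discard everything else
  sed -e '/^Package: / p' \
      -e '/^Version: / s/Version: \(.*\)$/Pin: version \1 origin=""/ p' \
      -e '/^Pin: / aPin-Priority: 1001' \
      -e 'D' \
      DEBIAN/control
  # Blank line between stanzas
  echo
done | sudo tee -a /etc/apt/preferences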

Just remember to delete the pinning entries when the upstream packages are updated or when you want to upgrade! You can use "apt-cache policy packagename" to find out why it's held back.
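
For reference, here is a simplified sketch of the kind of stanza the pinning step is meant to produce, written without sed. It reads the fields with "dpkg-deb -f" and leaves out the origin="" qualifier used in the script above; the function names pin_stanza and pin_all_debs are made up for illustration, and this has not been tested against the packages in this bug.

```shell
#!/bin/sh
# Print one apt preferences stanza pinning package $1 to version $2.
# pin_stanza is a hypothetical helper; priority 1001 matches the script above.
pin_stanza() {
    printf 'Package: %s\nPin: version %s\nPin-Priority: 1001\n\n' "$1" "$2"
}

# Hypothetical wrapper: emit a stanza for every .deb in the current directory.
pin_all_debs() {
    for f in ./*.deb; do
        # With no .deb files the glob stays literal; skip it
        [ -e "$f" ] || continue
        # dpkg-deb -f prints the value of the named control field
        pin_stanza "$(dpkg-deb -f "$f" Package)" "$(dpkg-deb -f "$f" Version)"
    done
}
```

Running "pin_all_debs" on its own prints the stanzas so they can be checked before appending them with "pin_all_debs | sudo tee -a /etc/apt/preferences".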