GNOME Bugzilla – Bug 635626
GDBus message idle can execute while flushes are pending
Last modified: 2011-09-06 01:53:47 UTC
Created attachment 175115: Adds a flag to check if a flush is pending
When the idle task runs, it checks whether any writes are pending before trying to send another message. Without that check, the assertion in the write path, which verifies that no operations are pending on the output stream, would fail. The idle function makes no equivalent check for flushes. So if a flush is pending on the output stream, the idle will try to send another message, the assertion will fail because of the pending flush, and the program will exit.
To handle this case I added an additional flag to track whether a flush is pending. Since only one set of flushes can be queued at a time, a boolean works fine for this. Unfortunately it means taking the write mutex an additional time when handling the flush callback, but otherwise it is pretty benign.
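For readers following along, the idea in the description above can be sketched roughly as below. This is not the attached patch and not GLib code: the `Worker` struct, its field names, and the `worker_*` functions are hypothetical stand-ins, and a plain pthread mutex models the write mutex. The only point illustrated is that the idle callback skips sending while either a write or a flush is still outstanding.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-connection worker state. */
typedef struct {
    pthread_mutex_t write_lock; /* models the write mutex from the report */
    bool write_pending;         /* a write is in flight on the output stream */
    bool flush_pending;         /* a flush is in flight (the added flag) */
    int queued;                 /* messages still waiting to be written */
} Worker;

static void worker_init(Worker *w, int queued)
{
    pthread_mutex_init(&w->write_lock, NULL);
    w->write_pending = false;
    w->flush_pending = false;
    w->queued = queued;
}

/* Idle callback: start a write only if the stream has nothing pending.
 * Returns true if a write was started. */
static bool worker_idle(Worker *w)
{
    bool started = false;

    pthread_mutex_lock(&w->write_lock);
    if (!w->write_pending && !w->flush_pending && w->queued > 0) {
        w->queued--;
        w->write_pending = true;
        started = true;
    }
    pthread_mutex_unlock(&w->write_lock);
    return started;
}

/* Write-completion callback: the stream is free for the next operation. */
static void worker_write_done(Worker *w)
{
    pthread_mutex_lock(&w->write_lock);
    assert(w->write_pending);
    w->write_pending = false;
    pthread_mutex_unlock(&w->write_lock);
}

/* Start a flush if nothing else is pending. Returns true if it started. */
static bool worker_flush_begin(Worker *w)
{
    bool started = false;

    pthread_mutex_lock(&w->write_lock);
    if (!w->write_pending && !w->flush_pending) {
        w->flush_pending = true;
        started = true;
    }
    pthread_mutex_unlock(&w->write_lock);
    return started;
}

/* Flush-completion callback: this is the extra mutex acquisition
 * mentioned in the description. */
static void worker_flush_done(Worker *w)
{
    pthread_mutex_lock(&w->write_lock);
    assert(w->flush_pending);
    w->flush_pending = false;
    pthread_mutex_unlock(&w->write_lock);
}
```

In this model, calling `worker_idle()` between `worker_flush_begin()` and `worker_flush_done()` simply returns false rather than starting a write on a busy stream, which is the situation the real assertion guards against.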
This sounds complicated. I would prefer having a test case demonstrating the bug (as well as the bug fix, of course). Thanks.
This is a race condition between two threads. I can't think of a test case that would show this without instrumenting GOutputStream to block in odd ways, which, honestly, is quite a lot of work. Otherwise any test case would only reproduce the failure some fraction of the time. The way that I was able to recreate it to find this bug is using a dbusmenu test that sets up a client and a server with 5 proxies, which basically hammers DBus. Even then it didn't happen every time, and almost never in the same process. In the end, I can't think of a way to write a simple test case for this. What do you think would be adequate?
Ted: Sorry for not responding for a while - I've been on vacation. Will merge the patch soon. Thanks!
Fix committed to master and glib-2-26. Thanks!
http://git.gnome.org/browse/glib/commit/?id=09ce9dc542b26e133bc798f9a0382b642aea4470
http://git.gnome.org/browse/glib/commit/?h=glib-2-26&id=b2315084cb21a1ef072a48b0238a2e614af78be3
(In reply to comment #2)
> The way that I was able to recreate it to find this bug is using a dbusmenu
> test that sets up a client and a server with 5 proxies, which basically hammers
> DBus. Even then it didn't happen every time, and almost never in the same
> process.
Would you mind sharing the source for this? An unreliable manual test would be better than nothing. I'll try to put together a stress-test that can reproduce this, now I know what the bug was (the project I'm working on has a certain amount of test-driven bureaucracy when committing bugfixes).
Sure, it's Open Source :-) The test, though, is probably relatively specific; I'm not sure it can be easily generalized.
Here is the test setup in the Makefile:
http://bazaar.launchpad.net/~dbusmenu-team/dbusmenu/trunk/view/head:/tests/Makefile.am#L321
And here's the client:
http://bazaar.launchpad.net/~dbusmenu-team/dbusmenu/trunk/view/head:/tests/test-glib-proxy-client.c
The proxy:
http://bazaar.launchpad.net/~dbusmenu-team/dbusmenu/trunk/view/head:/tests/test-glib-proxy-proxy.c
And server:
http://bazaar.launchpad.net/~dbusmenu-team/dbusmenu/trunk/view/head:/tests/test-glib-proxy-server.c
You'll also need dbus-test-runner to make it all work right:
http://launchpad.net/dbus-test-runner