After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 640714 - Make it possible to freeze incoming method calls
Make it possible to freeze incoming method calls
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gdbus
unspecified
Other Linux
: High enhancement
: ---
Assigned To: David Zeuthen (not reading bugmail)
gtkdev
Depends on:
Blocks:
 
 
Reported: 2011-01-27 12:46 UTC by Mikkel Kamstrup Erlandsen
Modified: 2013-10-19 00:55 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Half-done patch (3.63 KB, patch)
2011-01-28 18:07 UTC, David Zeuthen (not reading bugmail)
none Details | Review

Description Mikkel Kamstrup Erlandsen 2011-01-27 12:46:44 UTC
I just investigated what seemed to be a weird race condition in my app when it was started up via DBus activation.

The app basically does this:

  app = g_application_new(appname)
  app.register();

  /* Exit if not main instance */
  if (app.is_remote())
    exit;

  /* set up DBus objects here ... */
  
  app.run ();

The party sending a DBus message to an object managed by appname will receive an error like "Error org.freedesktop.DBus.Error.UnknownMethod: No such interface `org.example.Iface' on object path /my/path" even though one would be setting up the right object in the section commented in the above snippet.

The reason for this is that g_application_register() does a sync DBus call which tracks down to g_dbus_connection_send_message_with_reply_sync() where GDBus spins a recursive mainloop; meaning that we start handling incoming messages already at the app.register() line.

I think this is a bug in g_dbus_connection_send_message_with_reply_sync() affecting pretty much all the _sync() variants of the GDBus methods. The way I believe dbus-glib used to handle this is by queuing up all (in- and out bound) messages while doing the blocking call and then not process those before we enter the mainloop the next time. Indeed such a behaviour would fix the described race condition (although break message ordering semantics, but that's what you get for doing sync call ;-)).
Comment 1 David Zeuthen (not reading bugmail) 2011-01-27 16:07:53 UTC
This is a design problem with GApplication (that I don't think can be fixed). See the comments in g_bus_own_name()

 http://library.gnome.org/devel/gio/unstable/gio-Owning-Bus-Names.html#g-bus-own-name

for why

 If you plan on exporting objects (using e.g.
 g_dbus_connection_register_object()), note that it is
 generally too late to export the objects in name_acquired_handler.
 Instead, you can do this in bus_acquired_handler since you are
 guaranteed that this will run before name is requested from the bus.

Btw, you can work around this problem, however, by acquiring the connection and exporting your objects on it before you tell GApplication to own a name.

Reassigning to GApplication.

(I actually remember complaining about this to the GApplication authors - my suggestion was that GApplication shouldn't invade on the normal D-Bus namespace. Instead, GApplication should instead own names of the form org.gtk.Private.<identifier>. meh.)
Comment 2 Colin Walters 2011-01-27 17:01:18 UTC
While we can debate the GApplication interface, the entire idea of it was that we could do the "is my app unique" in a race-free simple way.

How do you expect this would be changed in GApplication?  Deprecate it and introduce a replacement?  Honestly, I'd prefer some flag for synchronous GDBus calls to be "really synchronous" like libdbus is, i.e. don't process any other messages.  This *can* be implemented, no?
Comment 3 Allison Karlitskaya (desrt) 2011-01-27 17:03:18 UTC
This is a problem that impacts every single use of GDBus -- not just GApplication.

This even happens when using GDBus's own internal name acquiring code.  The problem is that the application doesn't know that it owns the name until DBus is already starting to send it messages at that name.

This is not a problem with libdbus-1, because the handler for the reply to the name ownership request blocks the mainloop and prevents the next message from being handled.

With GDBus, however, messages are received and dispatched in a worker thread.  There are lots of good reasons for doing it that way, but this thread does not wait for the processing of any name ownership request reply to be finished.  It will bounce messages that are directed at object paths that would be registered as a result.

This is why GApplication, internally, registers its own object path before requesting the name.  You could do that do to avoid this problem.
Comment 4 David Zeuthen (not reading bugmail) 2011-01-27 17:12:21 UTC
The only way this can be fixed is by having GApplication have the equivalent of a GBusAcquiredCallback.
Comment 5 David Zeuthen (not reading bugmail) 2011-01-27 17:15:31 UTC
Another solution is to make GApplication avoid invading on the D-Bus namespace (by prefixing org.gtk.Private.Whatever). Heck, it just happens to do so on Linux right now but that just seems to be an implementation detail (that Mikkel's app seems to be relying on).

Such a change a) might break some apps relying on this impl detail; and b) will increase IPC cost in terms of having the user acquire another name; but I don't think that should hold us back.

Pick your poison.
Comment 6 Colin Walters 2011-01-27 17:19:28 UTC
(In reply to comment #4)
> The only way this can be fixed is by having GApplication have the equivalent of
> a GBusAcquiredCallback.

No.  The entire point of the interface was that it's *simple* to write the "unique app".  If all of a sudden it has to be asynchronous it's much more of a pain in the ass for no reason, since as we've discussed before for 99% of apps there is absolutely nothing to do unless you know whether or not your app is already running.

And besides that, do you not agree that the current semantics of all the _sync calls running an arbitrary set of other callbacks is going to cause application authors porting from dbus-glib and libdbus a massive amount of grief?
Comment 7 David Zeuthen (not reading bugmail) 2011-01-27 17:24:49 UTC
(In reply to comment #6)
> (In reply to comment #4)
> > The only way this can be fixed is by having GApplication have the equivalent of
> > a GBusAcquiredCallback.
> 
> No.  The entire point of the interface was that it's *simple* to write the
> "unique app".  If all of a sudden it has to be asynchronous it's much more of a
> pain in the ass for no reason, since as we've discussed before for 99% of apps
> there is absolutely nothing to do unless you know whether or not your app is
> already running.

Yes, I agree it's an unfortunate situation. I really wish we had addressed this before releasing GApplication into the wild.

Another solution, of course, is to have a way to freeze the dispatch of
incoming method calls. But that's goddamn ugly.

I still think we should do what I suggest in comment 5.

> And besides that, do you not agree that the current semantics of all the _sync
> calls running an arbitrary set of other callbacks is going to cause application
> authors porting from dbus-glib and libdbus a massive amount of grief?

Apart from name owning stuff, which is _already_ fixed by having a well-designed nice async helper (g_bus_own_name()), I don't think there are any problems, no.
Comment 8 David Zeuthen (not reading bugmail) 2011-01-27 17:27:47 UTC
>          What    |Removed                     |Added
>           Summary|gdbus synchronous calls     |GApplication racy because
>                  |process replies other than  |of GDBus sync call
>                  |the one intended            |semantics

Eh, bugzilla changed this back automatically when I chose "submit my changes anyway". I kinda agree, though so keeping it as is.
Comment 9 Colin Walters 2011-01-27 17:28:51 UTC
(In reply to comment #5)
> Another solution is to make GApplication avoid invading on the D-Bus namespace
> (by prefixing org.gtk.Private.Whatever).

A *goal* of GApplication was that on Linux applications have DBus names - we wanted to encourage that.  So what you're really suggesting is all applications have *two* dbus names?  Can we agree that's a huge, gross hack to work around our inability to make GApplication's use of GDBus synchronous?
Comment 10 David Zeuthen (not reading bugmail) 2011-01-27 17:31:23 UTC
> gdbus synchronous calls process replies other than the one intended 

Btw, I think this is wrong. When only a single thread is involved we definitely deliver things in order.

I am actually starting to wonder if the problem is that the bus sends method invocations to the app before the RequestName() reply. Mikkel, is it possible you can obtain a G_DBUS_DEBUG=all (or less noisy but still useful) trace when this occurs? Thanks!
Comment 11 David Zeuthen (not reading bugmail) 2011-01-27 17:34:50 UTC
(In reply to comment #9)
> (In reply to comment #5)
> > Another solution is to make GApplication avoid invading on the D-Bus namespace
> > (by prefixing org.gtk.Private.Whatever).
> 
> A *goal* of GApplication was that on Linux applications have DBus names - we
> wanted to encourage that.  So what you're really suggesting is all applications
> have *two* dbus names?  Can we agree that's a huge, gross hack to work around
> our inability to make GApplication's use of GDBus synchronous?

The main problem is that GApplication is making assumptions about GDBus that wasn't true. I'm just trying to come up with solutions to unbreak this mess... complaining about the way GDBus work is not helping.
Comment 12 David Zeuthen (not reading bugmail) 2011-01-27 17:41:31 UTC
Btw, I don't know how GApplication works in detail (I'm reading the code right now though) but GDBus should at least give you this guarantee for a single-threaded app (not counting the GDBus worker thread): If you avoid recursing your mainloop you won't get any method invocations after your sync call to RequestName() returns. That guarantee ought to be enough to do what you want, right?
Comment 13 David Zeuthen (not reading bugmail) 2011-01-27 17:49:45 UTC
(In reply to comment #12)
> Btw, I don't know how GApplication works in detail (I'm reading the code right
> now though) but GDBus should at least give you this guarantee for a
> single-threaded app (not counting the GDBus worker thread): If you avoid
> recursing your mainloop you won't get any method invocations after your sync
> call to RequestName() returns. That guarantee ought to be enough to do what you
> want, right?

Now I remember: the problem is that the worker thread still processes the incoming method call - since no-one has registered anything, it rejects it. The worker thread needs to do this because it needs to know what mainloop to deliver the incoming call to.

The only solution, to keep the same GApplication API, is to have a way to freeze the processing of incoming method calls. I guess it's easier to just add this than keep fighting.
Comment 14 Colin Walters 2011-01-27 17:52:19 UTC
(In reply to comment #13)

> The only solution, to keep the same GApplication API, is to have a way to
> freeze the processing of incoming method calls. I guess it's easier to just add
> this than keep fighting.

Maybe g_dbus_connection_send_message_sync_and_block_queue() ?  Or a flag on the call?
Comment 15 David Zeuthen (not reading bugmail) 2011-01-27 17:57:22 UTC
(In reply to comment #14)
> (In reply to comment #13)
> 
> > The only solution, to keep the same GApplication API, is to have a way to
> > freeze the processing of incoming method calls. I guess it's easier to just add
> > this than keep fighting.
> 
> Maybe g_dbus_connection_send_message_sync_and_block_queue() ?  Or a flag on the
> call?

Well, you'd need a method to thaw/unblock it so it works out better, symmetry-wise, to have to methods. That also makes it possible to freeze the dispatch without having to send a message. So I'm thinking of going with this

 g_dbus_connection_freeze_method_dispatch()
 g_dbus_connection_thaw_method_dispatch()
Comment 16 Colin Walters 2011-01-27 18:11:34 UTC
One thing we could consider is making this entirely specific to acquiring a bus name.  We've said elsewhere that the fix for messages-processed-before-objects-created is to export objects before g_bus_own_name() or equivalent, but if GDBus had this it could automatically use it for g_bus_own_name().
Comment 17 David Zeuthen (not reading bugmail) 2011-01-27 19:09:43 UTC
(In reply to comment #0)
> The reason for this is that g_application_register() does a sync DBus call
> which tracks down to g_dbus_connection_send_message_with_reply_sync() where
> GDBus spins a recursive mainloop; meaning that we start handling incoming
> messages already at the app.register() line.
> 
> I think this is a bug in g_dbus_connection_send_message_with_reply_sync()
> affecting pretty much all the _sync() variants of the GDBus methods. The way I
> believe dbus-glib used to handle this is by queuing up all (in- and out bound)
> messages while doing the blocking call and then not process those before we
> enter the mainloop the next time. Indeed such a behaviour would fix the
> described race condition (although break message ordering semantics, but that's
> what you get for doing sync call ;-)).

Just for the record, this part of your analysis is wrong. And (sorry) but I'm being pedantic about it because ORBit suffered from these problems (causing reentrancy headaches) and it's important to not have people believe that GDBus suffers from them.

The way g_dbus_connection_send_message_with_reply_sync() works is that it creates a _private_ GMainContext and spins in a loop in that. In particular, GDBus will _not_ run any other mainloops. So in that sense, GDBus is just as blocking as libdbus is _as long_ as you only use a single thread. E.g. the works exactly the same way in this case. See

 http://git.gnome.org/browse/glib/tree/gio/gdbusconnection.c?h=glib-2-26#n1973

In the event that you use multiple threads, we do dispatch incoming messages to other mainloops running in other threads while a thread is blocking in g_dbus_connection_send_message_with_reply_sync(). If you think about it, this is very natural behavior. See the GDBus threading test case where we verify that this actually works (four threads each calling send_message_with_reply_sync()).

The reason you are seeing this bug is explained in comment 13 - namely that we dispatch incoming method calls in the worker thread.
Comment 18 David Zeuthen (not reading bugmail) 2011-01-27 19:40:20 UTC
Thinking more about it: Suppose we did add _freeze_method_dispatch() and _thaw_method_dispatch(), how would GApplication actually use it?

Well, it would call freeze() in new() and thaw() in run(). Then GApplication would also need to say in the docs "if you want to register D-Bus do it between new() and run() with a GDBusConnection for a session bus". Which implies the programmer needs to get the GDBusConnection _sync(). Which we already tell the programmer is bad.

I think it would be a lot better if GApplication had a vfunc

 static void
 export_objects (GApplication *application,
                 GDBusConnection *connection)
 {
   /* export objects here */
 }

that programmers can override. Then this is called as part of g_application_new().
Comment 19 David Zeuthen (not reading bugmail) 2011-01-27 19:44:17 UTC
(In reply to comment #18)
(you probably want s/new/register/ in the entire comment I made since it seems that no code is run as part of g_application_new() (which is a good thing!). If this is so, the vfunc should even be a signal so people can run that code without subclassing the GApplication type.)
Comment 20 Mikkel Kamstrup Erlandsen 2011-01-27 19:46:36 UTC
(In reply to comment #5)
> Another solution is to make GApplication avoid invading on the D-Bus namespace
> (by prefixing org.gtk.Private.Whatever). Heck, it just happens to do so on
> Linux right now but that just seems to be an implementation detail (that
> Mikkel's app seems to be relying on).
> 
> Such a change a) might break some apps relying on this impl detail; and b) will
> increase IPC cost in terms of having the user acquire another name; but I don't
> think that should hold us back.

Yeah, I was playing around discovered that it grabs the exact name I passed to the GApplication constructor. To me it is highly convenient that it does it this way - it also leaves me the option to choose a different name if I want that. If the name was mangled I would not have the option to use the exact string as I passed it in.

Considering this I think the right choice for this particular issue is to make the current behaviour a documented behaviour as well.
Comment 21 Colin Walters 2011-01-27 19:52:30 UTC
(In reply to comment #18)
> Thinking more about it: Suppose we did add _freeze_method_dispatch() and
> _thaw_method_dispatch(), how would GApplication actually use it?
> 
> Well, it would call freeze() in new() and thaw() in run(). Then GApplication
> would also need to say in the docs "if you want to register D-Bus do it between
> new() and run() with a GDBusConnection for a session bus".

For objects associated with the claimed bus name, yes.  

> Which implies the
> programmer needs to get the GDBusConnection _sync(). Which we already tell the
> programmer is bad.

It's not at all bad since GApplication itself already got a connection, just like, you know, GtkApplication makes a connection to the X server *synchronously*, like gtk_init() does?

Seriously, for pretty much EVERY application, connecting to the session bus and the X server is NOT the bottleneck or even important.  Firefox hackers have been doing a lot of work looking at the dynamic linker, which you know is a lot of synchronous I/O.

I know you're really proud of GDBus synchronous capability, but it just *doesn't* matter for the initial connection.  We've talked about this like 3 times now.  You can change my mind if you come back with at least ONE application that could gain any kind of speed up by connecting to the session bus asynchronously that also uses gtk+.

> I think it would be a lot better if GApplication had a vfunc
> 
>  static void
>  export_objects (GApplication *application,
>                  GDBusConnection *connection)
>  {
>    /* export objects here */
>  }
> 
> that programmers can override. Then this is called as part of
> g_application_new().

This has a reverse problem; how does it apply to non-Linux platforms?
Comment 22 David Zeuthen (not reading bugmail) 2011-01-27 19:56:48 UTC
(In reply to comment #20)
> Considering this I think the right choice for this particular issue is to make
> the current behaviour a documented behaviour as well.

Yes. One thing I realize is that if GApplication does that (which it probably should), then it also need to document when it's a good time to add D-Bus objects. The freeze()/thaw() proposal would say "do it between register() and run()" while the vfunc/signal proposal would say "do it in the vfunc".

Interestingly enough, if GApplication specified "do it between register() and run()" then it would be hard to also implement using libdbus-1 unless libdbus-1 also have ways to freeze()/thaw() incoming method calls. This is because the app is free to run the mainloop between register() and run().
Comment 23 David Zeuthen (not reading bugmail) 2011-01-27 20:00:29 UTC
(In reply to comment #21)
> > Which implies the
> > programmer needs to get the GDBusConnection _sync(). Which we already tell the
> > programmer is bad.
> 
> It's not at all bad since GApplication itself already got a connection, just
> like, you know, GtkApplication makes a connection to the X server
> *synchronously*, like gtk_init() does?
> 
> Seriously, for pretty much EVERY application, connecting to the session bus and
> the X server is NOT the bottleneck or even important.  Firefox hackers have
> been doing a lot of work looking at the dynamic linker, which you know is a lot
> of synchronous I/O.
> 
> I know you're really proud of GDBus synchronous capability, but it just
> *doesn't* matter for the initial connection.  We've talked about this like 3
> times now.  You can change my mind if you come back with at least ONE
> application that could gain any kind of speed up by connecting to the session
> bus asynchronously that also uses gtk+.

Sure. But it still looks ugly. And, personally, I often grep code for _sync() to just get an idea if they're doing risky stuff. It breaks that.

> > I think it would be a lot better if GApplication had a vfunc
> > 
> >  static void
> >  export_objects (GApplication *application,
> >                  GDBusConnection *connection)
> >  {
> >    /* export objects here */
> >  }
> > 
> > that programmers can override. Then this is called as part of
> > g_application_new().
> 
> This has a reverse problem; how does it apply to non-Linux platforms?

I think it would be called with a NULL for the GDBusConnnection so apps only tested on Linux fail early.
Comment 24 Colin Walters 2011-01-27 20:00:35 UTC
(In reply to comment #19)
> (In reply to comment #18)
> (you probably want s/new/register/ in the entire comment I made since it seems
> that no code is run as part of g_application_new() (which is a good thing!). If
> this is so, the vfunc should even be a signal so people can run that code
> without subclassing the GApplication type.)

Can you reply to the suggestion of a modified _call_sync_and_quiesce() or _call_sync_hold_queue()?  The semantics are simply that after this function is called, the worker thread will pause.  The next function that mutates the connection, like g_dbus_connection_send_ will unblock it.

Then all we need to do is change gapplicationimpl-dbus.c to use this, and the application author is free to use GDBus (or not) before _run() as they wish in a race-free way.
Comment 25 Mikkel Kamstrup Erlandsen 2011-01-27 20:03:31 UTC
I like the freeze/thaw approach a lot since it also opens up for fiddling with
the connection for other purposes. It's also in line with exiting GObject
idioms, so bonus points there.

Regarding the export_objects() vfunc I believe you can already use the
"startup" signal in a similar way given that the thaw() happens after
GApplication emits that signal ("startup" is emitted from inside register()). But as David said between register() and run() would also be nice to document if we end up supporting that.

@David: I'll look into your G_DBUS_DEBUG=all request once I've had some sleep
:-)
Comment 26 David Zeuthen (not reading bugmail) 2011-01-27 20:04:34 UTC
(In reply to comment #24)
> Can you reply to the suggestion of a modified _call_sync_and_quiesce() or
> _call_sync_hold_queue()?  The semantics are simply that after this function is
> called, the worker thread will pause.  The next function that mutates the
> connection, like g_dbus_connection_send_ will unblock it.
> 
> Then all we need to do is change gapplicationimpl-dbus.c to use this, and the
> application author is free to use GDBus (or not) before _run() as they wish in
> a race-free way.

Wouldn't this break the case where an app needs to do IPC as part of exporting its objects? (not entirely impossible - consider a small applet talking to Rhythmbox to get your song library.)
Comment 27 David Zeuthen (not reading bugmail) 2011-01-27 20:09:32 UTC
(In reply to comment #25)
> I like the freeze/thaw approach a lot since it also opens up for fiddling with
> the connection for other purposes. It's also in line with exiting GObject
> idioms, so bonus points there.

You can still do that with the vfunc. In fact, in the vfunc you are handed the GDBusConnection so you don't have to have this dreaded hacky line of code

 connection = g_bus_get_sync (G_BUS_TYPE_SESSION, NULL, NULL);

which is wrong and ugly in at least three ways!

Actually freeze()/thaw() limits you from handling incoming method calls at all. Which might break some setups. For example, this includes agent-based setups where the service you are poking calls into you. PolicyKit agents are an example of that.

> 
> Regarding the export_objects() vfunc I believe you can already use the
> "startup" signal in a similar way given that the thaw() happens after
> GApplication emits that signal ("startup" is emitted from inside register()).
> But as David said between register() and run() would also be nice to document
> if we end up supporting that.

Bingo. So we can use ::startup for this! And we can make the GDBusConnection available via a getter().

> @David: I'll look into your G_DBUS_DEBUG=all request once I've had some sleep
> :-)

No need for that!
Comment 28 Colin Walters 2011-01-27 20:14:33 UTC
(In reply to comment #26)
> (In reply to comment #24)
> > Can you reply to the suggestion of a modified _call_sync_and_quiesce() or
> > _call_sync_hold_queue()?  The semantics are simply that after this function is
> > called, the worker thread will pause.  The next function that mutates the
> > connection, like g_dbus_connection_send_ will unblock it.
> > 
> > Then all we need to do is change gapplicationimpl-dbus.c to use this, and the
> > application author is free to use GDBus (or not) before _run() as they wish in
> > a race-free way.
> 
> Wouldn't this break the case where an app needs to do IPC as part of exporting
> its objects? (not entirely impossible - consider a small applet talking to
> Rhythmbox to get your song library.)

I'm not sure what your scenario here is, but I realized my suggestion has the opposite problem - if the app doesn't do any other GDBus stuff, and it just calls g_main_loop_run(), then it would never process GDBus messages.

I'm not sure here.  It's really a big change for GDBus to process the messages in the worker thread for applications that are coming from dbus-glib.  It's really good in some ways, but since GDBus is the first binding with this design, it's the first time we've had to think about race conditions like around name ownership.

I guess GApplication could queue an idle on the mainloop which would call g_dbus_connection_unquiesce()?  Implying we do need that function =(
Comment 29 Mikkel Kamstrup Erlandsen 2011-01-28 05:42:31 UTC
(In reply to comment #13)
> ...
> Now I remember: the problem is that the worker thread still processes the
> incoming method call - since no-one has registered anything, it rejects it. The
> worker thread needs to do this because it needs to know what mainloop to
> deliver the incoming call to.

I guess the surprising thing here, if I understand correctly, is that GDBus will process messages without ever hitting your mainloop. Sure, messages that go to objects you've registered will be held back, like in dbus-glib, but all others will be rejected "Return to sender. Address unknown. No such number. No such zone.".

Looking at distribute_method_call() in gdbusconnection.c it looks like the error case is the only codepath where we don't hit the mainloop. Whenever we can resolve the path we enter the mainloop to do the rest. Why not enter the mainloop under any circumstance and wait to resolve the path inside the mainloop? Seems like that would solve all our problems here - and be the least surprising for people in general?
Comment 30 David Zeuthen (not reading bugmail) 2011-01-28 13:50:32 UTC
(In reply to comment #29)
> (In reply to comment #13)
> > ...
> > Now I remember: the problem is that the worker thread still processes the
> > incoming method call - since no-one has registered anything, it rejects it. The
> > worker thread needs to do this because it needs to know what mainloop to
> > deliver the incoming call to.
> 
> I guess the surprising thing here, if I understand correctly, is that GDBus
> will process messages without ever hitting your mainloop. Sure, messages that
> go to objects you've registered will be held back, like in dbus-glib, but all
> others will be rejected "Return to sender. Address unknown. No such number. No
> such zone.".

That is correct.

> Looking at distribute_method_call() in gdbusconnection.c it looks like the
> error case is the only codepath where we don't hit the mainloop. Whenever we
> can resolve the path we enter the mainloop to do the rest. Why not enter the
> mainloop under any circumstance and wait to resolve the path inside the
> mainloop? Seems like that would solve all our problems here - and be the least
> surprising for people in general?

The problem with this is answering the question "what mainloop should we enter?".

The theme in GDBus is that whenever you do something async (async method call, incoming signals, incoming method invocations) we deliver the completion in the thread-default mainloop from where you requested the async operation.

Btw, having a library rely on the user-app to run a main-loop can make things weird and awkward - look at all libdbus-1 for all the problems that can cause you. In particular, it makes it hard to sensibly use libdbus-1 in a thread from a library. Since this is something we wanted to work with GDBus, this is why it works this way.

(Also: your suggestion it wouldn't solve all our problems since we would start dispatching as soon as the user hits the mainloop - which is probably unwanted for some apps. Just the same way a libdbus-1 based implementation would suffer from these problems.)
Comment 31 David Zeuthen (not reading bugmail) 2011-01-28 15:40:00 UTC
Btw, a problem with the vfunc/signal solution occurred to me while driving into work: you only want to export objects if you are the primary name owner.

I also thought of another solution: Before acquiring the name, GApplication can set up a filter function that removes (and queues up) all incoming messages intended for the name in question. When run() is called (or if we fail to acquire the name), the queue is reposted to the connection.

(OK, so this requires a new GDBusConnection method to repost messages. But that's easy to add and certainly no worse than filter functions.)
Comment 32 Colin Walters 2011-01-28 15:50:09 UTC
(In reply to comment #31)
> Btw, a problem with the vfunc/signal solution occurred to me while driving into
> work: you only want to export objects if you are the primary name owner.
> 
> I also thought of another solution: Before acquiring the name, GApplication can
> set up a filter function that removes (and queues up) all incoming messages
> intended for the name in question. When run() is called (or if we fail to
> acquire the name), the queue is reposted to the connection.

This really doesn't seem special to GApplication though.  It's a generic problem with GDBus and name ownership.  

Another possibility is a straightforward synchronous dbus name acquisition function that does all that stuff internally (actually, just the equivalent of dbus_bus_request_name() from libdbus-1).
Comment 33 David Zeuthen (not reading bugmail) 2011-01-28 16:00:25 UTC
(In reply to comment #32)
> (In reply to comment #31)
> > Btw, a problem with the vfunc/signal solution occurred to me while driving into
> > work: you only want to export objects if you are the primary name owner.
> > 
> > I also thought of another solution: Before acquiring the name, GApplication can
> > set up a filter function that removes (and queues up) all incoming messages
> > intended for the name in question. When run() is called (or if we fail to
> > acquire the name), the queue is reposted to the connection.
> 
> This really doesn't seem special to GApplication though.  It's a generic
> problem with GDBus and name ownership.  

No. GDBus already provides suitable API (g_bus_own_name()) to prevent this. There is no "generic problem" here with GDBus, sorry. Only problems with brining assumptions from dbus-glib/libdbus over.

> Another possibility is a straightforward synchronous dbus name acquisition
> function that does all that stuff internally (actually, just the equivalent of
> dbus_bus_request_name() from libdbus-1).

As previously explained this will not work. The user can still run his mainloop between register() and new() - in fact, anything can happen because the app may user libraries doing weird/crazy stuff to establish the set of objects to export.
Comment 34 Colin Walters 2011-01-28 16:31:56 UTC
(In reply to comment #33)
>
> No. GDBus already provides suitable API (g_bus_own_name()) to prevent this.

Well, it ended up being basically special cased in that API via the "bus_acquired_handler".

> There is no "generic problem" here with GDBus, sorry. Only problems with
> brining assumptions from dbus-glib/libdbus over.

But it's synchronous, so we can't use it.

> > Another possibility is a straightforward synchronous dbus name acquisition
> > function that does all that stuff internally (actually, just the equivalent of
> > dbus_bus_request_name() from libdbus-1).
> 
> As previously explained this will not work. 

How is it different whether the code for filtering and reposting messages happens from GApplication's name acquisition or a g_bus_own_name_sync () API?

> The user can still run his mainloop
> between register() and new() - in fact, anything can happen because the app may
> user libraries doing weird/crazy stuff to establish the set of objects to
> export.

You mean between register() and run()?  Sure, if you run the main loop, I would expect this kind of thing to happen.  What I want is for _run() to just be a mostly direct wrapper around g_main_loop_run(), as it is now.

What libraries are we talking about?  Do you have any in mind?
Comment 35 David Zeuthen (not reading bugmail) 2011-01-28 16:49:58 UTC
(In reply to comment #34)
> > > Another possibility is a straightforward synchronous dbus name acquisition
> > > function that does all that stuff internally (actually, just the equivalent of
> > > dbus_bus_request_name() from libdbus-1).
> > 
> > As previously explained this will not work. 
> 
> How is it different whether the code for filtering and reposting messages
> happens from GApplication's name acquisition or a g_bus_own_name_sync () API?

What exactly do you mean with a g_bus_own_name_sync() API?

Here's the thing: due to the way GDBus works, you _need_ to export your objects before even requesting the name! A g_bus_own_name_sync() API (using the filter/repost mentioned above) could look like this (w/o error handling):

 cookie = g_bus_own_name_sync ("org.foo.SomeName", cancelable, &error);

 /* register objects here - it is safe run the mainloop etc. */

 g_bus_own_name_sync_start_serving (cookie);

or it could look like this

 static void on_name_acquired (...)
 {
   /* register objects here - it is safe run the mainloop etc. */
 }

 g_bus_own_name_sync ("org.foo.SomeName",
                      on_name_acquired, data,
                      destroy_notify,
                      cancelable,
                      &error);

which isn't really that nice in the first place. And it doesn't really handle acquiring/losing the name on the fly. And since we've already have a _much_ nicer asynchronous API that also work with acquiring/losing the name on the fly, I think it would be a step back to have something like g_bus_own_name_sync().

Now, GApplication is a pretty special case of g_bus_own_name() since you do not have to burden the app programmer with things like acquiring/losing name ownership.

(Side note: Maybe the difference between you and I is that I like code in callbacks and you like to have it all in main()! :-) )

> > The user can still run his mainloop
> > between register() and new() - in fact, anything can happen because the app may
> > user libraries doing weird/crazy stuff to establish the set of objects to
> > export.
> 
> You mean between register() and run()?

Yes, sorry.

>  Sure, if you run the main loop, I would
> expect this kind of thing to happen.  What I want is for _run() to just be a
> mostly direct wrapper around g_main_loop_run(), as it is now.
> 
> What libraries are we talking about?  Do you have any in mind?

No, but someone will do it sooner or later (and FTR I've seen lots of code "blocking" in a mainloop GtkDialog-style) and then wonder why their app fails 1% of the time due to the race. And since it's so hard to debug it will never get fixed. It's much better to avoid this problem in the first place.

Our core libraries need to be super robust.
Comment 36 David Zeuthen (not reading bugmail) 2011-01-28 18:07:25 UTC
Created attachment 179535 [details] [review]
Half-done patch

Here's a half-done patch that illustrates the idea.

(Because of how the D-Bus protocol works (bus rewrites well-known name to unique name) we have to hold _all_ incoming messages, not just messages intended for well-known name (e.g. GApplication's name-id). We probably had to do so anyway because clients may very well be invoking methods on the unique name, not the well-known name.)
Comment 37 David Zeuthen (not reading bugmail) 2011-01-28 18:15:43 UTC
Actually, this is turning into a bigger and bigger mess given the observation in comment 36 about well-known vs unique names.

Perhaps, at this point, I would recommend just adding to the GApplication docs that the application must export all of its D-Bus objects prior to calling g_application_register(). The down-side, of course, is that if there's already an existing instance then the process will export objects and then die. In reality I don't think that's a problem.
Comment 38 Mikkel Kamstrup Erlandsen 2011-01-29 07:09:09 UTC
(In reply to comment #37)
> ...
> Perhaps, at this point, I would recommend just adding to the GApplication docs
> that the application must export all of its D-Bus objects prior to calling
> g_application_register(). The down-side, of course, is that if there's already
> an existing instance then the process will export objects and then die. In
> reality I don't think that's a problem.

In that case GApplication would be pretty much useless for my particular purpose. The objects I set up are very expensive. If I had to refactor these objects to do the expensive stuff lazily or something, it would be massively easier to do all the DBus magic manually. Although that actually seems to be impossible with the current GDBus API...
Comment 39 David Zeuthen (not reading bugmail) 2011-01-29 13:52:43 UTC
(In reply to comment #38)
> (In reply to comment #37)
> > ...
> > Perhaps, at this point, I would recommend just adding to the GApplication docs
> > that the application must export all of its D-Bus objects prior to calling
> > g_application_register(). The down-side, of course, is that if there's already
> > an existing instance then the process will export objects and then die. In
> > reality I don't think that's a problem.
> 
> In that case GApplication would be pretty much useless for my particular
> purpose. The objects I set up are very expensive. If I had to refactor these
> objects to do the expensive stuff lazily or something, it would be massively
> easier to do all the DBus magic manually. Although that actually seems to be
> impossible with the current GDBus API...

You can easily check whether there's a name owner by invoking NameHasOwner() on the bus and exit if this is so.
Comment 40 Mikkel Kamstrup Erlandsen 2011-01-30 07:21:19 UTC
(In reply to comment #39)
> ...
> You can easily check whether there's a name owner by invoking NameHasOwner() on
> the bus and exit if this is so.

Doh :-) Amidst thinking about threads and gmaincontexts the obvious solution didn't cross my mind at all - thanks David ;-)
Comment 41 Colin Walters 2011-01-31 14:21:17 UTC
(
> You can easily check whether there's a name owner by invoking NameHasOwner() on
> the bus and exit if this is so.

True; but this doesn't help acquiring the name in a non-racy fashion.  Think about a GNOME 2 scenario where the user double clicks on the panel launcher icon for Firefox.
Comment 42 David Zeuthen (not reading bugmail) 2011-01-31 15:16:34 UTC
(In reply to comment #41)
> (
> > You can easily check whether there's a name owner by invoking NameHasOwner() on
> > the bus and exit if this is so.
> 
> True; but this doesn't help acquiring the name in a non-racy fashion.  Think
> about a GNOME 2 scenario where the user double clicks on the panel launcher
> icon for Firefox.

It would work just fine - it just means that both processes will export the objects (NameHasOwner() returns false for both so each process starts doing work to export objects) but only one will acquire the name (so one of the processes will have done work in vain).

(In fact, I think the only race this "optimization" is susceptible to, is the one where a name owner exits just after NameHasOwner() has returned true to the other process. IIRC D-Bus activation is also susceptible to this, so the difference is the same.)
Comment 43 Colin Walters 2011-01-31 16:23:08 UTC
We discussed this in person and came to the conclusion that:

1) GDBus would gain a way to hold incoming method calls, and to release the hold
2) GApplication would hold method calls before requesting its name, and release the hold in a high priority idle on the default main context

(Another extreme possibility is to refactor GDBus to only run multiple threads if it's needed which is apparently how ORBit behaves.  In other words, the first time we get an API call from a thread other than the default we migrate to a worker thread, but the complexity hit is insane, so we agreed it doesn't make sense to try that now)
Comment 44 Allison Karlitskaya (desrt) 2011-01-31 16:49:05 UTC
I'd like to point out that (other than the dubious claim of bad behaviour in terms of claiming the bus name the user requested), GApplication is really completely innocent here.  It behaves itself in the only way that it possibly can:

  - registers its own object path first
  - then requests the bus name
  - unregisters the object path, if it didn't work

Internally, GApplication is quite race-free in this regard.  I'm actually slightly annoyed that it's been implicated in this bug at all (unless someone can prove otherwise).

The problem comes when users of GApplication expect libdbus-1 behaviour with respect to what they see as a name acquisition (ie: registering a GApplication with a given id).

GDBus is not strictly buggy either.  I'm actually OK with what GDBus is doing.  It's certainly inconvenient for all users (GApplication not excluded), but it's reasonably well-documented and quite possible to work with if you do some extra steps.

If we want to make GDBus nicer for users, it should probably not involve changes that are specific to GApplication.  This should be a global improvement and it should be somewhat 'automatic' (in the sense that you get the expected behaviour by doing the most obvious thing, without extra tricks).

Mind you, GApplication is slightly unusual in that it does the RequestName call directly for itself.  If the nice 'automatic' behaviour only occurs when going through GDBus's normal name ownership routines, then I'd be happy to switch GApplication to use them (even though it would be somewhat difficult due to the async vs. sync impedance mismatch).

I don't think that the solution to this problem lies in removing the worker thread in single-threaded situations.  That actually worsens the problem somewhat: you get the behaviour that you want until some unrelated library in the same process does something in a thread, causing all of your assumptions to change.  It might be an interesting optimisation to have this single-thread behaviour, but it should only be seen as an optimisation and not a solution to any problems.
Comment 45 Allison Karlitskaya (desrt) 2011-01-31 16:59:48 UTC
btw: I didn't read David's patch until just now, so I didn't want to comment on the specific implementation before seeing what he had in mind.

My opinion of a "real" solution is quite simple (in terms of the API):

When a name is successfully acquired as a result of a call to g_bus_own_name() or equivalent, incoming messages to the new name are held until after the GBusNameAcquiredCallback has finished running.

That's it.

That gives the user a chance to setup their own objects before calls start being delivered (instead of the current situation where object paths need to be registered in advance even before we know if we can own the name).

From GApplication's standpoint it means that I could dispatch the "startup" signal from my GBusNameAcquiredCallback in order to give applications a chance to setup their own objects in a race-free way.
Comment 46 Allison Karlitskaya (desrt) 2011-01-31 17:04:25 UTC
note: if DBus doesn't tell us the original destination name (and from what I recall, it may not) then this may involve freezing all incoming method calls for the duration of that callback.
Comment 47 Colin Walters 2011-01-31 17:10:40 UTC
(In reply to comment #45)
> btw: I didn't read David's patch until just now, so I didn't want to comment on
> the specific implementation before seeing what he had in mind.
> 
> My opinion of a "real" solution is quite simple (in terms of the API):
> 
> When a name is successfully acquired as a result of a call to g_bus_own_name()
> or equivalent, incoming messages to the new name are held until after the
> GBusNameAcquiredCallback has finished running.

So you're suggesting then that actually g_bus_own_name() could lose the "bus_acquired_handler" argument conceptually?  (We can't remove it now obviously, but it would become redundant).
Comment 48 David Zeuthen (not reading bugmail) 2011-01-31 17:13:10 UTC
(In reply to comment #44)
> The problem comes when users of GApplication expect libdbus-1 behaviour with
> respect to what they see as a name acquisition (ie: registering a GApplication
> with a given id).

Exactly.

> If we want to make GDBus nicer for users, it should probably not involve
> changes that are specific to GApplication.  This should be a global improvement
> and it should be somewhat 'automatic' (in the sense that you get the expected
> behaviour by doing the most obvious thing, without extra tricks).

I actually disagree there's something wrong with GDBus (and slightly annoyed
that you are implying this). If you think about it, the g_bus_own_name() API
very strongly matches the "spirit" of how the message bus setup works, insofar
that it considers well-known names "unimportant". E.g. you can have 0, 1, 2,
..., N well-known names and you can acquire and lose the names whenever. These
names coming and going should definitely not interfere with whether your
process export any objects. That's exactly why GDBus *forces* a style on the
programmer that a) suggests exporting all objects when acquiring the bus
connection; and b) makes it easy to handle acquiring/losing the name.

If there's a problem then it's, IMO, that GApplication is written in a way so
the only reasonable thing is to expect libdbus-1 behavior (e.g. export objects
between register() and new()). Actually, an even bigger problem with 
Application is that it *nowhere mentions* anything about D-Bus at all.

> I don't think that the solution to this problem lies in removing the worker
> thread in single-threaded situations.  That actually worsens the problem

I agree.

Anyway, this is not a GDBus problem, so I'm reassigning back to GApplication. I
will continue working on the patch that I gave a snapshot of in comment 36.
Comment 49 David Zeuthen (not reading bugmail) 2011-01-31 17:14:13 UTC
(In reply to comment #45)
> btw: I didn't read David's patch until just now, so I didn't want to comment on
> the specific implementation before seeing what he had in mind.
> 
> My opinion of a "real" solution is quite simple (in terms of the API):
> 
> When a name is successfully acquired as a result of a call to g_bus_own_name()
> or equivalent, incoming messages to the new name are held until after the
> GBusNameAcquiredCallback has finished running.
> 
> That's it.
> 
> That gives the user a chance to setup their own objects before calls start
> being delivered (instead of the current situation where object paths need to be
> registered in advance even before we know if we can own the name).
> 
> From GApplication's standpoint it means that I could dispatch the "startup"
> signal from my GBusNameAcquiredCallback in order to give applications a chance
> to setup their own objects in a race-free way.

No.
Comment 50 David Zeuthen (not reading bugmail) 2011-01-31 17:18:39 UTC
(In reply to comment #49)
> (In reply to comment #45)
> > btw: I didn't read David's patch until just now, so I didn't want to comment on
> > the specific implementation before seeing what he had in mind.
> > 
> > My opinion of a "real" solution is quite simple (in terms of the API):
> > 
> > When a name is successfully acquired as a result of a call to g_bus_own_name()
> > or equivalent, incoming messages to the new name are held until after the
> > GBusNameAcquiredCallback has finished running.
> > 
> > That's it.
> > 
> > That gives the user a chance to setup their own objects before calls start
> > being delivered (instead of the current situation where object paths need to be
> > registered in advance even before we know if we can own the name).
> > 
> > From GApplication's standpoint it means that I could dispatch the "startup"
> > signal from my GBusNameAcquiredCallback in order to give applications a chance
> > to setup their own objects in a race-free way.
> 
> No.

(or to expand my rather obtuse (and somewhat rude) "No" comment: I don't want to conflate "bus acquired" with "name acquired" exactly because the former happens exactly once and the latter can happen many times.)
Comment 51 Allison Karlitskaya (desrt) 2011-01-31 18:00:04 UTC
The only change I'll make to GApplication (without changes to GDBus to allow me to do a better job) is to add documentation to explain the limitations implied by our use of GDBus and make it very clear what you have to do as the programmer to avoid the problem (ie: set your objects up before calling register).

I may also add some convenience API to make it nicer to get access to the GDBusConnection instead of having to acquire it for yourself (although this is made difficult again by the fact that this call would necessarily be blocking).

I'm definitely not willing to accept a patch of the nature that you started working on -- it's exceptionally hacky and implemented in entirely the wrong place.  If the goal is to form a queue of held-back GDBus messages then this should be done in GDBusConnection rather than some thing bolted onto the side of it.  I can't even imagine what the implications of two GApplications (or other people copying the idea) interacting with respect to each other might be in this situation...

(In reply to comment #48)
                               If you think about it, the g_bus_own_name() API
> very strongly matches the "spirit" of how the message bus setup works, insofar
> that it considers well-known names "unimportant". E.g. you can have 0, 1, 2,
> ..., N well-known names and you can acquire and lose the names whenever. These
> names coming and going should definitely not interfere with whether your
> process export any objects. That's exactly why GDBus *forces* a style on the
> programmer that a) suggests exporting all objects when acquiring the bus
> connection; and b) makes it easy to handle acquiring/losing the name.

Although I agree that what you're saying is true "if you think about it" and know a lot about the full features of name ownership on D-Bus, this is clearly not how the average programmer views DBus.  90% of use cases are this simple:

if (acquire_name() == success)
  {
    yay!
    register_my_objects();
    do_stuff();
    etc();
  }
else
  {
    uh oh :(
    exit ();
  }

An act of API pragmatism would go a long way in helping these users so that we wouldn't have to explain to them the finer points of how their simplified view (which is correct for their special case) isn't really how it works in general.

(In reply to comment #50)
> (or to expand my rather obtuse (and somewhat rude) "No" comment: I don't want
> to conflate "bus acquired" with "name acquired" exactly because the former
> happens exactly once and the latter can happen many times.)

I'm approaching this problem from the viewpoint of registering a bus name on an existing connection -- not the convenience wrapper that both acquires a connection and registers a name on it.  From where I'm looking, there is no bus_acquired callback so there is no confusion.

In any case, if the connection already exists then the bus_acquired callback is just done immediately -- which means that it is approximately equivalent to just having made the object registrations before attempting to acquire the name (which is really what we're ending up telling everyone to do anyway).


If your intention is to add a general-purpose g_dbus_connection_freeze_method_calls() API then I think that would go a long way to helping people who expect libdbus-1 style behaviour to avoid issues when using GDBus.  GApplication could make use of this API to implement the "register from startup()" policy I mentioned above (ie: only thawing the queue after the user's 'startup' function returned).

Otherwise, I think this is a dead bug.
Comment 52 Allison Karlitskaya (desrt) 2011-01-31 18:25:15 UTC
I did this commit for GApplication.

commit 9f8798170dba82b8d46de02ab46105cf61b41f87
Author: Ryan Lortie <desrt@desrt.ca>
Date:   Mon Jan 31 13:19:59 2011 -0500

    GApplication: add notes about GDBus architecture
    
    Some people are trying to write code that calls g_application_register()
    then checks to see if we became the primary name owner before exporting
    objects.  This sort of approach worked with libdbus-1 because method
    calls to the freshly-acquired name would not be dispatched until the
    application returned to the mainloop.  With GDBus, however, dispatches
    can occur at any time (including in the brief space between acquiring
    the name and actually registering the object).
    
    Add documentation to make it clear that you should not expect this to
    work.

The issue is closed there, as far as I'm concerned.


If we're going to implement the ability to freeze incoming messages on a GDBusConnection, we should do so (and open another bug so that GApplication can make use of this new feature).  If not, please close this bug now.
Comment 53 David Zeuthen (not reading bugmail) 2011-01-31 18:30:42 UTC
(In reply to comment #52)
> I did this commit for GApplication.

Looks good to me. People don't read docs, however, so we should probably make GApplication make use of future GDBus API like 

 g_dbus_connection_freeze_incoming_method_invocation()
 g_dbus_connection_thaw_incoming_method_invocation()

I'm going to repurpose this bug for that.
Comment 54 Colin Walters 2011-01-31 18:40:07 UTC
(In reply to comment #44)
>
> I don't think that the solution to this problem lies in removing the worker
> thread in single-threaded situations.  That actually worsens the problem
> somewhat: you get the behaviour that you want until some unrelated library in
> the same process does something in a thread, causing all of your assumptions to
> change.  It might be an interesting optimisation to have this single-thread
> behaviour, but it should only be seen as an optimisation and not a solution to
> any problems.

Well...the argument here is that it seems to me unlikely that an application uses GDBus from a non-main thread in early startup, either before they export objects or while they're still doing so.  And if we found one, well we could just tell them that threaded GDBus use introduces a side effect.

Note actually I just realized that there is a simpler thing than refactoring GDBus to use only one thread; it could have a "default dispatch context" instead.  Then GApplication could do:

g_dbus_set_default_dispatch_context (connection, NULL);

The idea here is that GDBus uses that until a request on a different context comes in.  When it does, it switches back to dispatching in the worker thread context.
Comment 55 Allison Karlitskaya (desrt) 2011-02-02 17:27:18 UTC
Another similar idea I had with Mikkel and just discussed on IRC is this:

What this whole problem reduces to is a very specific class of message: those that generate automated error replies from GDBus saying "unknown object path".  We could deal with those in a better way:

Instead of immediately returning the error message from the worker thread, we could send it to the global default mainloop.  The error would be scheduled to dispatch from there instead.  No changes so far (except a slight delay added for returning error messages -- I doubt anybody loses any sleep over this).

Now here's the change: before we actually send the error message, we do a double-check.  If the target object has been created in the meantime then we are now able to skip the error and dispatch the message in the usual way.

That means that the code example given in the original report would play out this way:

 - register() is called.  We get the name.
 - D-Bus starts sending us messages
 - the object path doesn't exist yet, so we *queue* an error message for
   dispatch from the global default mainloop
 - back in the main thread, register() is returned and the user registers
   some object paths
 - now the mainloop runs, and the error message dispatch is sitting there
   already; but we can see that the double-check will succeed because by
   this point the user's object has been registered
 - profit

Essentially, at the cost of delaying error messages for a short time, we gain the advantage of making safe a whole class of (natural) behaviour for a very wide range of users (ie: anybody working in the main thread).

It also means that we don't need to introduce any freezing or thawing APIs.

It's also more robust than the solution proposed in comment #54 because there is no side effects introduced by a change-over point.  The approach I'm proposing would continue to work after program initialisation (as long as you do it from the main thread).

It also means that absolutely no rewriting of code would need to occur anywhere (ie: no need to move object registration into the GApplication 'startup' handler).

Really, the only disadvantage here is that we slightly delay the sending of an error message (for the chance to turn it into a non-error).
Comment 56 Colin Walters 2011-04-07 17:37:53 UTC
I took this one up to the dbus list: http://lists.freedesktop.org/archives/dbus/2011-April/014299.html
Comment 57 Simon McVittie 2011-10-04 10:30:14 UTC
(In reply to comment #53)
> (In reply to comment #52)
> > I did this commit for GApplication.
> 
> Looks good to me. People don't read docs, however, so we should probably make
> GApplication make use of future GDBus API like 
> 
>  g_dbus_connection_freeze_incoming_method_invocation()
>  g_dbus_connection_thaw_incoming_method_invocation()
> 
> I'm going to repurpose this bug for that.

Note that if you freeze only a subset of D-Bus messages, you've just removed D-Bus' "no re-ordering" guarantee, for instance (as in this case) by promoting signals/replies ahead of method calls.

Parts of Telepathy, at least, rely on this not being done (although admittedly they might only need this after application startup).
Comment 58 David Zeuthen (not reading bugmail) 2011-10-04 11:12:05 UTC
(In reply to comment #57)
> (In reply to comment #53)
> > (In reply to comment #52)
> > > I did this commit for GApplication.
> > 
> > Looks good to me. People don't read docs, however, so we should probably make
> > GApplication make use of future GDBus API like 
> > 
> >  g_dbus_connection_freeze_incoming_method_invocation()
> >  g_dbus_connection_thaw_incoming_method_invocation()
> > 
> > I'm going to repurpose this bug for that.
> 
> Note that if you freeze only a subset of D-Bus messages, you've just removed
> D-Bus' "no re-ordering" guarantee, for instance (as in this case) by promoting
> signals/replies ahead of method calls.

That's more a libdbus guarantee isn't it? And it doesn't apply if you are using blocking method calls, neither does it apply for gdbus in a single-threaded environment, e.g. g_dbus_connection_send_message_with_reply_sync() will return the method-reply message and hold e.g. signals received until the mainloop gets to run.

This is why some libraries have functions like this

 http://people.freedesktop.org/~david/udisks2-20110929/UDisksClient.html#udisks-client-settle

that basically does

  while (g_main_context_iteration (client->context, FALSE /* may_block */))
    ;

For example, if you do

 udisks_filesystem_mount_sync (fs, ...);

then you need to run settle() in order for the Filesystem::MountPoints property to be updated - but if you do it you are guaranteed that it is always actually updated to the mount point where the fs was mounted. This is because the ::PropertiesChanged signal is received before the method reply (the GDBus service-side uses g_dbus_interface_skeleton_flush() to that this is so).

On the other hand, if you do your sync call in a separate thread then the signals are delivered to other threads before the method reply is received.

I think that as long as you have an API with sync calls then you cannot have any ordering guarantees.

We could have something like g_dbus_connection_settle_sync (and also an async version) that blocks the calling thread until all received messages (at the point of calling settle()) have been delivered. But I'm not sure it's generally useful...
Comment 59 Simon McVittie 2011-10-04 12:09:58 UTC
(In reply to comment #58)
> (In reply to comment #57)
> > Note that if you freeze only a subset of D-Bus messages, you've just removed
> > D-Bus' "no re-ordering" guarantee
> 
> That's more a libdbus guarantee isn't it?

At the IPC level (i.e. across process boundaries), it's a guarantee that dbus-daemon provides. Conceptually, there's a single stream of messages, consisting of the streams of messages from each sender (in the order they were sent), interleaved in an arbitrary-but-consistent way. Something like dbus-monitor, with a catch-all match rule and eavesdropping, receives the entire merged stream; normal receivers get some subset of that stream according to their match rules, in the same order that dbus-monitor would have seen.

I think it's a valuable guarantee to have: in my experience, developers typically assume that total ordering applies. (Telepathy Tubes, when used over link-local multicast XMPP, only provide causal ordering - which is feasible to implement over multicast, but turns out to be relatively difficult to explain and understand.)

The client-library part of libdbus maintains the total ordering provided to it by dbus-daemon, unless you use an _and_block-style call. I believe GDBus is currently equivalent in that respect?

The D-Bus Specification doesn't currently document that a total ordering is provided, which I think is a bug.

> I think that as long as you have an API with sync calls then you cannot have
> any ordering guarantees.

Right, but with libdbus-style sync calls (what I call "pseudo-blocking" in <http://smcv.pseudorandom.co.uk/2008/11/nonblocking/>), the reply to that single method call "jumps the queue" and nothing else is re-ordered; so using the _sync method is effectively saying "I might see this reply earlier in the sequence than I ought to, and I'm OK with that". I believe GDBus' use of a private GMainContext is functionally equivalent to libdbus' pseudo-blocking.

Contrast this with freezing large-but-not-universal equivalence classes of messages, in which case the re-ordering extends further (to independent modules, potentially).
Comment 60 Allison Karlitskaya (desrt) 2013-10-19 00:55:31 UTC
It seems that we decided to do nothing about this beyond the GApplication docs change.  If anyone else disagrees, please reopen -- with a patch. :)