After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 726259 - service-side: replies from standard or unimplemented interfaces "jump the queue"
service-side: replies from standard or unimplemented interfaces "jump the queue"
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: gdbus
2.39.x
Other Linux
: Normal normal
: ---
Assigned To: David Zeuthen (not reading bugmail)
gtkdev
Depends on:
Blocks:
 
 
Reported: 2014-03-13 16:46 UTC by Simon McVittie
Modified: 2018-05-24 16:21 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Simon McVittie 2014-03-13 16:46:38 UTC
To reproduce:

service-side
* have an object that exports interface A, but not interface B
* implement the method A.M to return a result as quickly as possible

client-side
* call A.M
* call B.M2 or Introspectable.Introspect()

Expected result:

* reply from A.M is received before error reply from B.M2
  or successful reply from Introspectable.Introspect()

Actual result:

* reply from B.M2 or Introspect "jumps the queue" and is sent immediately,
  whereas A.M is processed in an idle

The standard Peer interface behaves the same, but that's not necessarily unexpected, because Peer is already weird.

If the object does not reimplement Properties, bits of it behave similarly.

If GDBus consistently used schedule_method_call() to implement every method call, then the order would remain consistent. I think consistency is more important than optimization for these interfaces, which are not fast-path stuff - the slower path has to be taken anyway for the actually-useful code, like calls to methods that have a non-trivial implementation, or getting the properties of interfaces that have them.

This is pretty annoying in Telepathy, which I'm currently trying to port to GDBus. Our regression tests sometimes make a dummy method call which propagates through the client stack, across dbus-daemon, through the service stack and back to the client, so that they can assert that every message that should have arrived has arrived, and messages that should not have arrived have not. In telepathy-glib, we used Introspect() for this; similarly, in connection managers and Mission Control, we used calls to interfaces that don't exist. In both cases, dbus-glib went through the same dispatching sequence either way, but because GDBus "jumps the queue", we will now have to use a separate, appropriately-chosen dummy method for each object.
Comment 1 David Zeuthen (not reading bugmail) 2014-03-13 21:37:44 UTC
I absolutely think it makes sense to fix this as having Peer.Ping() work as a barrier is indeed a desirable property.
Comment 2 Simon McVittie 2014-03-14 12:09:29 UTC
(In reply to comment #1)
> I absolutely think it makes sense to fix this as having Peer.Ping() work as a
> barrier is indeed a desirable property.

I'll try to put a patch together. Do you think there's any chance it would be OK for GLib 2.40, if I can get it written promptly?

(If it would be, I'll do my best; if not, then I'm afraid working around it in Telepathy is a higher priority for me, because I want to get from "tests pass with dbus-glib" to "tests pass in minimum viable port to GDBus" with a minimum of brokenness in between.)

Note that Peer.Ping doesn't work as a barrier in at least dbus-glib, because it doesn't pass through the same one-message-per-main-loop-iteration queue that other messages do.

I think it's OK for Peer to be special, but I would really like unimplemented interfaces, Properties Introspectable to not be special in the same way.
Comment 3 Simon McVittie 2014-03-17 14:40:24 UTC
Unfortunately, it's easy for me to say "it should happen in an idle", but which main context should host that idle? It's the main context that was the thread-default at the time g_dbus_connection_register_object() was called - but that's per-interface, so a pathologically complicated object may have registered different interfaces to be called in different main-contexts. Similarly, the global default (NULL) main context might not even get dispatched, if the main thread is just blocking (or is a foreign event system, e.g. hosting GLib in an Android application, which I've done before).

If the object specifically implements Introspectable, then it behaves just like an ordinary interface. However, that's annoying to do, because it requires reimplementing GDBus' logic for introspection.

Similarly, if the object specifically implements Peer, then it behaves just like an ordinary interface. This is unlike libdbus, which specifically prevented object implementations from implementing Peer methods.

(Slight bug here: GDBus' implementation of Introspect() assumes that every object that reimplements Peer will reimplement Introspectable, and vice versa.)

Peer is not rocket science to implement, so it might be a good candidate for Telepathy to reimplement so that we have a nice barrier. For now, Telepathy still links to libdbus, so we could use dbus_get_local_machine_id(), but we'll want to stop doing that eventually. Would you be willing to consider making _g_dbus_get_machine_id() public in future?

Properties is a little more complicated. If the object specifically implements it (as most Telepathy objects do, due to dbus-glib design flaws, although long-term we'd like to stop that) then it behaves like an ordinary interface[1]. If not, then "no such interface" for unimplemented interfaces is immediate[2] (because it can't know which main context is right), and other errors are immediate[3], but calls to vtable members are scheduled in an appropriate main-context[4].


At the moment, my barrier implementation is to examine the client-side TpProxy to guess an interface that the service-side implementation ought to implement, then call GetAll(). I'm relying on the fact that the service-side object either implements Properties directly, in which case [1] saves us, or implements that interface, in which case [4] saves us. [2] is avoided by choosing an implemented interface, and [3] is avoided by the fact that GetAll doesn't have any other error conditions (it silently ignores unreadable properties).

I think the only part of this bug with an obvious concrete solution is: when validate_and_maybe_schedule_property_getset() raises an error (marked [3] above), it should defer it to the main context.
Comment 4 David Zeuthen (not reading bugmail) 2014-03-18 01:05:19 UTC
> Would you be willing to consider making g_dbus_get_machine_id() public in future?

Sure, that seems reasonable

Your points about "what GMainContext to dispatch to" are spot on - I didn't think it through.

I think it's reasonable to just say that it's undefined which GMainContext will be chosen but guarantee that it will be one of the GMainContext's used with register_object() for said path.

In practice we'd just pick the first hash-table entry of ExportedObject->map_if_name_to_ei - or if we want determinism, we pick the one whose name is lexicographically first.

ExportedSubtree is a bit easier because there's always only one GMainContext for an object path.

I *think* this is good enough, at least it's not worse than what we have today.
Comment 5 GNOME Infrastructure Team 2018-05-24 16:21:41 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/844.