After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 627974 - Floating reference headaches
Floating reference headaches
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gvariant
unspecified
Other Linux
: Normal normal
: ---
Assigned To: Allison Karlitskaya (desrt)
gtkdev
Depends on:
Blocks: 654394
 
 
Reported: 2010-08-25 18:29 UTC by David Zeuthen (not reading bugmail)
Modified: 2011-07-12 17:46 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
GVariant: add g_variant_take_ref() (9.22 KB, patch)
2011-07-11 12:38 UTC, Allison Karlitskaya (desrt)
none Details | Review
Oops. Failed to properly drop the assert on g_value_take_variant(). (9.18 KB, patch)
2011-07-11 13:24 UTC, Allison Karlitskaya (desrt)
none Details | Review

Description David Zeuthen (not reading bugmail) 2010-08-25 18:29:55 UTC
Right now g_variant_is_floating() says that this function should only be used for debug purposes. However, consider a function that returns a GVariant. A very reasonable programming contract could state

  Returns: A #GVariant with the value for @property_name or %NULL if
      @error is set. If the returned #GVariant is floating, it is
      consumed - otherwise its reference count is decreased by one.

so the user can write code like this

 static GVariant *
 get_property (...)
 {
   GVariant *ret;
   LOCK_CACHE();
   ..
   if (cache_result != NULL)
     ret = g_variant_ref (cache_result);
   else
     ret = g_variant_new_string ("Happy Hacking");
   UNLOCK_CACHE();
   return ret;
 }

  (nitpicking: Specifically we need to return a reference because
   the reference from the cache in the example above might be
   invalidated from another thread. So it's not good enough to
   return a reference that the get_property() function keeps
   owning.)

However, the only way to implement such semantics, is code like this

 value = vtable->get_property (...)
 if (g_variant_is_floating (value))
   g_variant_ref_sink (value);
 /* use value */
 g_variant_unref (value);

but this is in violation with what g_variant_is_floating() says.

Second, a number of GVariant functions simply doesn't specify whether

 - the returned GVariant is floating or not
 - the floating reference is consumed

Ideally every @param and Returns: would mention exactly what is happening.

(Bonus chatter: Third, I'm not even sure how to express the Returns: statement above in a way so it works with gobject-introspection. But it's pretty damn clear we need that kind of semantics.)
Comment 1 David Zeuthen (not reading bugmail) 2010-08-25 18:51:20 UTC
See

 http://git.gnome.org/browse/glib/commit/?id=0d0a9bb4485069a56caf139346e6a6aad81c4efd

for the GDBus fix (using is_floating()).
Comment 2 Dan Winship 2010-08-25 19:56:51 UTC
Ew. Floating references. The original gnio IP address classes used floating references, "for convenience", and it quickly started running into problems, where, eg, if you passed a floating GInetAddress as an argument to g_resolver_lookup_by_address(), then it would get freed when the method returned (because GResolver internally created and then destroyed a GInetSocketAddress using the passed-in GInetAddress, and the GInetSocketAddress would ref_sink and then unref the GInetAddress).

AFAICT, floating references only work well when, for every floating object, there is exactly 1 other object that could ref_sink it, and it's completely obvious to everyone what that object is. Eg, for GtkWidget, a widget's parent ref_sinks it, and everything else uses plain ref (and GtkWindow has to ref_sink itself, since it has no parent).

But GInetAddress didn't have that property (a GInetAddress can be reused multiple times), and neither does GVariant; any GVariant could be used either as a "toplevel" object, or as a subpart of some other variant, and when you return a variant from a method, you don't know what the caller is planning to do with it.

So, IMHO the right fix is to not use floating references in GVariant...


OTOH, for your particular problem:

(In reply to comment #0)
>    if (cache_result != NULL)
>      ret = g_variant_ref (cache_result);
>    else
>      ret = g_variant_new_string ("Happy Hacking");

the easy fix would be to ref_sink ret in the else clause, and then the returned variant is always non-floating. And likewise, other GVariant APIs could follow the same rule: unless the method is a constructor, it always returns non-floating variants.

(I'm not sure there's an easy API convention for the g_resolver_lookup_by_address()-like case though. You just have to note in the documentation that the argument will be ref_sinked and then unreffed, and force the caller to deal appropriately (by preemptively ref_sinking it itself if it's floating).)

>  if (g_variant_is_floating (value))
>    g_variant_ref_sink (value);
> ...
> but this is in violation with what g_variant_is_floating() says.

FTR, note that gsettings does this in one place as well...
Comment 3 David Zeuthen (not reading bugmail) 2010-08-25 20:38:50 UTC
(In reply to comment #2)
> Ew. Floating references. 

Yeah, it's a bit ew. Although... my experience with floating refs and GVariant has mostly been good... except for when I had to look in the source for whether the returned reference was floating - for e.g. _get_normal_form() the returned instance is not floating... (this is maybe not surprising if you are the GVariant author or think really hard about it (since if already normalized it returns a ref()...)).

Anyway, I really just think GVariant just needs better docs on what each function exactly does.

> >  if (g_variant_is_floating (value))
> >    g_variant_ref_sink (value);
> > ...
> > but this is in violation with what g_variant_is_floating() says.
> 
> FTR, note that gsettings does this in one place as well...

Right, this one

 GVariant *  (*GSettingsBindSetMapping)  (const GValue *value,
                                          const GVariantType *expected_type,
                                          gpointer user_data);

with the docs

 Returns : a new GVariant holding the data from value, or
           NULL in case of an error

is similar in memory management to how GDBusInterfaceGetPropertyFunc now works. 

As such, should add "If the returned GVariant is floating, it is consumed - otherwise its reference count is decreased by one." to GSettingsBindSetMapping...
Comment 4 Matthias Clasen 2010-08-25 20:53:38 UTC
One way to avoid the icky is_floating check for the caller is to allow unreffing the floating variant, I guess.
Comment 5 David Zeuthen (not reading bugmail) 2010-08-25 21:04:55 UTC
(In reply to comment #0)
>   Returns: A #GVariant with the value for @property_name or %NULL if
>       @error is set. If the returned #GVariant is floating, it is
>       consumed - otherwise its reference count is decreased by one.

Btw, since this has confused both Matthias and Colin, I'd like to point out that this is documentation for a Virtual Function, e.g. the main audience is the _callee_ of this function (e.g. implementor), not the _caller_.

If it was a regular function, then the semantics would have been locked down, e.g. the docs would either state

 Returns: A #GVariant. Free with g_variant_unref().

or

 Returns: A floating #GVariant. Free with g_variant_unref().

so the user knows exactly what to do.
Comment 6 Benjamin Otte (Company) 2010-08-26 17:06:53 UTC
(In reply to comment #0)
>  static GVariant *
>  get_property (...)
>  {
>    GVariant *ret;
>    LOCK_CACHE();
>    ..
>    if (cache_result != NULL)
>      ret = g_variant_ref (cache_result);
>    else
>      ret = g_variant_new_string ("Happy Hacking");
>    UNLOCK_CACHE();
>    return ret;
>  }
> 
As Dan pointed out, a floating reference is not a reference you own. It's a reference that's owned by the object you are about to add your floating object to. So a function like this one is wrong and completely broken and shows that the person implementing it did not understand floating references.
The correct fix here is to either use g_variant_ref_sink (g_variant_new_string ("Happy Hacking")) or have an add_cache() function that g_variant_ref_sink()s the variant.

Why does GVariant use floating references? What is a new variant supposed to be added to directly after creation?
Comment 7 David Zeuthen (not reading bugmail) 2010-08-26 17:16:19 UTC
(In reply to comment #6)
> (In reply to comment #0)
> >  static GVariant *
> >  get_property (...)
> >  {
> >    GVariant *ret;
> >    LOCK_CACHE();
> >    ..
> >    if (cache_result != NULL)
> >      ret = g_variant_ref (cache_result);
> >    else
> >      ret = g_variant_new_string ("Happy Hacking");
> >    UNLOCK_CACHE();
> >    return ret;
> >  }
> > 
> As Dan pointed out, a floating reference is not a reference you own. It's a
> reference that's owned by the object you are about to add your floating object
> to. So a function like this one is wrong and completely broken and shows that
> the person implementing it did not understand floating references.

There's no need to insult me. That's just childish. Of course I know what a floating reference is.

> The correct fix here is to either use g_variant_ref_sink (g_variant_new_string
> ("Happy Hacking")) or have an add_cache() function that g_variant_ref_sink()s
> the variant.
> 
> Why does GVariant use floating references? What is a new variant supposed to be
> added to directly after creation?

I disagree. It's completely _reasonable_ to want these semantics because this is the _natural_ way to write code - especially by people unaware that GVariant uses floating references. Which I bet includes most programmers.

Listen, floating references really is just a hack for "C convenience" so being all holy about "no-one owns a floating reference" is not helping at all. Nor is requiring people to write

 return g_variant_ref_sink (g_variant_new_string ("blah"));

instead of the simpler

 return g_variant_new_string ("blah");
Comment 8 Benjamin Otte (Company) 2010-08-26 17:36:20 UTC
It's not reasonable to want the semantic "returns a floating reference or a real one", because that way the following very simple code does not work:

variant = get_property (...);
call_function_that_ref_sinks (variant);
/* unref variant now or not? */

Also, all bindings will need to do the if (is_floating) ref_sink() hack. Which just moves the problem from the person who wrote the function to the persons who use the function. And that is not a good solution.


And I'm being all holy about it, because getting something as fundamental as refcounting wrong in an API you don't intend to break again in the near future means that people unaware of these things will write broken code until almost forever. And then they get annoyed at glib being "stupid". And I'd like to avoid that.

It's reasonable to want that code to work, but then floating references are the wrong thing to use. Which is what I'm arguing.
Interestingly, this has not been a problem in GTK for a long time.
Comment 9 David Zeuthen (not reading bugmail) 2010-08-26 17:42:05 UTC
(In reply to comment #8)
> It's not reasonable to want the semantic "returns a floating reference or a
> real one", because that way the following very simple code does not work:
> 
> variant = get_property (...);
> call_function_that_ref_sinks (variant);
> /* unref variant now or not? */

Here's the thing. There is approximately _one_ function in the world that will call GDBusInterfaceGetPropertyFunc (which has these semantics). And that is the GDBus core. And that function has already been written and it works.

The thing is - it's perfectly fine to make it hard for that one function (the caller) if it means we can make it easy for the THOUSANDS of functions (the callees) that programmers out there will implement. It's a reasonable trade-off to make. I think it's the same for GSettingsBindSetMapping. Heck, Ryan and I *independently* came up with the idea that these semantics are a good idea.
Comment 10 Dan Winship 2010-08-26 18:02:27 UTC
(In reply to comment #9)
> The thing is - it's perfectly fine to make it hard for that one function (the
> caller) if it means we can make it easy for the THOUSANDS of functions (the
> callees) that programmers out there will implement.

Right. My worry is that floating references in GVariant, which are supposed to making people's lives easier (by allowing them to leave out g_variant_unref() calls in certain cases) are actually going to end up making them harder (by forcing them to add mysterious g_variant_ref_sink()s to get around "this function sometimes frees its inputs" behavior, like in this morning's bug 628046).
Comment 11 David Zeuthen (not reading bugmail) 2010-08-26 19:53:07 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > The thing is - it's perfectly fine to make it hard for that one function (the
> > caller) if it means we can make it easy for the THOUSANDS of functions (the
> > callees) that programmers out there will implement.
> 
> Right. My worry is that floating references in GVariant, which are supposed to
> making people's lives easier (by allowing them to leave out g_variant_unref()
> calls in certain cases) are actually going to end up making them harder (by
> forcing them to add mysterious g_variant_ref_sink()s to get around "this
> function sometimes frees its inputs" behavior, like in this morning's bug
> 628046).

Yes. I don't think it's that bad, however (or maybe I'm just used it). But, again, would be good to list the rules of thumbs in the GVariant intro blurb; perhaps something as simple as

 - only constructors (e.g g_variant_new() and g_variant_builder_end())
   returns floating references

 - functions taking a GVariant usually consumes it
   - e.g. g_variant_builder_add(), g_dbus_connection_call(),
     g_dbus_connection_emit_signal(), g_settings_set_value(),
     g_signal_emit()

 - callbacks (think signal handlers) with GVariant parameters don't
   get a floating reference

 - diagnostic functions usually don't consumes a GVariant
   - e.g. g_variant_print(), g_variant_is_floating()

since it helps both a) shape the mental image people have; but also b) acts as a guideline when developing new API using GVariant.

Anyway, a set of guidelines is one thing. I still think we need every public function / function pointer to specify exactly how the reference count is managed.

ps. : here's a real-world example of how floating references can be used in-line

 http://cgit.freedesktop.org/udisks/commit/?h=gdbus-port&id=bab39d31db875a7359cd3e02e7f57ce264db65da

As much as this is "elegant"... it also scares me.
Comment 12 Allison Karlitskaya (desrt) 2010-08-26 21:09:20 UTC
First: this bug is completely legit.

The situation with the GSettings stuff (history) is that for a very long time this function expected a floating reference to always be returned and then I realised that it would be useful to accept non-floating references in some cases.  I introduced the check as a dirty compatibility shim and nothing else.

I've told Christian in the past (when he was implementing the return-handling code for the GVariant c marshaller) that it's very broken for anything to depend on is_floating() for anything other than debugging/asserts.

The fact that the GSettings binding interface turned out OK and the fact that it would really be very much less convenient (for users) for me to *not* have this hack here is really making me rethink my initial desire for "purity".  Particularly since this purity really buys us nothing at all and is hurting our API users.

Basically, after reflecting on things a bit, I think the intention of this report is correct.  I even think we should change the marshaller to allow floating or non-floating return values from functions (and do the obviously-correct thing for both of the cases).

At the same time, hand-coding the is_floating() check really does seem wrong.  I think a way to do this might be to have an extra function:

dtrt(v) {
  if (is_floating (v))
    return sink(v);
  else
    return v;
}

then you would call callbacks like

GVariant *result = dtrt (the_callback ());

...

g_variant_unref (result);
Comment 13 Allison Karlitskaya (desrt) 2011-07-11 12:38:18 UTC
Created attachment 191707 [details] [review]
GVariant: add g_variant_take_ref()

This function implements the following logic:

  if (g_variant_is_floating (value))
    g_variant_ref_sink (value);

which is used for consuming the return value of callbacks that may or
may not return floating references.

This patch also replaces a few instances of the above code with the new
function (GSettings, GDBus) and lifts a long-standing restriction on the
use of floating values as the return value for signal handlers by
improving g_value_take_variant().
Comment 14 Allison Karlitskaya (desrt) 2011-07-11 13:24:19 UTC
Created attachment 191711 [details] [review]
Oops.  Failed to properly drop the assert on g_value_take_variant().
Comment 15 David Zeuthen (not reading bugmail) 2011-07-11 14:27:08 UTC
Looks good to me - and thanks for the detailed docs.