GNOME Bugzilla – Bug 564402
gda_holder_set_attribute() should copy the string
Last modified: 2008-12-18 10:58:22 UTC
The gda_holder_set_attribute() documentation http://library.gnome.org/devel/libgda/unstable/GdaHolder.html#gda-holder-set-attribute says that no copy of the string is made so it must exist for as long as the GdaHolder. That is incredibly awful. It's a major difficulty for language bindings, which typically dynamically allocate and create strings. gda_column_set_attribute(), gda_attributes_manager_set(), and gda_meta_table_column_set_attribute() have the same problem. The API can't possibly be released as stable while this problem remains.
This design is intended for the following reasons: * the name of the attribute is meant to be restricted to Libgda's defined ones (such as GDA_ATTRIBUTE_DESCRIPTION, ...) * making a copy of the attribute each time this function is used would introduce some major slowdowns and memory consumption. This behaviour is common to all the classes using GdaAttributesManager. I don't know about language bindings, but I don't see why the same limitations could not apply to the bindings. Anyway, I don't want to change that design now.
(In reply to comment #1) > This design is intended for the following reasons: > * the name of the attribute is meant to be restricted to Libgda's defined ones > (such as GDA_ATTRIBUTE_DESCRIPTION, ...) Then you should be using an enum to avoid this confusion. In C++ can wrap this as an enum, with a switch/case block, but that doesn't make the API any nicer for users of the C API. > * making a copy of the attribute each time this function is used would > introduce some major slowdowns and memory consumption. Why are you so sure of this? Just how much and how often are these used? This sounds a lot like premature optimiziation - with the usual pain for users of the API. > This behaviour is common to all the classes using GdaAttributesManager. > > I don't know about language bindings, but I don't see why the same limitations > could not apply to the bindings. I'm telling you, so now you should know. Higher-level languages typically alocate strings dynamically. A "static string" is a very C-specific thing. We could demand this in our C++ API (though not in Java or Python, I think) but I would be ashamed to provide an API that invites the application developer to write code that crashed. At the moment we are forced to leak the string that we provide to these C functions. That's a real (not theoretical) problem. > Anyway, I don't want to change that design now. This makes me want to revert to using libgda-3.0 instead.
> Then you should be using an enum to avoid this confusion. I had first thought about this but the problem was extensibility: the library users could not be allowed to use custom attribute values because of the potential enum conflicts (I need this extensibility for Libgnomedb). With strings, there is the "gda" kind of "namespace". > Why are you so sure of this? Just how much and how often are these used? This > sounds a lot like premature optimiziation - with the usual pain for users of > the API. It's used a lot. Some time ago I had to rewrite the internal implementation becuse the performances had become dreadfull... > I'm telling you, so now you should know. Higher-level languages typically > alocate strings dynamically. A "static string" is a very C-specific thing. The API does not require static strings (it does not compare pointers to strings but the strings themselves, like gtk_stock_lookup()), the strings can be dynamically allowated if needed, but they must exist for as long as the object for which they represent an attribute exists. If you want I can add an API specifically to make a copy of the attribute name, which you can wrap in C++ instead of the exiting one.
(In reply to comment #3) > > Then you should be using an enum to avoid this confusion. > > I had first thought about this but the problem was extensibility: the library > users could not be allowed to use custom attribute values because of the > potential enum conflicts (I need this extensibility for Libgnomedb). With > strings, there is the "gda" kind of "namespace". I don't see how you can allow non-standard strings while sticking to this requiring-static-strings-are-ok-because-only-a-fixed-set-of-strings-are-allowed logic, but OK. > > Why are you so sure of this? Just how much and how often are these used? This > > sounds a lot like premature optimiziation - with the usual pain for users of > > the API. > > It's used a lot. Some time ago I had to rewrite the internal implementation > becuse the performances had become dreadfull... Maybe you are using this internally a lot for something but maybe you shouldn't let that optimization leak out to the public API. > The API does not require static strings (it does not compare pointers to > strings but the strings themselves, like gtk_stock_lookup()), the strings can > be dynamically allowated if needed, but they must exist for as long as the > object for which they represent an attribute exists. That's effectively forever, because the lifetime cannot be easily predicted (nor is that documented). I guess there's a destroy callback people could connect to, but then you are forcing the API user to manage memory that the object should be managing itself. > If you want I can add an API specifically to make a copy of the attribute name, > which you can wrap in C++ instead of the exiting one. That would be wonderful, please. Thanks. I do suggest that you make that the default and rename the current function to gda_holder_set_attribute_static(), and maybe even remove it from the public header, so you can just use it internally, if possible.
> I don't see how you can allow non-standard strings while sticking to this > requiring-static-strings-are-ok-because-only-a-fixed-set-of-strings-are-allowed > logic, but OK. I think there is a documentation problem here: the "static" term is not the correct one (the string does not need to be in the DATA segment). I'll correct the documentation to make that clear. The only limitation is the same as when using a GHashTable when the has table is created using g_hash_table_new(g_str_hash,g_equal_hash): you have to make sure the key string exists as long as the hash table exists. > > I do suggest that you make that the default and rename the current function to > gda_holder_set_attribute_static(), and maybe even remove it from the public > header, so you can just use it internally, if possible. I propose to keep the current function as it is now, and to add: void gda_holder_set_attribute_full (GdaHolder *holder, const gchar *attribute, GDestroyNotify attribute_destroy, const GValue *value); (and apply similar change to other objects using the same kind of API). This would make it possible to specify an appropriate destroy function (g_free or xmlFree for example). This way the API can be similar to g_object_set_data()/g_object_set_data_full() when using strings as the 'data' argument, and the API's behaviour is not changed from what it currently is (which is important as otherwise users which don't take into account the changes will end up with memory leaks).
(In reply to comment #5) > > I don't see how you can allow non-standard strings while sticking to this > > requiring-static-strings-are-ok-because-only-a-fixed-set-of-strings-are-allowed > > logic, but OK. > > I think there is a documentation problem here: the "static" term is not the > correct one (the string does not need to be in the DATA segment). I'll correct > the documentation to make that clear. Still, you require the string to be valid effectively for ever. > The only limitation is the same as when using a GHashTable when the has table > is created using g_hash_table_new(g_str_hash,g_equal_hash): you have to make > sure the key string exists as long as the hash table exists. GHashTable is a very C-specific API that language bindings would never wrap, because they have their own equivalents for such a data-structure. > > I do suggest that you make that the default and rename the current function to > > gda_holder_set_attribute_static(), and maybe even remove it from the public > > header, so you can just use it internally, if possible. > > I propose to keep the current function as it is now, and to add: > > void > gda_holder_set_attribute_full (GdaHolder *holder, const gchar *attribute, > GDestroyNotify attribute_destroy, const GValue *value); > > (and apply similar change to other objects using the same kind of API). > > This would make it possible to specify an appropriate destroy function (g_free > or xmlFree for example). > > This way the API can be similar to g_object_set_data()/g_object_set_data_full() > when using strings as the 'data' argument, g_object_set_data() is also very C-specific, as is obvious because it sets a raw gpointer. It's generally not wrapped by language bindings and not used much by applications. > and the API's behaviour is not > changed from what it currently is (which is important as otherwise users which > don't take into account the changes will end up with memory leaks). Copying an input parameter and destroying it yourself later will not leak memory. On the contrary, it avoids accidental double-frees or the need for on-purpose leaks. But thanks. The new function will give language bindings a chance at least, though you can expect most of them to not notice the strangeness of the original function until someone reports a crash to them. I'm also quite sure that the G Introspection stuff has no plan to support this strangeness.
There is now a correction in rev #3267. I implemented what you proposed because after analysing the code more closely there is no way I can definitely describe when a string will not used anymore, so one either can use static strings, or allocated strings and pass a GDestroyNotify function pointer used when the string is to be freed. Can you check?
Thanks. We can use that. But it's rather odd. If you know when the string can be destroyed then why not just copy it inside that function and always delete it?
> But it's rather odd. If you know when the string can be destroyed then why not > just copy it inside that function and always delete it? Because most of the time Libgda uses static strings and this avoids a lot of memory waste.
(In reply to comment #9) > > But it's rather odd. If you know when the string can be destroyed then why not > > just copy it inside that function and always delete it? > > Because most of the time Libgda uses static strings and this avoids a lot of > memory waste. But for that you have the new gda_holder_set_attribute_static() function.
I prefer the user to handle how and when the string is allocated and let him specify how to free it if needed (gda_holder_set_attribute_static is a #define which uses gda_holder_set_attribute).
OK. I obviously won't persuade you more, so thanks for allowing us to avoid the leak. But just to illustrate how silly this API is, this is pretty much what we provide in libgdamm: static void on_set_attribute_destroy(gpointer data) { g_free((char*)data); } void Holder::set_attribute(const gchar* attribute, const GValue* value) { //gda_holder_set_attribute() needs us to copy the string and then tells //us when to free the string, instead of just copyig and freeing it itself: gchar* dup = g_strdup(attribute); gda_holder_set_attribute(gobj(), dup, value, &on_set_attribute_destroy); }