GNOME Bugzilla – Bug 589887
g_memory_input_stream_new_from_data() should take user_data
Last modified: 2014-02-12 14:41:12 UTC
g_memory_input_stream_new_from_data() takes a GDestroyNotify callback parameter, but it takes no user_data parameter that would be provided when calling that callback. This makes if impossible for language bindings to wrap this properly. The callback is actually called with the data buffer instead. That's convenient for C coders who don't want to keep any additional state, I suppose. So if a user_data parameter was added then data could be used if user_data is NULL.
How do you propose to solve this without breaking API? new_from_data_full()?
Murray, can you address the question Ryan raised in comment #1? Is API breakage an issue, now that we're heading for GNOME 3?
Glib is not going to break API.
Murray, I'm closing this bug report as no further information has been provided. Please feel free to reopen this bug if you can provide the information asked for. Thanks!
Sorry for not replying before now. Why can't we just add a new function and deprecate the old one, as Ryan suggests.
Murray: can you please be more specific about this bug? What bindings are affected worst by this, for example? Of course, the most likely way for this to move along is with a patch.... :)
This affects the C++ bindings. It will affect many bindings, though they might not all realize it yet. OK, I'll try to get this done.
murray; thanks. fwiw, my gut says to call it _full() and skip the deprecation.
Not sure what to do with this one. The GDestroyNotify in this case is actually used to free the data of the chunk of memory... Consider... how is g_hash_table_new_full() properly mapped to language bindings ? This case is similar to g_hash_table_new() except that each chunk of memory can be freed with a separate GDestroyNotify function, g_memory_input_stream_new_from_data simply creates a memory input stream and adds the first chunk of data. (In reply to comment #0) > The callback is actually called with the data buffer instead. That's convenient > for C coders who don't want to keep any additional state, I suppose. So if a > user_data parameter was added then data could be used if user_data is NULL. Not sure I follow... What you want is to add additional state to each chunk of memory ? For instance... you have a C++ object that wraps a portion of memory and you want the sequential chunks of memory to get read from the "stream", not the object pointers/handles ? If we follow your suggestion, i.e. call it with the user_data provided for the chunk instead of calling it with the data itself... then the api gets a little convoluted... it means the caller _must_ hold on to the data from the user_data that it provides in order to also free the data along with the auxiliary user_data pointer (sounds like something people could get wrong more easily). If I'm following correctly... maybe a more comprehensible api would be: GInputStream * g_memory_input_stream_new_from_data_full (const void *data, gssize len, GDestroyNotify destroy, gpointer user_data, GDestroyNotify user_data_destroy); And: void g_memory_input_stream_add_data_full (GMemoryInputStream *stream, const void *data, gssize len, GDestroyNotify destroy, gpointer user_data, GDestroyNotify user_data_destroy); So... with an api like this... language bindings get to attach the life cycle of these memory chunks to the life cycle of the input stream... while the actual data read from the input stream is not the object wrapper itself but some internal buffer held by the object wrapper... Sounds about correct ?
(In reply to comment #9) > Not sure what to do with this one. > > The GDestroyNotify in this case is actually used to > free the data of the chunk of memory... > > Consider... how is g_hash_table_new_full() properly mapped to language > bindings ? I guess it isn't. Like g_memory_input_stream_new_from_data() it doesn't provide a way to send any additional state with the callback. But GHashTable is a rather C-specific thing that you wouldn't expect language bindings to wrap anyway. > Not sure I follow... What you want is to add additional state to each chunk > of memory ? > > For instance... you have a C++ object that wraps a portion of memory > and you want the sequential chunks of memory to get read from the "stream", > not the object pointers/handles ? > > If we follow your suggestion, i.e. call it with the user_data provided > for the chunk instead of calling it with the data itself... then the api > gets a little convoluted... it means the caller _must_ hold on to the > data from the user_data that it provides in order to also free the data > along with the auxiliary user_data pointer (sounds like something people > could get wrong more easily). Yes, well coding is akward in C. But it's an additional function so not everyone would have to worry about it. C coders would be free to not worry about it, though they probably should. > If I'm following correctly... maybe a more comprehensible api would be: > > GInputStream * > g_memory_input_stream_new_from_data_full (const void *data, > gssize len, > GDestroyNotify destroy, > gpointer user_data, > GDestroyNotify user_data_destroy); > > And: > > void > g_memory_input_stream_add_data_full (GMemoryInputStream *stream, > const void *data, > gssize len, > GDestroyNotify destroy, > gpointer user_data, > GDestroyNotify user_data_destroy); > > So... with an api like this... language bindings get to attach the life > cycle of these memory chunks to the life cycle of the input stream... > while the actual data read from the input stream is not the object wrapper > itself but some internal buffer held by the object wrapper... > > Sounds about correct ? Yes. Note that there's no question of user_data being used directly as a buffer. It's simply additional state, just like in a regular callback. We do this in many similar places. For instance: http://library.gnome.org/devel/gtk/unstable/GtkCellLayout.html#gtk-cell-layout-set-cell-data-func
Created attachment 182120 [details] [review] Add g_memory_input_stream_new_from_data_full/add_data_full() apis This patch adds an extra state pointer and an extra GDestroyNotify function to the 'Chunk' definition... allowing bindings to attach some extra state to memory chunks (to get memory management correctly from language bindings).
(In reply to comment #10) [...] > > If we follow your suggestion, i.e. call it with the user_data provided > > for the chunk instead of calling it with the data itself... then the api > > gets a little convoluted... it means the caller _must_ hold on to the > > data from the user_data that it provides in order to also free the data > > along with the auxiliary user_data pointer (sounds like something people > > could get wrong more easily). > > Yes, well coding is akward in C. But it's an additional function so not > everyone would have to worry about it. C coders would be free to not worry > about it, though they probably should. I see, it can be useful in C if say... segment of memory cannot simply be freed by a custom function... or one wants to use a generic free function with some state describing how the data should be freed (if for instance each actual memory chunk was an internal buffer of some Chunk objects the programmer created). The main difference from hash tables here, as far as I can see... is that the data added to the stream has to actually appear as contiguous memory (if it were a hash table, you can always store the pointer to the object that owns the memory directly... with an input stream, if you want an object to own a chunk of data, you need an auxiliary pointer for it in order to balance reference counts and clean up).
(In reply to comment #11) > Created an attachment (id=182120) [details] [review] > Add g_memory_input_stream_new_from_data_full/add_data_full() apis > > This patch adds an extra state pointer and an extra GDestroyNotify function > to the 'Chunk' definition... allowing bindings to attach some extra state > to memory chunks (to get memory management correctly from language bindings). So, can we push this?
Review of attachment 182120 [details] [review]: ::: gio/gmemoryinputstream.c @@ +151,3 @@ chunk->destroy (chunk->data); + if (chunk->user_data_destroy && chunk->user_data) Testing chunk->user_data is wrong, it should be valid to pass a GDestroyNotify but a NULL user data. @@ +240,3 @@ + * the life cycle of data chunks from language bindings). + * + * Returns: new #GInputStream read from @data of @len bytes. Needs a Since: tag. @@ +247,3 @@ + GDestroyNotify destroy, + gpointer user_data, + GDestroyNotify user_data_destroy) git grep GDestroyNotify -- '*.h' in glib shows that there are no other instances of "user_data_destroy" as a name. Can you use one of the existing i.e. either "destroy" or "notify"?
Created attachment 196707 [details] [review] 0001-Add-g_memory_input_stream_new_from_data_full-add_dat.patch > Testing chunk->user_data is wrong, it should be valid to pass a GDestroyNotify but a NULL user data. Done. > Needs a Since: tag. Done, for both new functions. > there are no other instances of "user_data_destroy" as a name. Can you use one of the existing i.e. either "destroy" or "notify"? user_data_destroy seems more self-documenting. "notify" would be rather obscure in the context. "user_data_destroy" seems fairly consistent with existing APIs, though they use "_func" which would be inconsistent with the existing parameter names. For instance: GHashTable* g_hash_table_new_full (GHashFunc hash_func, GEqualFunc key_equal_func, GDestroyNotify key_destroy_func, GDestroyNotify value_destroy_func) gulong g_cancellable_connect (GCancellable *cancellable, GCallback callback, gpointer data, GDestroyNotify data_destroy_func); guint g_dbus_connection_signal_subscribe (GDBusConnection *connection, const gchar *sender, const gchar *interface_name, const gchar *member, const gchar *object_path, const gchar *arg0, GDBusSignalFlags flags, GDBusSignalCallback callback, gpointer user_data, GDestroyNotify user_data_free_func);
(In reply to comment #15) > user_data_destroy seems more self-documenting. Sure; we're not really consistent here in the tree, if you like the name let's just move on. Good to commit I think.
Thanks. Pushed to master.
Actually, I have reverted it because I'm not sure that it's the right API now that I try to use it. This does nothing much more than calling user_data_destroy(user_data) after calling destroy(). That still leaves us without the user_data in the destroy() callback. We can manually keep a separate global association of data to user_data, but that's rather awkard and dangerous. I'd rather that it was just something like this: tyepdef void (*GDestroyNotify) (gpointer data, gpointer user_data) void g_memory_input_stream_add_data_full (GMemoryInputStream *stream, const void *data, gssize len, GDestroyDataNotify destroy, gpointer destroy_user_data); Thoughts?
Sorry for overlooking that. Agreed that it sounds more practical to have both in the same callback (I was rather under the impression that every chunk of 'data' would be accompanied by a unique 'user_data', having both in the same callback obviously provides a more flexible api).
Created attachment 196911 [details] [review] 0001-Add-g_memory_input_stream_new_from_data_full-add_dat2.patch This one does what I really want.
Created attachment 196912 [details] [review] 0001-Add-an-extra-state-pointer-and-an-extra-GDestroyNoti3.patch Sorry. This is the correct patch.
Could I get some review for this patch, please?
Review of attachment 196912 [details] [review]: Executive summary: Patch is overkill; we just need a new API that takes a regular GDestroyNotify and calls it with the passed user data, and we should deprecate g_memory_input_stream_add_data() since it has a confusingly different API from everything else. Commit message is all on one line, but should have separate subject/body. Bug link should just be a full URL. See: https://live.gnome.org/GnomeLove/SubmittingPatches Something like: gmemoryinputstream: Add new _from_data_full API that also takes a user data This allows bindings to attach extra state to memory chunks. https://bugzilla.gnome.org/show_bug.cgi?id=589887 ::: gio/gmemoryinputstream.c @@ +49,3 @@ GDestroyNotify destroy; + GMemoryInputStreamDataDestroyNotify destroy_with_user_data; + gpointer user_data; And in fact since you wrote this patch, the internal Chunk was removed in favor of GBytes. I think a new API could simply make the semantic change that the user data is no longer the pointer to the data, but the pointer to the passed state. Then all it does internally is: bytes = g_bytes_new_with_free_func (data, len, destroy, user_data); You can even do this outside of GMemoryInputStream today (without your patch) by simply using g_memory_input_stream_add_bytes(). @@ +245,3 @@ + * Returns: new #GInputStream read from @data of @len bytes. + * + * Since: 2.30 Will be 2.40 now. @@ +314,3 @@ + * the life cycle of data chunks from language bindings). + * + * Since: 2.30 2.40 ::: gio/gmemoryinputstream.h @@ +85,3 @@ + */ +typedef void (*GMemoryInputStreamDataDestroyNotify) (gpointer data, + gpointer user_data); I'm dubious about the necessity of a new callback type. Other things of this type like GBytes just use a normal one-argument GDestroyNotify. I suspect in most/all cases where you have extra state associated with the pointer, your state already contains a pointer to the data. I understand that it's free for us to provide both, but on the other hand language bindings aren't going to understand this new callback signature out of the box, and it's different again from GBytes. @@ +87,3 @@ + gpointer user_data); + +GInputStream * g_memory_input_stream_new_from_data_full (const void *data, Needs GLIB_AVAILABLE_IN_2_40 @@ +95,3 @@ gssize len, GDestroyNotify destroy); +void g_memory_input_stream_add_data_full (GMemoryInputStream *stream, Ditto.
(In reply to comment #23) > You can even do this outside of GMemoryInputStream today (without your patch) > by simply using g_memory_input_stream_add_bytes(). Yes, g_memory_input_stream_add_bytes() and g_memory_input_stream_new_from_bytes() give us the functionality we need in glibmm. g_bytes_new_with_free_func() has the user_data parameter we need. Our problem was solved when the g_memory_input_stream_*_bytes() functions were added. Closing this bug.