After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 662550 - GLib.Variant bytestring construction and accessors
GLib.Variant bytestring construction and accessors
Status: RESOLVED FIXED
Product: pygobject
Classification: Bindings
Component: introspection
unspecified
Other Linux
: Normal normal
: ---
Assigned To: Nobody's working on this now (help wanted and appreciated)
Python bindings maintainers
Depends on:
Blocks:
 
 
Reported: 2011-10-23 20:14 UTC by Holger Berndt
Modified: 2011-11-03 09:36 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Fix array termination and size calculation (1.94 KB, patch)
2011-10-30 15:46 UTC, Holger Berndt
needs-work Details | Review
Fix array termination and size calculation (2.41 KB, patch)
2011-10-31 16:32 UTC, Holger Berndt
committed Details | Review

Description Holger Berndt 2011-10-23 20:14:38 UTC
I'm having issues when dealing with GLib.Variant bytestrings. When writing in an interactive python console

>>> from gi.repository import GLib
>>> print "'%s'" % GLib.Variant.new_bytestring("foo").get_bytestring()

I get seemingly random results. Most of the time, printable or non-printable characters are appended to the string "foo", so it looks like improper string termination. However, sometimes I also get only "fo", so maybe that's not the entire problem.

The problem seems to apply to construction as well as accessing. I've seen issues with the command

>>> GLib.Variant.new_bytestring("foo")

as random bytestrings were contained in the variant (though non-reproducible), while

>>> GLib.Variant.new_bytestring("foo" + chr(0))

seemed to work okay.

For accessing, I get varying results for

>>> var = GLib.Variant.new_bytestring("foo" + chr(0))
>>> print var.get_bytestring()
>>> print var.get_bytestring()
Comment 1 johnp 2011-10-24 17:59:43 UTC
I don't really have time to look at it right now but you can step through gdb to check what is happening during the encoding which is where the issue lies.  break on _pygi_marshal_to_py_array.  My guess is g_strv_length is returning the wrong len and would need to be special cased for seq_cache->item_cache->type_tag == GI_TYPE_TAG_UINT8
Comment 2 Holger Berndt 2011-10-30 15:46:25 UTC
Created attachment 200287 [details] [review]
Fix array termination and size calculation

When creating an array of element type uint8 and setting it directly with
memset(), make sure that zero-termination is respected.

When calculating the length of a zero-terminated array of type uint8,
fall back to strlen() instead of g_strv_length().
Comment 3 johnp 2011-10-31 15:05:39 UTC
Comment on attachment 200287 [details] [review]
Fix array termination and size calculation

>From 4d6e646b7205ecd45e1ab4c0a446deaf9afddd1a Mon Sep 17 00:00:00 2001
>From: Holger Berndt <hb@gnome.org>
>Date: Sun, 30 Oct 2011 16:36:32 +0100
>Subject: [PATCH] Fix array termination and size calculation
>
>When creating an array of element type uint8 and setting it directly with
>memset(), make sure that zero-termination is respected.
>
>When calculating the length of a zero-terminated array of type uint8,
>fall back to strlen() instead of g_strv_length().
>
>https://bugzilla.gnome.org/show_bug.cgi?id=662550
>---
> gi/pygi-marshal-from-py.c |    3 ++-
> gi/pygi-marshal-to-py.c   |    6 +++++-
> 2 files changed, 7 insertions(+), 2 deletions(-)
>
>diff --git a/gi/pygi-marshal-from-py.c b/gi/pygi-marshal-from-py.c
>index 0a94ffe..8880248 100644
>--- a/gi/pygi-marshal-from-py.c
>+++ b/gi/pygi-marshal-from-py.c
>@@ -794,7 +794,8 @@ _pygi_marshal_from_py_array (PyGIInvokeState   *state,
>     if (sequence_cache->item_cache->type_tag == GI_TYPE_TAG_UINT8 &&
>         PYGLIB_PyBytes_Check (py_arg)) {
>         memcpy(array_->data, PYGLIB_PyBytes_AsString (py_arg), length);
>-
>+        if (sequence_cache->is_zero_terminated)
>+            memset (array_->data+length, 0, 1)

Length is wrong for a null terminated array.  The intermediate garray needs to be constructed as length + 1 then this would work.  Of course you could also do: 
    array_->data[length] = '\0';

which might be better due to compiler optimizations.  In any case it is clearer.

>         goto array_success;
>     }
> 
>diff --git a/gi/pygi-marshal-to-py.c b/gi/pygi-marshal-to-py.c
>index 984e7c1..1d5874c 100644
>--- a/gi/pygi-marshal-to-py.c
>+++ b/gi/pygi-marshal-to-py.c
>@@ -275,7 +275,11 @@ _pygi_marshal_to_py_array (PyGIInvokeState   *state,
>         if (seq_cache->fixed_size >= 0) {
>             len = seq_cache->fixed_size;
>         } else if (seq_cache->is_zero_terminated) {
>-            len = g_strv_length ((gchar **)arg->v_pointer);
>+            if(seq_cache->item_cache->type_tag == GI_TYPE_TAG_UINT8) {
>+                len = strlen (arg->v_pointer);
>+            } else {
>+                len = g_strv_length ((gchar **)arg->v_pointer);
>+            }
>         } else {
>             GIArgument *len_arg = state->args[seq_cache->len_arg_index];
>             len = len_arg->v_long;


This part looks good but can you add a g_assert(array_ != NULL); to the top.  We pretty much return PyNone a layer above if NULL is encountered but rereading that code makes me uneasy that we don't explicitly state that NULL is not valid input.   Thanks.

>-- 
>1.7.1
Comment 4 Holger Berndt 2011-10-31 16:11:45 UTC
(In reply to comment #3)
> >+        if (sequence_cache->is_zero_terminated)
> >+            memset (array_->data+length, 0, 1)
> 
> Length is wrong for a null terminated array.  The intermediate garray needs to
> be constructed as length + 1 then this would work.

The way I understand GArray doc and code, both, the length "length" member and the "reserved_size" argument to g_array_sized_new() do not count the null termination, but the memory for the terminator is allocated when the "zero_terminated" function argument is true. So, above code should be correct.

I agree on the remaining comments, so I'll attach an updated patch.
Comment 5 Holger Berndt 2011-10-31 16:32:45 UTC
Created attachment 200349 [details] [review]
Fix array termination and size calculation

When creating an array of element type uint8 and setting it directly with
memset(), make sure that zero-termination is respected.

When calculating the length of a zero-terminated array of type uint8,
fall back to strlen() instead of g_strv_length().
Comment 6 johnp 2011-11-01 15:40:33 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > >+        if (sequence_cache->is_zero_terminated)
> > >+            memset (array_->data+length, 0, 1)
> > 
> > Length is wrong for a null terminated array.  The intermediate garray needs to
> > be constructed as length + 1 then this would work.
> 
> The way I understand GArray doc and code, both, the length "length" member and
> the "reserved_size" argument to g_array_sized_new() do not count the null
> termination, but the memory for the terminator is allocated when the
> "zero_terminated" function argument is true. So, above code should be correct.

You are correct.  For some reason I was looking for that in the API but didn't see the parameter.  Code reading fatigue I guess.
Comment 7 johnp 2011-11-01 15:41:55 UTC
Comment on attachment 200349 [details] [review]
Fix array termination and size calculation

Awesome.  Thanks for fixing this.
Comment 8 Holger Berndt 2011-11-01 17:53:31 UTC
Attachment 200349 [details] pushed as eef35b2 - Fix array termination and size calculation

Thanks for the debugging hints and the super-fast review!
Comment 9 Tomeu Vizoso 2011-11-02 13:52:22 UTC
Hey, sorry but this is breaking the tests. If it cannot be fixed quickly (today), then I think we should revert that commit for now.
Comment 10 Tomeu Vizoso 2011-11-02 13:54:14 UTC
Oh, and why there are no tests? I know that often it takes more time to write a test than the fix (or at least it does for me), but this is really important if we don't want to break other people's code every day.

I'm reverting the commit for now.
Comment 11 johnp 2011-11-02 18:56:12 UTC
Ah, the crash is my fault.  I asked him to add a global g_assert(array_!=NULL) but that should only be applied to certain C arrays.  If the array is NULL and we hit strlen we will crash in the C array case.  I committed a fix which correctly handles NULL.

We assert for fixed arrays and NUL terminated arrays. since they always have to have a length of 1 even if it is just the NUL bit and an interface that sends in a fixed array of size 0 is just broken.

Ones with length parameters can still be NULL and send in a length of 0 which is already handled properly.

I also committed a test for byte array variants.
Comment 12 Holger Berndt 2011-11-02 21:32:36 UTC
Thanks, John. I guess this bug can then go back to "Fixed" state.

In fact, I skimmed through the tests to find a suitable place before commiting, but none of the GLib.Variant.new_* functions is tested (except some random examples as sideeffects of overrides-testing), so it didn't look like the policy required these kinds of tests. Good to know otherwise.
Comment 13 Tomeu Vizoso 2011-11-03 09:36:36 UTC
(In reply to comment #12)
> Thanks, John. I guess this bug can then go back to "Fixed" state.

Excellent, thanks!

> In fact, I skimmed through the tests to find a suitable place before commiting,
> but none of the GLib.Variant.new_* functions is tested (except some random
> examples as sideeffects of overrides-testing), so it didn't look like the
> policy required these kinds of tests. Good to know otherwise.

In principle, we really need tests whenever we change stuff which will affect most applications out there, as is the case with marshalling.

For other cases such as overrides, I think it's good to have and worth making it a requirement, but it's not that critical.