GNOME Bugzilla – Bug 623615
It should convert gunichars to strings
Last modified: 2010-11-29 20:03:24 UTC
I think gunichars should be converted to strings of length 1. This was also done by pygtk
I would +1 the patch if it is submitted
Created attachment 174943 [details] [review] patch Support for unichar was added to g-i in commit http://git.gnome.org/browse/gobject-introspection/commit/?id=c8940a11562fd7b888595c6298e39836192fa3d7 This patch adds support for it in pygobject. Some comments about the patch: 1 - I added the the a /*TODO*/ about checking the string len in _pygi_g_type_info_check_object... but to do it we would need tp decode to utf8 etc, so basically completely duplicating the logic is later done to perform the actual conversion. I still do not understand why pygi does this two-steps logic. 2 - I would have preferred to put the test in the marshalling tests along all the the other basic types, but walters put the g-i test function in regress.c
Created attachment 174964 [details] [review] patch I forgot to patch also pygi-info. The provious comments are still valid.
Created attachment 175078 [details] [review] [gi] add check for UNICHAR
I added a patch for checking a GUNICHAR which may or may not be wrong depending on how we read treat GUNICHARs. Right now you are treating it like a UTF-8 string. Is this correct or should it be treated more like how I treat UINT8 where we expect interfaces to return a single UINT8 or an array of UINT8's? If gunichar foo() is equivalent to gunichar *foo() then your patch looks correct. However if gunichar *foo() follows the array code path your patch will fail.
for what is worth I started off from this gjs patch which treats unichars as utf8 http://git.gnome.org/browse/gjs/commit/?id=34fdc293117109f003967aec397b0e87a33c9f53
Review of attachment 175078 [details] [review]: ::: gi/pygi-argument.c @@ +395,3 @@ + } + + if (size != 1) { I think this is wrong. Unless I misunderstood py api docs PyString_GET_SIZE will give you the size is bytes. A unichar can be more than a byte and yet be a single utf8 character. What we need to check is that we are getting a single valid utf8 character.
Created attachment 175120 [details] [review] Fixes check and tests git-bz barfs on unicode characters because of the http python module so I posted this manually. The check now transforms strings in to unicode before calling GET_SIZE in order to get the string length, not the byte array length. I also fixed the tests to work in py3 and py2 as well as reflect the new checks which does not allow empty gunichars to be passed. Conceptually this is correct but in reality do we want to allow "" to be a valid unichar? What does that map to? As far as the original patch is concerned, I believe it to be correct though we don't support arrays of UNICHAR. I think that just maps to UTF-8 anyway so we should be good.
Review of attachment 175120 [details] [review]: Thanks for updating the patch. It looks good to me now.