GNOME Bugzilla – Bug 633199
Support gunichar
Last modified: 2010-11-18 01:09:38 UTC
This is a new fundamental type tag.
Created attachment 173266
Support gunichar
Review of attachment 173266:

Basically right, couple of small things.

::: gi/arg.c
@@ +1665,3 @@
+ gint bytes;
+
+ bytes = g_unichar_to_utf8 (arg->v_uint32, &utf8);

You need to validate the character with g_unichar_validate - gjs_string_from_utf8() uses g_utf8_to_utf16, which assumes valid characters in the input. (It does check the UTF-8 encoding, but will happily pass an isolated surrogate into the output or whatever.) Note that 0 will not validate, so you need to handle that separately to preserve your mapping of 0 to "".

::: gjs/jsapi-util-string.c
@@ +457,3 @@
+ * If successful, @result is assigned the Unicode codepoint
+ * corresponding to the first full character in @string. This
+ * function handles surrogate pairs.

It would probably be more accurate to say "this function handles characters not in the BMP" or "not in the BMP, which are represented as a surrogate pair in SpiderMonkey's internal UTF-16 representation."

::: test/js/testEverythingBasic.js
@@ +75,3 @@
 assertEquals(-42, Everything.test_double(-42));
+ assertEquals("c", Everything.test_unichar("c"));

I'd add more tests here:
- Empty string
- Non-BMP character
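To make the surrogate-pair point concrete, here is a rough sketch of reading the first full character out of a UTF-16 buffer. It is not the gjs code under review: first_codepoint_from_utf16 is a hypothetical name, and the buffer is assumed to be a plain uint16_t array with an explicit length.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the gjs function): decode the first full
 * character of a UTF-16 buffer into a Unicode code point.
 * Returns false for an empty buffer or an isolated surrogate. */
static bool
first_codepoint_from_utf16(const uint16_t *s, size_t len, uint32_t *result)
{
    if (len == 0)
        return false;

    uint16_t first = s[0];

    /* Anything outside the surrogate range is a BMP character. */
    if (first < 0xD800 || first > 0xDFFF) {
        *result = first;
        return true;
    }

    /* A lead surrogate must be followed by a trail surrogate; together
     * they encode one character outside the BMP. */
    if (first <= 0xDBFF && len >= 2 && s[1] >= 0xDC00 && s[1] <= 0xDFFF) {
        *result = 0x10000
                + ((uint32_t)(first - 0xD800) << 10)
                + (uint32_t)(s[1] - 0xDC00);
        return true;
    }

    /* Isolated lead or trail surrogate: not a full character. */
    return false;
}
```

A non-BMP test input such as the pair 0xD83D 0xDE00 should come back as the single code point U+1F600, which is the case the suggested extra test would exercise.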
(In reply to comment #2)
> Review of attachment 173266:
>
> Basically right, couple of small things.
>
> ::: gi/arg.c
> @@ +1665,3 @@
> + gint bytes;
> +
> + bytes = g_unichar_to_utf8 (arg->v_uint32, &utf8);
>
> You need to validate the character with g_unichar_validate -

Wouldn't it be a bug in the C code if it returned an invalid gunichar, just like if it returned invalid UTF-8 for a char *? Hmm, I guess we're currently throwing in that case, so we might as well also do the same for an invalid gunichar.
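For illustration only, the validate-then-convert order being discussed could look roughly like the sketch below. It is written in plain C so it stands alone. codepoint_is_valid and codepoint_to_utf8 are hypothetical stand-ins, not GLib or gjs functions, and the validity rule used here (below 0x110000 and not a surrogate) is an assumption; the real g_unichar_validate may apply a different rule depending on the GLib version. Zero is handled before validation, so the mapping of 0 to "" holds however the validator treats 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the validity check; assumed rule only. */
static bool
codepoint_is_valid(uint32_t c)
{
    return c < 0x110000 && (c & 0xFFFFF800) != 0xD800;
}

/* Hypothetical helper: writes the NUL-terminated UTF-8 form of c into
 * out (at least 5 bytes). Returns the number of bytes written before
 * the terminator, or -1 for an invalid code point, which is where the
 * caller would throw. 0 maps to the empty string. */
static int
codepoint_to_utf8(uint32_t c, char out[5])
{
    int n = 0;

    if (c == 0) {               /* preserve the 0 -> "" mapping */
        out[0] = '\0';
        return 0;
    }
    if (!codepoint_is_valid(c))
        return -1;

    if (c < 0x80) {
        out[n++] = (char)c;
    } else if (c < 0x800) {
        out[n++] = (char)(0xC0 | (c >> 6));
        out[n++] = (char)(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        out[n++] = (char)(0xE0 | (c >> 12));
        out[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
        out[n++] = (char)(0x80 | (c & 0x3F));
    } else {
        out[n++] = (char)(0xF0 | (c >> 18));
        out[n++] = (char)(0x80 | ((c >> 12) & 0x3F));
        out[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
        out[n++] = (char)(0x80 | (c & 0x3F));
    }
    out[n] = '\0';
    return n;
}
```

With this shape, an invalid gunichar from C takes the same error path as invalid UTF-8 in a char *, rather than reaching the UTF-16 conversion.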