Bug 633199 - Support gunichar
Status: RESOLVED FIXED
Product: gjs
Classification: Bindings
Component: general
Version: unspecified
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: gjs-maint
gjs-maint
Depends on:
Blocks:
Reported: 2010-10-26 15:18 UTC by Colin Walters
Modified: 2010-11-18 01:09 UTC (History)

See Also:
GNOME target: ---
GNOME version: ---


Attachments
Support gunichar (4.21 KB, patch)
2010-10-26 15:18 UTC, Colin Walters
reviewed

Description Colin Walters 2010-10-26 15:18:02 UTC
This is a new fundamental type tag.
Comment 1 Colin Walters 2010-10-26 15:18:04 UTC
Created attachment 173266
Support gunichar
Comment 2 Owen Taylor 2010-11-17 20:59:46 UTC
Review of attachment 173266:

Basically right, couple of small things.

::: gi/arg.c
@@ +1665,3 @@
+            gint bytes;
+
+            bytes = g_unichar_to_utf8 (arg->v_uint32, &utf8);

You need to validate the character with g_unichar_validate: gjs_string_from_utf8() uses g_utf8_to_utf16, which assumes valid characters in the input. (It does check the UTF-8 encoding, but will happily pass an isolated surrogate or whatever into the output.) Note that 0 will not validate, so you need to handle that separately to preserve your mapping of 0 to "".
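
The order of checks being asked for here can be sketched roughly as follows. This is a JavaScript model of the logic only, not the C code in the patch: unicharToString and isValidUnichar are hypothetical names, and the validity rule shown (in range, not a surrogate) is an assumption that may be looser than what g_unichar_validate actually enforces.

```javascript
// Sketch only: models the check order discussed in the review, not gi/arg.c.
// isValidUnichar and unicharToString are hypothetical names.

// Assumed validity rule: an integer code point in the Unicode range
// that is not a UTF-16 surrogate.
function isValidUnichar(cp) {
    return Number.isInteger(cp) &&
        cp >= 0 && cp <= 0x10FFFF &&
        !(cp >= 0xD800 && cp <= 0xDFFF);
}

function unicharToString(cp) {
    // Handle 0 first so the mapping of 0 to "" does not depend on validation.
    if (cp === 0)
        return "";
    // Reject bad values before any encoding step can pass them through.
    if (!isValidUnichar(cp))
        throw new RangeError("Invalid unichar: " + cp);
    return String.fromCodePoint(cp);
}
```

Handling 0 before the validity check keeps the empty-string mapping independent of whatever the validator decides about 0.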

::: gjs/jsapi-util-string.c
@@ +457,3 @@
+ * If successful, @result is assigned the Unicode codepoint
+ * corresponding to the first full character in @string.  This
+ * function handles surrogate pairs.

Probably more accurate to say "this function handles characters not in the BMP" or "not in the BMP, which are represented as a surrogate pair in SpiderMonkey's internal UTF-16 representation."
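
For illustration, here is roughly what "handling characters not in the BMP" means when the string is stored as UTF-16. This is a JavaScript sketch, not the C function under review; firstCodePoint is a hypothetical name, and returning 0 for an empty string is an assumption made to mirror the 0 to "" mapping mentioned earlier.

```javascript
// Sketch only: decode the first full character of a UTF-16 string.
// firstCodePoint is a hypothetical name, not the function in the patch.
function firstCodePoint(str) {
    // Assumption: the empty string maps to code point 0.
    if (str.length === 0)
        return 0;

    const hi = str.charCodeAt(0);

    // A high surrogate must be followed by a low surrogate; together
    // they encode one character outside the BMP.
    if (hi >= 0xD800 && hi <= 0xDBFF) {
        if (str.length < 2)
            throw new Error("Unpaired high surrogate");
        const lo = str.charCodeAt(1);
        if (lo < 0xDC00 || lo > 0xDFFF)
            throw new Error("High surrogate not followed by low surrogate");
        return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
    }

    // A low surrogate on its own is not a character.
    if (hi >= 0xDC00 && hi <= 0xDFFF)
        throw new Error("Unpaired low surrogate");

    // Anything else is a BMP character stored in a single code unit.
    return hi;
}
```

With this sketch, firstCodePoint("\uD835\uDD0A") yields 0x1D50A, a code point that takes two UTF-16 code units.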

::: test/js/testEverythingBasic.js
@@ +75,3 @@
     assertEquals(-42, Everything.test_double(-42));
 
+    assertEquals("c", Everything.test_unichar("c"));

I'd add more tests here:

 - Empty string
 - Non-BMP character
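
The extra cases might look something like the sketch below. The assertEquals helper and the Everything object defined here are hypothetical stand-ins so the snippet runs outside gjs; in the real test file both come from the test harness, and the echo behaviour of test_unichar is assumed from the existing "c" assertion.

```javascript
// Stand-ins so the snippet runs on its own; both are hypothetical.
function assertEquals(expected, actual) {
    if (expected !== actual)
        throw new Error("expected " + JSON.stringify(expected) +
                        ", got " + JSON.stringify(actual));
}

const Everything = {
    // Assumed behaviour: return the first character unchanged,
    // with the empty string round-tripping through 0.
    test_unichar: function (s) {
        return s === "" ? "" : String.fromCodePoint(s.codePointAt(0));
    }
};

// Existing case from the patch.
assertEquals("c", Everything.test_unichar("c"));
// Empty string.
assertEquals("", Everything.test_unichar(""));
// Non-BMP character (U+1D50A), written as a surrogate pair.
assertEquals("\uD835\uDD0A", Everything.test_unichar("\uD835\uDD0A"));
```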
Comment 3 Colin Walters 2010-11-17 23:39:55 UTC
(In reply to comment #2)
> Review of attachment 173266:
> 
> Basically right, couple of small things.
> 
> ::: gi/arg.c
> @@ +1665,3 @@
> +            gint bytes;
> +
> +            bytes = g_unichar_to_utf8 (arg->v_uint32, &utf8);
> 
> You need to validate the character with  g_unichar_validate  -

Wouldn't it be a bug in the C code if it returned an invalid gunichar, just like if it returned invalid UTF-8 for a char *?  Hmm, I guess we're currently throwing in that case, so we might as well also do the same for an invalid gunichar.
