Bug 633199 - Support gunichar
Status: RESOLVED FIXED
Product: gjs
Classification: Bindings
Component: general
Version: unspecified
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: gjs-maint
Depends on:
Blocks:
 
 
Reported: 2010-10-26 15:18 UTC by Colin Walters
Modified: 2010-11-18 01:09 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Support gunichar (4.21 KB, patch)
2010-10-26 15:18 UTC, Colin Walters
reviewed

Description Colin Walters 2010-10-26 15:18:02 UTC
This is a new fundamental type tag.
Comment 1 Colin Walters 2010-10-26 15:18:04 UTC
Created attachment 173266
Support gunichar
Comment 2 Owen Taylor 2010-11-17 20:59:46 UTC
Review of attachment 173266:

Basically right, couple of small things.

::: gi/arg.c
@@ +1665,3 @@
+            gint bytes;
+
+            bytes = g_unichar_to_utf8 (arg->v_uint32, &utf8);

You need to validate the character with g_unichar_validate - gjs_string_from_utf8() uses g_utf8_to_utf16, which assumes valid characters in the input. (It does check the UTF-8 encoding, but will happily pass an isolated surrogate into the output or whatever.) Note that 0 will not validate, so you need to handle that separately to preserve your mapping of 0 to "".
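The check-then-convert order described above can be sketched as follows. This is only an illustration in JavaScript, not the C code from the patch: unicharToString is a hypothetical name, and the range and surrogate checks are an assumed approximation of what g_unichar_validate rejects, not its actual definition.

```javascript
// Hypothetical helper: models the suggested logic for converting a gunichar
// to a JS string. Not the gi/arg.c code.
function unicharToString(codePoint) {
    // Handle 0 first, so the mapping of 0 to "" does not depend on validation.
    if (codePoint === 0)
        return "";

    // Assumed validity rule: an integer inside the Unicode range
    // that is not a surrogate code point.
    const inRange = Number.isInteger(codePoint) &&
        codePoint > 0 && codePoint <= 0x10FFFF;
    const isSurrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;

    // Throw instead of passing a bad value on to the UTF-16 conversion.
    if (!inRange || isSurrogate)
        throw new RangeError("invalid gunichar: " + String(codePoint));

    return String.fromCodePoint(codePoint);
}
```

Doing the zero check before validation keeps the empty-string mapping intact whatever the validator decides about 0.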

::: gjs/jsapi-util-string.c
@@ +457,3 @@
+ * If successful, @result is assigned the Unicode codepoint
+ * corresponding to the first full character in @string.  This
+ * function handles surrogate pairs.

Probably more accurate to say "this function handles characters not in the BMP" or "not in the BMP, which are represented as a surrogate pair in SpiderMonkey's internal UTF-16 representation."
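To make the wording concrete, here is a minimal sketch of reading the first full character from a UTF-16 string, where a non-BMP character occupies two code units. firstCodePoint is a hypothetical name, and returning null for an empty string and the bare code unit for an unpaired surrogate are assumptions of this sketch, not the behavior of the function in jsapi-util-string.c.

```javascript
// Hypothetical helper: returns the code point of the first full character.
function firstCodePoint(str) {
    if (str.length === 0)
        return null; // assumption: no character to report

    const hi = str.charCodeAt(0);

    // A high surrogate followed by a low surrogate encodes one non-BMP character.
    if (hi >= 0xD800 && hi <= 0xDBFF && str.length > 1) {
        const lo = str.charCodeAt(1);
        if (lo >= 0xDC00 && lo <= 0xDFFF)
            return ((hi - 0xD800) << 10) + (lo - 0xDC00) + 0x10000;
    }

    // BMP character, or an unpaired surrogate returned as-is (assumption).
    return hi;
}
```

For example, U+1D11E is stored as the two code units 0xD834 and 0xDD1E, and the function combines them back into a single code point.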

::: test/js/testEverythingBasic.js
@@ +75,3 @@
     assertEquals(-42, Everything.test_double(-42));
 
+    assertEquals("c", Everything.test_unichar("c"));

I'd add more tests here:

 - Empty string
 - Non-BMP character
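The two extra cases might look like this, written in the style of the existing assertion. The Everything object and assertEquals below are stand-ins so the sketch runs outside gjs; they are not the real binding or test helper, and the stub assumes test_unichar simply returns the character it was given.

```javascript
// Stand-in for the introspected test library (assumption: identity round trip).
const Everything = {
    test_unichar(s) {
        // Array.from splits by code point, so a surrogate pair stays together.
        const first = Array.from(s)[0];
        return first === undefined ? "" : first;
    },
};

// Stand-in for the test framework's assertEquals(expected, actual).
function assertEquals(expected, actual) {
    if (expected !== actual)
        throw new Error("expected " + JSON.stringify(expected) +
                        ", got " + JSON.stringify(actual));
}

function testUnichar() {
    assertEquals("c", Everything.test_unichar("c"));
    // Empty string, which the patch maps to and from 0.
    assertEquals("", Everything.test_unichar(""));
    // Non-BMP character: U+1D11E, a surrogate pair in UTF-16.
    assertEquals("\uD834\uDD1E", Everything.test_unichar("\uD834\uDD1E"));
}
```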
Comment 3 Colin Walters 2010-11-17 23:39:55 UTC
(In reply to comment #2)
> Review of attachment 173266:
> 
> Basically right, couple of small things.
> 
> ::: gi/arg.c
> @@ +1665,3 @@
> +            gint bytes;
> +
> +            bytes = g_unichar_to_utf8 (arg->v_uint32, &utf8);
> 
> You need to validate the character with g_unichar_validate -

Wouldn't it be a bug in the C code if it returned an invalid gunichar, just like if it returned invalid UTF-8 for a char *?  Hmm, I guess we're currently throwing in that case, so we might as well also do the same for an invalid gunichar.