Bug 520116 - g_utf8_strlcpy()
Status: RESOLVED WONTFIX
Product: glib
Classification: Platform
Component: i18n
Version: unspecified
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Depends on:
Blocks:
 
 
Reported: 2008-03-03 15:38 UTC by Behdad Esfahbod
Modified: 2018-02-08 00:18 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Proposed implementation (1.07 KB, text/plain)
2008-11-03 15:55 UTC, Philip Page
Test cases. (1.54 KB, text/plain)
2008-11-03 15:56 UTC, Philip Page
docs: Clarify dest requirements of g_utf8_strncpy() (867 bytes, patch)
2017-11-28 13:27 UTC, Patrick Griffis (tingping)
Status: none
Add g_utf8_strlcpy() (4.72 KB, patch)
2017-11-28 13:28 UTC, Patrick Griffis (tingping)
Status: none
Add g_utf8_strlcat() (3.07 KB, patch)
2017-11-28 13:28 UTC, Patrick Griffis (tingping)
Status: none
docs: Clarify dest requirements of g_utf8_strncpy() (803 bytes, patch)
2017-11-28 13:31 UTC, Patrick Griffis (tingping)
Status: committed
Add g_utf8_strlcpy() (4.74 KB, patch)
2017-11-28 18:42 UTC, Patrick Griffis (tingping)
Status: none
Add g_utf8_strlcat() (3.09 KB, patch)
2017-11-28 18:43 UTC, Patrick Griffis (tingping)
Status: none
Add g_utf8_strlcat() (3.10 KB, patch)
2017-11-28 18:48 UTC, Patrick Griffis (tingping)
Status: none

Description Behdad Esfahbod 2008-03-03 15:38:22 UTC
Works like g_strlcpy(), but doesn't copy a partial character at the end of the buffer.  Will copy whole UTF-8 characters only, as much as fits.
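For illustration only, the requested behaviour could be sketched roughly as follows in plain C. This is not the attached implementation and not a GLib API: the name `utf8_strlcpy_sketch` is hypothetical, and the sketch assumes `src` is valid, NUL-terminated UTF-8 and that `n` is the full size of `dest` in bytes, including the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GLib): copy src into dest, which has
 * room for n bytes including the terminating NUL. If src does not fit,
 * truncate at a UTF-8 character boundary so no partial character is copied.
 * Assumes src is valid UTF-8. Returns strlen (src), as strlcpy() does. */
static size_t
utf8_strlcpy_sketch (char *dest, const char *src, size_t n)
{
  size_t src_len = strlen (src);
  size_t copy_len;

  if (n == 0)
    return src_len;

  copy_len = src_len < n ? src_len : n - 1;

  /* When truncating, back up while the first byte we would drop is a
   * continuation byte (10xxxxxx): cutting there would split a character. */
  if (copy_len < src_len)
    while (copy_len > 0 && ((unsigned char) src[copy_len] & 0xC0) == 0x80)
      copy_len--;

  memcpy (dest, src, copy_len);
  dest[copy_len] = '\0';

  return src_len;
}
```

With a 3-byte buffer and the input "a" followed by a two-byte character, only "a" would be copied, since the two-byte character no longer fits alongside the terminator.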
Comment 1 Philip Page 2008-11-03 15:55:08 UTC
Created attachment 121880
Proposed implementation
Comment 2 Philip Page 2008-11-03 15:56:09 UTC
Created attachment 121881
Test cases.
Comment 3 Patrick Griffis (tingping) 2017-11-28 13:27:57 UTC
Created attachment 364558
docs: Clarify dest requirements of g_utf8_strncpy()
Comment 4 Patrick Griffis (tingping) 2017-11-28 13:28:14 UTC
Created attachment 364559
Add g_utf8_strlcpy()
Comment 5 Patrick Griffis (tingping) 2017-11-28 13:28:31 UTC
Created attachment 364560
Add g_utf8_strlcat()
Comment 6 Patrick Griffis (tingping) 2017-11-28 13:31:22 UTC
Created attachment 364561
docs: Clarify dest requirements of g_utf8_strncpy()
Comment 7 Emmanuele Bassi (:ebassi) 2017-11-28 14:04:30 UTC
Review of attachment 364559:

::: glib/gutf8.c
@@ +458,3 @@
+g_utf8_strlcpy (gchar       *dest,
+                const gchar *src,
+                size_t       n)

This should be `gsize`.

@@ +460,3 @@
+                size_t       n)
+{
+  register const gchar *s = src;

`register` is not really used, unless you're targeting a compiler from the '90s.

@@ +463,3 @@
+  while (s - src < n  &&  *s)
+    {
+      s = g_utf8_next_char(s);

Coding style:

 - single statement blocks do not need curly braces
 - missing space between function name and parenthesis

@@ +467,3 @@
+  if (s - src >= n)
+    {
+      /* We need to truncate; back up one. */

As above, coding style issues:

 - single statement blocks do not need curly braces
 - missing space between function name and parenthesis

::: glib/tests/utf8-misc.c
@@ +76,3 @@
 
+static void
+test_utf8_strlcpy (void)

Coding style throughout: missing space between function name and parenthesis.
Comment 8 Emmanuele Bassi (:ebassi) 2017-11-28 14:06:00 UTC
Review of attachment 364560:

::: glib/gunicode.h
@@ +775,3 @@
 
+GLIB_AVAILABLE_IN_2_56
+size_t   g_utf8_strlcat           (gchar       *dest,

Should be `gsize`.

::: glib/gutf8.c
@@ +500,3 @@
+ *
+ * Returns: Length in bytes of @src
+ **/

Missing `Since` annotation, and the gtk-doc stanza should close with `*/`.

@@ +501,3 @@
+ * Returns: Length in bytes of @src
+ **/
+size_t

This should be `gsize`.

@@ +504,3 @@
+g_utf8_strlcat (gchar       *dest,
+                const gchar *src,
+                size_t       n)

This should be `gsize`.
Comment 9 Emmanuele Bassi (:ebassi) 2017-11-28 14:06:31 UTC
Review of attachment 364561:

Looks good
Comment 10 Patrick Griffis (tingping) 2017-11-28 18:42:55 UTC
Created attachment 364584
Add g_utf8_strlcpy()
Comment 11 Patrick Griffis (tingping) 2017-11-28 18:43:09 UTC
Created attachment 364585
Add g_utf8_strlcat()
Comment 12 Patrick Griffis (tingping) 2017-11-28 18:48:27 UTC
Created attachment 364587
Add g_utf8_strlcat()
Comment 13 Philip Withnall 2018-02-03 11:14:16 UTC
Comment on attachment 364561
docs: Clarify dest requirements of g_utf8_strncpy()

I pushed the a_c-n patch with a minor wording tweak.

Attachment 364561 pushed as 1c0bed9 - docs: Clarify dest requirements of g_utf8_strncpy()
Comment 14 Philip Withnall 2018-02-03 11:18:59 UTC
Taking a step back, what’s the use case for these functions? Copying UTF-8 into fixed-size buffers: but who uses fixed-size buffers? i.e. Which applications/libraries are lined up to use this, and what’s the reason they’re not using g_strdup()? I’m sure there are good answers to all of these questions, but I’d rather not take more API into GLib without knowing them.
Comment 15 Philip Page 2018-02-03 22:57:23 UTC
When I initially needed it, we had fixed size columns in a database. We naturally wanted to preserve the validity of the UTF-8 string even if we had to truncate.

In the intervening nine years (!) we have moved on to C++ and don't use glib anymore.
Comment 16 Philip Withnall 2018-02-04 10:50:16 UTC
(In reply to Philip Page from comment #15)
> When I initially needed it, we had fixed size columns in a database. We
> naturally wanted to preserve the validity of the UTF-8 string even if we had
> to truncate.

Yeah, that makes sense. I’m not sure it’s a general enough use case for GLib to cater to, though.

> In the intervening nine years (!) we have moved on to C++ and don't use glib
> anymore.

Sorry for the delay, and thanks for replying. It’s useful to get this feedback.

Since the docs fix has been pushed, I’m going to close this as WONTFIX. Patrick, if you have a use case which is still relevant, please re-open the report with it and we can consider these APIs further.
Comment 17 Patrick Griffis (tingping) 2018-02-08 00:18:16 UTC
(In reply to Philip Withnall from comment #14)
> Taking a step back, what’s the use case for these functions? Copying UTF-8
> into fixed-size buffers: but who uses fixed-size buffers? i.e. Which
> applications/libraries are lined up to use this, and what’s the reason
> they’re not using g_strdup()? I’m sure there are good answers to all of
> these questions, but I’d rather not take more API into GLib without knowing
> them.

HexChat uses them extensively, though largely for legacy reasons. There are still some places interacting with fixed-limit protocols where you don't want to end up with invalid UTF-8, and at a glance there doesn't even seem to be an allocating API in GLib to copy valid UTF-8 up to a byte limit.
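As a rough sketch of what such an allocating helper might look like, here is a plain C version. The name `utf8_dup_max_bytes_sketch` is hypothetical and nothing here is an existing GLib function; it assumes `src` is valid, NUL-terminated UTF-8 and that `max_bytes` counts payload bytes only, excluding the terminator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of GLib): return a newly allocated copy of
 * src holding at most max_bytes bytes (terminator not counted), truncated at
 * a UTF-8 character boundary. Assumes src is valid UTF-8. The caller frees
 * the result with free(). Returns NULL if allocation fails. */
static char *
utf8_dup_max_bytes_sketch (const char *src, size_t max_bytes)
{
  size_t len = strlen (src);
  char *copy;

  if (len > max_bytes)
    {
      len = max_bytes;
      /* Back up while the first dropped byte is a continuation byte
       * (10xxxxxx), so the copy never ends in a partial character. */
      while (len > 0 && ((unsigned char) src[len] & 0xC0) == 0x80)
        len--;
    }

  copy = malloc (len + 1);
  if (copy == NULL)
    return NULL;

  memcpy (copy, src, len);
  copy[len] = '\0';

  return copy;
}
```

For a fixed-limit protocol field, the caller would pass the field's byte limit as `max_bytes`; a three-byte character that straddles the limit is dropped entirely rather than cut in half.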