GNOME Bugzilla – Bug 520116
g_utf8_strlcpy()
Last modified: 2018-02-08 00:18:16 UTC
Works like g_strlcpy(), but doesn't copy a partial character at the end of the buffer. It copies whole UTF-8 characters only, as many as fit.
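The behaviour described above can be sketched in plain C as follows. This is not the attached implementation; it is a minimal illustration that assumes @src is already valid, NUL-terminated UTF-8, and it uses standard C types rather than GLib's so that it stands alone. The names utf8_strlcpy_sketch and is_continuation_byte are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true for UTF-8 continuation bytes (10xxxxxx). */
static int
is_continuation_byte (unsigned char c)
{
  return (c & 0xC0) == 0x80;
}

/* Sketch only, assuming src is valid UTF-8.
 * Copies as many whole characters as fit in dest_size bytes, including
 * the terminating NUL. Returns strlen (src), like strlcpy(), so the
 * caller can detect truncation by comparing against dest_size. */
static size_t
utf8_strlcpy_sketch (char *dest, const char *src, size_t dest_size)
{
  size_t src_len;
  size_t n;

  assert (dest != NULL && src != NULL);

  src_len = strlen (src);
  if (dest_size == 0)
    return src_len;

  /* Largest byte count that leaves room for the NUL. */
  n = src_len < dest_size ? src_len : dest_size - 1;

  /* src[n] is the first byte that will not be copied. If it is a
   * continuation byte, the cut falls inside a character, so back up
   * to the start of that character. */
  while (n > 0 && is_continuation_byte ((unsigned char) src[n]))
    n--;

  memcpy (dest, src, n);
  dest[n] = '\0';
  return src_len;
}
```

For example, copying "h\xC3\xA9llo" (six bytes) into a three-byte buffer leaves room for two bytes of text, which would split the two-byte "é"; the sketch stores just "h" and returns 6.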
Created attachment 121880 [details] Proposed implementation
Created attachment 121881 [details] Test cases.
Created attachment 364558 [details] [review] docs: Clarify dest requirements of g_utf8_strncpy()
Created attachment 364559 [details] [review] Add g_utf8_strlcpy()
Created attachment 364560 [details] [review] Add g_utf8_strlcat()
Created attachment 364561 [details] [review] docs: Clarify dest requirements of g_utf8_strncpy()
Review of attachment 364559 [details] [review]:

::: glib/gutf8.c
@@ +458,3 @@
+g_utf8_strlcpy (gchar *dest,
+                const gchar *src,
+                size_t n)

This should be `gsize`.

@@ +460,3 @@
+                size_t n)
+{
+  register const gchar *s = src;

`register` is not really used, unless you're targeting a compiler from the '90s.

@@ +463,3 @@
+  while (s - src < n && *s)
+    {
+      s = g_utf8_next_char(s);

Coding style:
 - single statement blocks do not need curly braces
 - missing space between function name and parenthesis

@@ +467,3 @@
+  if (s - src >= n)
+    {
+      /* We need to truncate; back up one. */

As above, coding style issues:
 - single statement blocks do not need curly braces
 - missing space between function name and parenthesis

::: glib/tests/utf8-misc.c
@@ +76,3 @@
+static void
+test_utf8_strlcpy (void)

Coding style throughout: missing space between function name and parenthesis.
Review of attachment 364560 [details] [review]:

::: glib/gunicode.h
@@ +775,3 @@
+GLIB_AVAILABLE_IN_2_56
+size_t g_utf8_strlcat (gchar *dest,

Should be `gsize`.

::: glib/gutf8.c
@@ +500,3 @@
+ *
+ * Returns: Length in bytes of @src
+ **/

Missing `Since` annotation, and the gtk-doc stanza should close with `*/`.

@@ +501,3 @@
+ * Returns: Length in bytes of @src
+ **/
+size_t

This should be `gsize`.

@@ +504,3 @@
+g_utf8_strlcat (gchar *dest,
+                const gchar *src,
+                size_t n)

This should be `gsize`.
Review of attachment 364561 [details] [review]: Looks good
Created attachment 364584 [details] [review] Add g_utf8_strlcpy()
Created attachment 364585 [details] [review] Add g_utf8_strlcat()
Created attachment 364587 [details] [review] Add g_utf8_strlcat()
Comment on attachment 364561 [details] [review]
docs: Clarify dest requirements of g_utf8_strncpy()

I pushed the a_c-n patch with a minor wording tweak.

Attachment 364561 [details] pushed as 1c0bed9 - docs: Clarify dest requirements of g_utf8_strncpy()
Taking a step back, what’s the use case for these functions? Copying UTF-8 into fixed-size buffers: but who uses fixed-size buffers? i.e. Which applications/libraries are lined up to use this, and what’s the reason they’re not using g_strdup()? I’m sure there are good answers to all of these questions, but I’d rather not take more API into GLib without knowing them.
When I initially needed it, we had fixed size columns in a database. We naturally wanted to preserve the validity of the UTF-8 string even if we had to truncate. In the intervening nine years (!) we have moved on to C++ and don't use glib anymore.
(In reply to Philip Page from comment #15)
> When I initially needed it, we had fixed size columns in a database. We
> naturally wanted to preserve the validity of the UTF-8 string even if we had
> to truncate.

Yeah, that makes sense. I’m not sure it’s a general enough use case for GLib to cater to, though.

> In the intervening nine years (!) we have moved on to C++ and don't use glib
> anymore.

Sorry for the delay, and thanks for replying. It’s useful to get this feedback.

Since the docs fix has been pushed, I’m going to close this as WONTFIX. Patrick, if you have a use case which is still relevant, please re-open the report with it and we can consider these APIs further.
(In reply to Philip Withnall from comment #14)
> Taking a step back, what’s the use case for these functions? Copying UTF-8
> into fixed-size buffers: but who uses fixed-size buffers? i.e. Which
> applications/libraries are lined up to use this, and what’s the reason
> they’re not using g_strdup()? I’m sure there are good answers to all of
> these questions, but I’d rather not take more API into GLib without knowing
> them.

HexChat uses fixed-size buffers extensively, though largely for legacy reasons. There are some places interacting with fixed-limit protocols where you still don't want invalid UTF-8, and at a glance there doesn't even seem to be an allocating API in GLib to copy valid UTF-8 up to a byte limit.
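To make the "allocating copy up to a byte limit" idea concrete, here is a minimal sketch in plain C. It is not an existing GLib function: the name utf8_dup_max_bytes_sketch is hypothetical, the input is assumed to be valid NUL-terminated UTF-8, and malloc() stands in for GLib's allocator so the example is self-contained.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only, assuming src is valid UTF-8.
 * Returns a newly allocated copy of src holding at most max_bytes bytes
 * of text (plus the NUL), truncated on a character boundary.
 * Returns NULL if allocation fails. The caller frees the result. */
static char *
utf8_dup_max_bytes_sketch (const char *src, size_t max_bytes)
{
  size_t n;
  char *copy;

  assert (src != NULL);

  n = strlen (src);
  if (n > max_bytes)
    {
      n = max_bytes;
      /* src[n] is the first byte left out. While it is a continuation
       * byte (10xxxxxx), the cut is inside a character: back up. */
      while (n > 0 && ((unsigned char) src[n] & 0xC0) == 0x80)
        n--;
    }

  copy = malloc (n + 1);
  if (copy == NULL)
    return NULL;

  memcpy (copy, src, n);
  copy[n] = '\0';
  return copy;
}
```

With a protocol field limited to, say, four bytes, the three-byte euro sign followed by "uro" would be cut to the euro sign plus "u"; with a two-byte limit the result is the empty string, since not even the first character fits.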