GNOME Bugzilla – Bug 789755
g_get_host_name: ensure return value is always UTF8 encoded
Last modified: 2017-11-06 10:33:05 UTC
The attached patch ensures that g_get_host_name always returns a UTF8 encoded string on Windows. Previously the string would be encoded in the current codepage, which is not very nice to work with in a platform independent manner...
Created attachment 362730 [details] [review] g_get_host_name returns always a UTF8 encoded string
Created attachment 362731 [details] [review] g_get_host_name returns always a UTF8 encoded string
Review of attachment 362731 [details] [review]: A few comments. Can you also please amend the documentation for g_get_host_name() to clarify that the return value is always UTF-8 encoded, and add a test to glib/tests/utils.c which checks that the return value is valid UTF-8? Technically this is a behaviour change which could be considered an API break, although I think it’s safe to make this change: • Not that many people are going to have used g_get_host_name() on Windows • Almost all host names are ASCII anyway • The documentation never specified the encoding, and no GIR annotation was ever added saying it was not a UTF-8 string (by default, all strings in GLib are UTF-8 unless otherwise specified) ::: glib/gutils.c @@ +988,3 @@ + gchar *tmp = g_malloc (sizeof (gchar) * 100); + failed = (gethostname (tmp, sizeof (gchar) * 100) == -1); + utmp = tmp; There doesn’t seem to be a need to change the non-Windows code to add an allocation there; please don’t do that. @@ +994,3 @@ + failed = (!GetComputerNameW (tmp, &size)); + if (!failed) + utmp = g_utf16_to_utf8 (tmp, -1, NULL, NULL, NULL); If this conversion fails, utmp will be NULL, but failed will be FALSE, which will cause problems with the g_once_init_leave() call below. Also, instead of passing -1 in, GetComputerNameW will have set `size` appropriately on output, so you could pass that in as the second argument to g_utf16_to_utf8().
Matthias, what do you think about this being an API break or not?
Created attachment 362742 [details] [review] g_get_host_name returns always a UTF8 encoded string The updated patch addresses your comments except the allocation, which I think is necessary to avoid a memory leak as I would g_strdup the return value of g_utf16_to_utf8 otherwise in the call to g_once_init_leave().
Review of attachment 362742 [details] [review]: Looks good to me. Waiting for feedback from another maintainer about the potential API break.
Ok thanks for your review. By the way, if you have a moment, I would be grateful if you could have a look at two other tickets I created that have gone stale :-) https://bugzilla.gnome.org/show_bug.cgi?id=777308 https://bugzilla.gnome.org/show_bug.cgi?id=777310 Thanks in advance!
Discussed on IRC with Matthias and Emmanuele, and there seem to be no objections to this behaviour change.
Thanks a lot!
I just pushed 0d49cd1b11b1bd0bae2671da9ddd83a6227eb947 to fix a minor memory leak if (error == TRUE) in the non-Windows path here.