GNOME Bugzilla – Bug 131218
g_logv crashes if argument is non-utf8
Last modified: 2011-02-18 16:07:18 UTC
While using Evolution 1.5.2 on GNOME 2.4, I am getting a crash as shown in the trace below ...

(gdb) thread 9
[Switching to thread 9 (Thread 1116949808 (LWP 576))]#0 0xffffe002 in ?? ()
(gdb) bt
Trace 43162
I did some analysis on this and found that the crash occurs because the value of the variable "length" in g_string_erase is miscalculated. The value gets jumbled because we pass a non-UTF-8 character as input to g_strdup_vprintf in g_logv. I think we need to check the arguments and convert them to UTF-8 before passing them to g_strdup_vprintf.
What I meant in the last statement above was: "we need to check whether the argument is UTF-8, and convert it to UTF-8 if it is not, before passing it to g_strdup_vprintf".
You cannot convert an arbitrary byte string to UTF-8. At best you could detect that it's not valid UTF-8 and display the offending bytes as hex or octal escapes. I guess this is what escape_string should do. I think we should do this before 2.4, since it is bad when logging crashes, and this is new behaviour which got introduced with escape_string() after 2.2.
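To make the "detect and escape" idea concrete, here is a minimal standalone sketch in plain C. It is not the GLib code: utf8_seq_len() and escape_invalid_utf8() are hypothetical helper names, and the validator is hand-rolled so the example builds without GLib. Every byte that does not start a valid UTF-8 sequence is written out as a \xNN hex escape, and valid sequences are copied through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: length (1-4) of the valid UTF-8 sequence at s,
 * or 0 if the bytes at s are not valid UTF-8. s must be NUL-terminated. */
static size_t utf8_seq_len(const unsigned char *s)
{
    unsigned char c = s[0];
    unsigned long cp;
    size_t len, i;

    if (c < 0x80)
        return 1;
    else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
    else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
    else
        return 0; /* stray continuation byte or invalid lead byte */

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0; /* truncated sequence; also stops at the NUL */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates, and values above U+10FFFF. */
    if ((len == 2 && cp < 0x80) || (len == 3 && cp < 0x800) ||
        (len == 4 && cp < 0x10000))
        return 0;
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;
    return len;
}

/* Hypothetical helper: copy src into dst, replacing each invalid byte
 * with a 4-character \xNN escape. Returns 0 on success, -1 if dst is
 * too small. */
static int escape_invalid_utf8(const char *src, char *dst, size_t dst_size)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t used = 0;

    assert(src != NULL && dst != NULL);

    while (*p) {
        size_t len = utf8_seq_len(p);
        if (len == 0) {
            if (dst_size - used < 5) /* 4 characters plus the NUL */
                return -1;
            snprintf(dst + used, dst_size - used, "\\x%02x", (unsigned int)*p);
            used += 4;
            p += 1;
        } else {
            if (dst_size - used < len + 1)
                return -1;
            memcpy(dst + used, p, len);
            used += len;
            p += len;
        }
    }
    if (used >= dst_size)
        return -1;
    dst[used] = '\0';
    return 0;
}
```

With this sketch, a Latin-1 "café" (final byte 0xE9) comes out as "caf" followed by the four characters \xe9, while the UTF-8 encoding of the same word is left alone.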
Created attachment 23915: a patch
I don't think the patch is quite right. From g_log_default_handler():

    if (g_get_charset (&charset))
      g_string_append (gstring, message); /* charset is UTF-8 already */
    else
      {
        string = strdup_convert (message, charset);
        g_string_append (gstring, string);
        g_free (string);
      }
    escape_string (gstring);

So we're actually calling escape_string, which expects UTF-8, *after* we convert to the locale charset! That needs to be cleared up. Note also that in the non-UTF-8 case, where we call strdup_convert, we're already checking for valid UTF-8, and there is code to handle invalid UTF-8 as escape sequences.
Created attachment 25175: a new patch
The new patch moves escape_string to before the conversion, so that the string is still (possibly invalid) UTF-8.
Applied with a couple of changes:

    - g_utf8_get_char_validated (p, 3);
    + g_utf8_get_char_validated (p, -1);

(UTF-8 sequences can be longer than 3 bytes, and -1 does the right thing for a null-terminated string.)

    - tmp = g_strdup_printf ("\\x%02hhx", (guint)*p);
    + tmp = g_strdup_printf ("\\x%02x", (guint)*p);

hh is C99. I also changed strdup_convert() to use hex rather than octal to match.

Sun Mar 14 13:56:48 2004  Owen Taylor  <otaylor@redhat.com>

    * glib/gmessages.c (escape_string): Handle invalid UTF-8. (#131218, patch from Matthias Clasen)
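One general C point about dropping the hh modifier, sketched below under an assumption: the quoted diff does not show how p is declared, so this example assumes a plain char, which is signed on many platforms. In that case a high byte such as 0xE9 is negative, and converting it straight to unsigned int yields a large value that %02x prints with eight digits. Narrowing through unsigned char first keeps the escape at two digits without needing C99. format_byte_escape() is a hypothetical name, and snprintf stands in for g_strdup_printf so the sketch builds without GLib.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format one byte as a \xNN escape into buf.
 * buf must hold at least 5 bytes (4 characters plus the NUL).
 * The cast through unsigned char avoids sign extension when plain
 * char is signed, so no C99 "hh" length modifier is needed. */
static void format_byte_escape(char byte, char *buf, size_t buf_size)
{
    snprintf(buf, buf_size, "\\x%02x", (unsigned int)(unsigned char)byte);
}
```

Calling format_byte_escape((char)0xE9, buf, sizeof buf) fills buf with the four characters \xe9 regardless of whether char is signed on the target.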
*** Bug 139441 has been marked as a duplicate of this bug. ***
Reopening, since bug 139441 has a systematic crash in escape_string when using the zh_CN.GBK locale.
Reclosing. Bug 139030 covers the new crash.