GNOME Bugzilla – Bug 529773
Crash with non UTF-8 locales
Last modified: 2009-06-30 14:09:57 UTC
I'm trying to use gnome-screensaver with a pt_PT@euro locale, and I'm getting a segmentation fault. Looks like libgnomekbd is trying to set a tooltip text without making sure that the string is UTF-8. $ LC_ALL=pt_PT@euro gdb /usr/lib/gnome-screensaver/gnome-screensaver-dialog [...] Program received signal SIGSEGV, Segmentation fault.
+ Trace 196046
Thread 3073247008 (LWP 19435)
If I set LC_ALL to pt_PT.UTF-8 and then run the gnome-screensaver then it works perfectly. See also bug #511096.
Dupe of bug#449272.
libgnomekbd is using standard gettext for building the tooltips. Should it perform some extra conversion to make sure the result is in utf8? Also, in non-utf locales, why should the tooltip text be in utf? I do no have non-utf locales, so I cannot really reproduce the bug...
I've debugged a bit and I think I've found the problem (and it's in libxklavier actually). GTK uses UTF-8 internally no matter the user's locale, so gtk_widget_set_tooltip_text() expects a string encoded in UTF-8. To make gettext return UTF-8 strings, you have to force it using bind_textdomain_codeset(). Otherwise it will encode them in the user's locale. The problem is that libxklavier doesn't make that conversion, all strings are returned in the user's locale, so trying to pass them to GTK widgets is wrong. The solution is simple: make libxklavier force the output locale in gettext. I'm attaching a sample patch that does this. I've just tested it here and it seems to work.
Created attachment 117883 [details] [review] libxklavier: make gettext return strings in UTF-8
Very well spotted! But I would prefer to keep that fix in libgnomekbd (so that libxklavier would not depend on GTK ideas about utf8). Could you please put bind_textdomain_codeset into gkbd_indicator_global_init (in libgnomekbd/gkbd-indicator.c) and see whether it works for you or not? Thanks
Well, two things about that: - UTF-8 is not only a GTK thing, GLib uses UTF-8 for its strings too. As a programmer I would expect strings in XklConfigItem (being a GObject, so based on glib) to be in UTF-8 rather than the user's locale, as I would have to convert them to be able to use them. If everything is in UTF-8 the programmer doesn't need to care about encodings, GTK/Glib will do the dirty work. - To call bind_textdomain_codeset() from libgnomekbd I would need to know the value of XKB_DOMAIN, which is private in xklavier_config.c In other words, gettext is used _internally_ by libxklavier, it's an implementation detail and anyone using that library shouldn't care about how libxklavier gets the data. IMHO libgnomekbd shouldn't mess with the way libxklavier works internally, so calling bind_textdomain_codeset() from there doesn't look like a good idea to me :)
Well, making XKB_DOMAIN public is not a big issue as such. But in general your arguments make sense. I will think of it a bit more...
Thanks ! In apps or libraries based on GTK/Glib I would not store strings in an encoding different from UTF-8 as that encoding is expected everywhere. I wouldn't do it in general, let alone in a public API like this one. Even functions like g_print() or g_debug() expect their input to be in UTF-8. If I try this in my Latin-1 system: const char *str = "München\n"; /* This is encoded in Latin-1 */ printf(str); g_print(str); I'll get this output: München [Invalid UTF-8] M\xfcnchen So unless there's a good reason why libklavier should return strings in another encoding I would use UTF-8. If the user really needs them in a different encoding (which is very unlikely when using GLib-based code) they can be converted manually.
Ok, committed. Thanks for convincing me:)
Unfortunately things are not always that easy :( While this commit solves two crashes in at least gnome-screensaver and gnome-keyboard-applet at least, it introduces new bugs in gnome-keyboard-properties from the control center. The problem is that that dialog re-encodes the string again, so instead of displaying, say, "España" it displays "España". At least it is not a crash (neither one of the two programs I mentioned before could be used at all) so it is much better than what we had before, but the problem has to be fixed anyway. So I can think of two solutions: * Revert the last patch in libxklavier and change libgnomekbd to convert the strings to utf-8. It's probably enough to change these calls: *sld = g_strdup (item->short_description); *lld = g_strdup (item->description); *svd = g_strdup (item->short_description); *lvd = g_strdup (item->description); in gkb-desktop-config.c to use g_locale_to_utf8() instead of g_strdup(), but I haven't tested it yet. * Change gnome-keyboard-properties to assume that those strings are in UTF-8. The second option looks saner to me (because of what I explained in previous comments), but I haven't seen the code much to check if it's difficult or not. Anyway I've just checked the documentation of libxklavier and I don't see any information about the encoding of the strings in XklConfigItem, so if it's not going to be UTF-8 I'd state it clearly there to avoid this kind of problems.
Just tested it, replacing g_strdup() with g_locale_to_utf8() solved the problem, at least for this case, but I don't think it's a good idea in the end because with this we have strings in different locales depending on the place. I'd go for my original proposal, which is: everything in UTF-8 from the beginning (that is, from the call to gettext()) and no charset conversion at all. For the problem in gnome-keyboard-properties I think it's enough if we replace the call to xci_desc_to_utf8() (no conversion is needed at all if everything is in UTF-8 from the start). I can test it here if you want to, but I'd like to hear your opinion.
Alberto, would you try replacing a call of g_locale_to_utf8 inside xci_desc_to_utf8 to g_strdup? Will it work for you?
Also, in xkb_layout_description_utf8 you need to make similar replacement.
Sure, I'll try in a few hours. Also, I think that we could even maintain backward compatibility with something like this: if (g_utf8_validate (string)) return g_strdup (string); else return g_locale_to_utf8 (string);
Created attachment 118050 [details] [review] GNOME keyboard properties: don't re-encode strings that are already UTF-8 I've just tested, it works fine, now everything appears correctly. I'm attaching the patch. I've also added a test that checks whether the string is UTF-8 or not, to avoid surprises when using and unpatched version of libxklavier.
Thanks for the patch, hopefully we'll close the issue. The only thing is that I don't want to use g_utf8_validate - I just will use strdup (enforcing switch to updated libxklavier, to be released ASAP).
As you wish, but take into account that if you use an unpatched version of libxklavier you'll most likely get a segmentation fault.
I think it is ok. We are talking about unstable version of GNOME. When 2.24 is out, the new version of libxklavier will be available for sure.
Committed. Please check from svn.
I've just seen it, it looks correct, thanks ! :)
*** Bug 523583 has been marked as a duplicate of this bug. ***
*** Bug 559010 has been marked as a duplicate of this bug. ***
*** Bug 559060 has been marked as a duplicate of this bug. ***
*** Bug 562787 has been marked as a duplicate of this bug. ***
*** Bug 562921 has been marked as a duplicate of this bug. ***
*** Bug 563086 has been marked as a duplicate of this bug. ***
*** Bug 563287 has been marked as a duplicate of this bug. ***
*** Bug 564489 has been marked as a duplicate of this bug. ***
*** Bug 567017 has been marked as a duplicate of this bug. ***
*** Bug 564358 has been marked as a duplicate of this bug. ***
*** Bug 565601 has been marked as a duplicate of this bug. ***
*** Bug 580210 has been marked as a duplicate of this bug. ***
*** Bug 508758 has been marked as a duplicate of this bug. ***