After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 529773 - Crash with non UTF-8 locales
Crash with non UTF-8 locales
Status: RESOLVED FIXED
Product: libgnomekbd
Classification: Core
Component: Indicator
2.22.x
Other Linux
: Normal normal
: ---
Assigned To: libgnomekbd maintainers
Sergey V. Udaltsov
: 508758 523583 559010 559060 562787 562921 563086 563287 564358 564489 565601 567017 580210 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2008-04-24 20:07 UTC by Alberto Garcia
Modified: 2009-06-30 14:09 UTC
See Also:
GNOME target: ---
GNOME version: 2.21/2.22


Attachments
libxklavier: make gettext return strings in UTF-8 (617 bytes, patch)
2008-09-02 21:57 UTC, Alberto Garcia
none Details | Review
GNOME keyboard properties: don't re-encode strings that are already UTF-8 (1.62 KB, patch)
2008-09-04 21:10 UTC, Alberto Garcia
none Details | Review

Description Alberto Garcia 2008-04-24 20:07:05 UTC
I'm trying to use gnome-screensaver with a pt_PT@euro locale, and I'm
getting a segmentation fault. Looks like libgnomekbd is trying to set
a tooltip text without making sure that the string is UTF-8.

$ LC_ALL=pt_PT@euro gdb /usr/lib/gnome-screensaver/gnome-screensaver-dialog 
[...]
Program received signal SIGSEGV, Segmentation fault.

Thread 3073247008 (LWP 19435)

  • #0 g_markup_escape_text
    from /usr/lib/libglib-2.0.so.0
  • #1 gtk_widget_set_property
    at /tmp/buildd/gtk+2.0-2.12.9/gtk/gtkwidget.c line 2447
  • #2 g_object_set_valist
    from /usr/lib/libgobject-2.0.so.0
  • #3 g_object_set
    from /usr/lib/libgobject-2.0.so.0
  • #4 IA__gtk_widget_set_tooltip_text
    at /tmp/buildd/gtk+2.0-2.12.9/gtk/gtkwidget.c line 9609
  • #5 gkbd_indicator_set_tooltips
    at gkbd-indicator.c line 99
  • #6 gkbd_indicator_update_tooltips
    at gkbd-indicator.c line 356
  • #7 gkbd_indicator_set_current_page_for_group
    at gkbd-indicator.c line 517
  • #8 gkbd_indicator_set_current_page
    at gkbd-indicator.c line 505
  • #9 gkbd_indicator_init
    at gkbd-indicator.c line 624
  • #10 g_type_create_instance
    from /usr/lib/libgobject-2.0.so.0
  • #11 ??
    from /usr/lib/libgobject-2.0.so.0
  • #12 ??
  • #13 ??

If I set LC_ALL to pt_PT.UTF-8 and then run the gnome-screensaver then it works perfectly.

See also bug #511096.
Comment 1 Josselin Mouette 2008-06-25 15:36:13 UTC
Dupe of bug#449272.
Comment 2 Sergey V. Udaltsov 2008-09-02 20:10:25 UTC
libgnomekbd is using standard gettext for building the tooltips. Should it perform some extra conversion to make sure the result is in utf8? Also, in non-utf locales, why should the tooltip text be in utf?

I do no have non-utf locales, so I cannot really reproduce the bug...
Comment 3 Alberto Garcia 2008-09-02 21:57:15 UTC
I've debugged a bit and I think I've found the problem (and it's in libxklavier actually).

GTK uses UTF-8 internally no matter the user's locale, so 
gtk_widget_set_tooltip_text() expects a string encoded in UTF-8.

To make gettext return UTF-8 strings, you have to force it using bind_textdomain_codeset(). Otherwise it will encode them in the user's locale.

The problem is that libxklavier doesn't make that conversion, all strings are returned in the user's locale, so trying to pass them to GTK widgets is wrong.

The solution is simple: make libxklavier force the output locale in gettext. I'm attaching a sample patch that does this. I've just tested it here and it seems to work.
Comment 4 Alberto Garcia 2008-09-02 21:57:47 UTC
Created attachment 117883 [details] [review]
libxklavier: make gettext return strings in UTF-8
Comment 5 Sergey V. Udaltsov 2008-09-02 22:24:01 UTC
Very well spotted! But I would prefer to keep that fix in libgnomekbd (so that libxklavier would not depend on GTK ideas about utf8). Could you please put bind_textdomain_codeset into gkbd_indicator_global_init (in libgnomekbd/gkbd-indicator.c) and see whether it works for you or not?

Thanks
Comment 6 Alberto Garcia 2008-09-02 22:56:54 UTC
Well, two things about that:

   - UTF-8 is not only a GTK thing, GLib uses UTF-8 for its strings too.
     As a programmer I would expect strings in XklConfigItem (being a
     GObject, so based on glib) to be in UTF-8 rather than the user's locale,
     as I would have to convert them to be able to use them. If everything is
     in UTF-8 the programmer doesn't need to care about encodings, GTK/Glib
     will do the dirty work.

   - To call bind_textdomain_codeset() from libgnomekbd I would need to know
     the value of XKB_DOMAIN, which is private in xklavier_config.c
     In other words, gettext is used _internally_ by libxklavier, it's an
     implementation detail and anyone using that library shouldn't care about
     how libxklavier gets the data. IMHO libgnomekbd shouldn't mess with the
     way libxklavier works internally, so calling bind_textdomain_codeset()
     from there doesn't look like a good idea to me :)
Comment 7 Sergey V. Udaltsov 2008-09-02 23:25:29 UTC
Well, making XKB_DOMAIN public is not a big issue as such. But in general your arguments make sense. I will think of it a bit more...
Comment 8 Alberto Garcia 2008-09-02 23:57:18 UTC
Thanks !

In apps or libraries based on GTK/Glib I would not store strings in an
encoding different from UTF-8 as that encoding is expected
everywhere. I wouldn't do it in general, let alone in a public API
like this one.

Even functions like g_print() or g_debug() expect their input to be in
UTF-8.

If I try this in my Latin-1 system:

  const char *str = "München\n";  /* This is encoded in Latin-1 */
  printf(str);
  g_print(str);

I'll get this output:

  München
  [Invalid UTF-8] M\xfcnchen

So unless there's a good reason why libklavier should return strings
in another encoding I would use UTF-8.

If the user really needs them in a different encoding (which is very
unlikely when using GLib-based code) they can be converted manually.
Comment 9 Sergey V. Udaltsov 2008-09-03 20:28:42 UTC
Ok, committed. Thanks for convincing me:)
Comment 10 Alberto Garcia 2008-09-04 00:35:19 UTC
Unfortunately things are not always that easy :(

While this commit solves two crashes in at least gnome-screensaver and
gnome-keyboard-applet at least, it introduces new bugs in
gnome-keyboard-properties from the control center.

The problem is that that dialog re-encodes the string again, so
instead of displaying, say, "España" it displays "España".

At least it is not a crash (neither one of the two programs I
mentioned before could be used at all) so it is much better than what
we had before, but the problem has to be fixed anyway.

So I can think of two solutions:

   * Revert the last patch in libxklavier and change libgnomekbd to
     convert the strings to utf-8. It's probably enough to change
     these calls:

        *sld = g_strdup (item->short_description);
        *lld = g_strdup (item->description);
        *svd = g_strdup (item->short_description);
        *lvd = g_strdup (item->description);

    in gkb-desktop-config.c to use g_locale_to_utf8() instead of
    g_strdup(), but I haven't tested it yet.

  * Change gnome-keyboard-properties to assume that those strings are
    in UTF-8.

The second option looks saner to me (because of what I explained in
previous comments), but I haven't seen the code much to check if it's
difficult or not.

Anyway I've just checked the documentation of libxklavier and I don't
see any information about the encoding of the strings in
XklConfigItem, so if it's not going to be UTF-8 I'd state it clearly
there to avoid this kind of problems.
Comment 11 Alberto Garcia 2008-09-04 01:23:28 UTC
Just tested it, replacing g_strdup() with g_locale_to_utf8() solved the problem, at least for this case, but I don't think it's a good idea in the end because with this we have strings in different locales depending on the place.

I'd go for my original proposal, which is: everything in UTF-8 from the beginning (that is, from the call to gettext()) and no charset conversion at all.

For the problem in gnome-keyboard-properties I think it's enough if we replace the call to xci_desc_to_utf8() (no conversion is needed at all if everything is in UTF-8 from the start).

I can test it here if you want to, but I'd like to hear your opinion.
Comment 12 Sergey V. Udaltsov 2008-09-04 18:39:38 UTC
Alberto, would you try replacing a call of g_locale_to_utf8 inside xci_desc_to_utf8 to g_strdup? Will it work for you?
Comment 13 Sergey V. Udaltsov 2008-09-04 18:41:10 UTC
Also, in xkb_layout_description_utf8 you need to make similar replacement.
Comment 14 Alberto Garcia 2008-09-04 19:02:10 UTC
Sure, I'll try in a few hours. Also, I think that we could even maintain backward compatibility with something like this:

if (g_utf8_validate (string))
   return g_strdup (string);
 else
   return g_locale_to_utf8 (string);
Comment 15 Alberto Garcia 2008-09-04 21:10:59 UTC
Created attachment 118050 [details] [review]
GNOME keyboard properties: don't re-encode strings that are already UTF-8

I've just tested, it works fine, now everything appears correctly. I'm attaching the patch.

I've also added a test that checks whether the string is UTF-8 or not, to avoid surprises when using and unpatched version of libxklavier.
Comment 16 Sergey V. Udaltsov 2008-09-04 22:23:10 UTC
Thanks for the patch, hopefully we'll close the issue. The only thing is that I don't want to use g_utf8_validate - I just will use strdup (enforcing switch to updated libxklavier, to be released ASAP).
Comment 17 Alberto Garcia 2008-09-04 22:30:20 UTC
As you wish, but take into account that if you use an unpatched version of libxklavier you'll most likely get a segmentation fault.
Comment 18 Sergey V. Udaltsov 2008-09-05 19:26:10 UTC
I think it is ok. We are talking about unstable version of GNOME. When 2.24 is out, the new version of libxklavier will be available for sure.
Comment 19 Sergey V. Udaltsov 2008-09-05 23:30:28 UTC
Committed. Please check from svn.
Comment 20 Alberto Garcia 2008-09-05 23:40:08 UTC
I've just seen it, it looks correct, thanks ! :)
Comment 21 Sergey V. Udaltsov 2008-09-09 19:58:49 UTC
*** Bug 523583 has been marked as a duplicate of this bug. ***
Comment 22 Susana 2008-11-03 19:44:52 UTC
*** Bug 559010 has been marked as a duplicate of this bug. ***
Comment 23 Susana 2008-11-03 19:45:49 UTC
*** Bug 559060 has been marked as a duplicate of this bug. ***
Comment 24 palfrey 2008-12-01 01:02:51 UTC
*** Bug 562787 has been marked as a duplicate of this bug. ***
Comment 25 palfrey 2008-12-02 01:45:28 UTC
*** Bug 562921 has been marked as a duplicate of this bug. ***
Comment 26 palfrey 2008-12-03 13:33:46 UTC
*** Bug 563086 has been marked as a duplicate of this bug. ***
Comment 27 palfrey 2008-12-05 14:01:08 UTC
*** Bug 563287 has been marked as a duplicate of this bug. ***
Comment 28 palfrey 2008-12-15 11:25:49 UTC
*** Bug 564489 has been marked as a duplicate of this bug. ***
Comment 29 palfrey 2009-01-08 14:55:47 UTC
*** Bug 567017 has been marked as a duplicate of this bug. ***
Comment 30 palfrey 2009-01-08 14:56:28 UTC
*** Bug 564358 has been marked as a duplicate of this bug. ***
Comment 31 palfrey 2009-01-08 14:56:28 UTC
*** Bug 565601 has been marked as a duplicate of this bug. ***
Comment 32 palfrey 2009-04-27 14:49:50 UTC
*** Bug 580210 has been marked as a duplicate of this bug. ***
Comment 33 Tobias Mueller 2009-06-30 14:09:57 UTC
*** Bug 508758 has been marked as a duplicate of this bug. ***