GNOME Bugzilla – Bug 484653
imxim string conversion callback should pass wide-char surrounding
Last modified: 2009-06-05 13:28:32 UTC
The string conversion callback in imxim (see Bug #101814 for the initial patch) currently passes multi-byte surrounding text back to XIM. This appears to be ambiguous, as multi-byte text can be in any encoding. Unfortunately, Thai XIM in xorg assumes it to be TIS-620, but the g_locale_from_utf8() conversion produces UTF-8 in UTF-8 locales. As a result, Thai XIM won't work properly with GTK+ apps in the th_TH.UTF-8 locale, for example. It should be safer to pass the text as wide-char instead.
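To make the ambiguity concrete, here is a minimal sketch in plain C (not taken from the patch; the helper names `encode_tis620` and `encode_utf8` are hypothetical) showing that the same Thai character has different byte sequences in TIS-620 and UTF-8. A consumer that assumes TIS-620 would misread the UTF-8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Thai code point as a single TIS-620 byte.
 * Returns 1 on success, 0 if the code point has no TIS-620 mapping. */
int encode_tis620(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    /* TIS-620 maps U+0E01..U+0E3A to 0xA1..0xDA and U+0E3F..U+0E5B to 0xDF..0xFB. */
    if ((cp >= 0x0E01 && cp <= 0x0E3A) || (cp >= 0x0E3F && cp <= 0x0E5B)) {
        *out = (unsigned char)(cp - 0x0E00 + 0xA0);
        return 1;
    }
    return 0;
}

/* Hypothetical helper: encode one code point as UTF-8.
 * Returns the number of bytes written (1..4), or 0 for an invalid code point. */
size_t encode_utf8(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

For U+0E01 (THAI CHARACTER KO KAI), the TIS-620 form is the single byte 0xA1, while the UTF-8 form is the three bytes 0xE0 0xB8 0x81.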
Created attachment 96868: patch passing surrounding text as wide-char
Does that patch assume that wchar_t is four bytes? Is that true for all compilers/platforms involved? (It isn't true on Windows, but that is of course not relevant for XIM stuff.)
(In reply to comment #2)
> Does that patch assume that wchar_t is four bytes?

I don't think so. For example, I avoided memcpy() and used a member-wise copy instead, and sizeof() is used when calculating the buffer size. The rest is based on the X protocol headers. Please correct me if I have still missed something.
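As an illustration of the approach described here (a sketch only, not the code from the attachment; `copy_ucs4_to_wchar` is a hypothetical name, and `uint32_t` stands in for GLib's gunichar), a member-wise copy with a sizeof()-based allocation might look like this:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: copy n UCS-4 code points into a newly allocated,
 * zero-terminated wchar_t buffer. The caller frees the result. */
wchar_t *copy_ucs4_to_wchar(const uint32_t *src, size_t n)
{
    assert(src != NULL);

    /* Size the buffer with sizeof(wchar_t) rather than a hard-coded 4. */
    wchar_t *dst = malloc((n + 1) * sizeof(wchar_t));
    if (dst == NULL)
        return NULL;

    /* Element-by-element assignment converts each value to wchar_t.
     * A memcpy() would only be correct when sizeof(wchar_t) == 4. */
    for (size_t i = 0; i < n; i++)
        dst[i] = (wchar_t)src[i];
    dst[n] = L'\0';

    return dst;
}
```

This keeps the buffer size and the element layout correct for any wchar_t width. It does not by itself preserve values above 0xFFFF when wchar_t is two bytes, since the cast truncates them.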
I think the code won't work properly for non-BMP characters if wchar_t is two bytes (and wchar_t strings in XIM are supposed to be in UTF-16). Are there such platforms/XIM implementations? If yes, your code probably needs some ifdefs: when sizeof(wchar_t)==2, you should call g_utf8_to_utf16() instead of g_utf8_to_ucs4(), and "text" should then be wchar_t* instead of gunichar*.
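The branch suggested above can be sketched in plain C as follows. This is an illustration under stated assumptions, not the eventual patch: it works on code points that are already decoded (where the real code would call g_utf8_to_ucs4() or g_utf8_to_utf16()), the names `encode_utf16` and `ucs4_to_wchar` are hypothetical, and it uses an ordinary `if` on sizeof(wchar_t) because sizeof cannot be evaluated in a preprocessor `#if`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: encode one code point as UTF-16.
 * Returns the number of 16-bit units written (1 or 2), or 0 if invalid. */
size_t encode_utf16(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    /* Non-BMP: split the 20-bit offset into a surrogate pair. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));
    return 2;
}

/* Hypothetical helper: store n code points into dst (capacity cap units),
 * as UTF-16 when wchar_t is two bytes and as UCS-4 otherwise.
 * Returns the number of wchar_t units written, or (size_t)-1 on error. */
size_t ucs4_to_wchar(const uint32_t *src, size_t n, wchar_t *dst, size_t cap)
{
    assert(src != NULL && dst != NULL);
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (sizeof(wchar_t) == 2) {
            uint16_t units[2];
            size_t k = encode_utf16(src[i], units);
            if (k == 0 || used + k > cap)
                return (size_t)-1;
            for (size_t j = 0; j < k; j++)
                dst[used++] = (wchar_t)units[j];
        } else {
            if (used + 1 > cap)
                return (size_t)-1;
            dst[used++] = (wchar_t)src[i];
        }
    }
    return used;
}
```

With a two-byte wchar_t, a non-BMP character such as U+1F600 occupies two units (0xD83D 0xDE00); with a four-byte wchar_t it occupies one.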
AFAIK, Thai XIM in libX11 seems to be the only implementation that uses this callback, but there can certainly be other XIM servers implemented as separate processes. So I now wonder what an XIM server implementor should expect the wide-char string from the client to be encoded in: UCS4 or UTF-16. Note that the XIM server can be across the network, so its wchar_t can differ from that of the machine the client is running on. In the particular case of Thai XIM, though, the server and client are fortunately in the same process, so a different wchar_t size is not a problem.
Created attachment 96976: Patch using utf-16 for 2-byte wchar_t
Ignoring the protocol question, this should work for Thai XIM.
Alternatively, I have also filed a bug against xorg to make Thai XIM convert the multi-byte text based on the locale before using it: https://bugs.freedesktop.org/show_bug.cgi?id=12759 (Toolkits other than GTK+ probably still pass multi-byte surrounding text.)
Resolving this bug as NOTGNOME. Given that the XIM properly converts the multi-byte surrounding text based on current locale, this problem should be gone.