Bug 484653 – imxim string conversion callback should pass wide-char surrounding




Summary:	imxim string conversion callback should pass wide-char surrounding


Status:	RESOLVED NOTGNOME

Product:	gtk+
Classification:	Platform
Component:	Input Methods
Version:	2.12.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Hidetoshi Tajima
QA Contact:	gtk-bugs

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2007-10-08 09:54 UTC by Theppitak Karoonboonyanan
Modified:	2009-06-05 13:28 UTC

See Also:
GNOME target:	---
GNOME version:	2.19/2.20

Attachments
patch passing surrounding text as wide-char (2.04 KB, patch) 2007-10-08 09:56 UTC, Theppitak Karoonboonyanan
Patch using utf-16 for 2-byte wchar_t (2.42 KB, patch) 2007-10-10 02:43 UTC, Theppitak Karoonboonyanan

Description Theppitak Karoonboonyanan 2007-10-08 09:54:22 UTC

The string conversion callback in imxim (see Bug #101814 for the initial patch) currently passes multi-byte surrounding text back to XIM. This is ambiguous, as the text can be in any encoding. Unfortunately, Thai XIM in xorg assumes it to be TIS-620, but the g_locale_from_utf8() conversion produces UTF-8 in UTF-8 locales. As a result, Thai XIM won't work properly with GTK+ apps in the th_TH.UTF-8 locale, for example.

It should be safer to pass it as wide-char instead.
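To illustrate the ambiguity described above, here is a minimal sketch of what happens when UTF-8 bytes are read as TIS-620. It is not code from imxim or libX11; the function names are hypothetical and exist only for this illustration. It relies on the TIS-620 layout, in which bytes 0xA1..0xDA and 0xDF..0xFB map to U+0E01..U+0E3A and U+0E3F..U+0E5B.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one TIS-620 byte to a Unicode code point.
 * Returns U+FFFD for bytes that TIS-620 leaves unassigned. */
uint32_t tis620_to_unicode(unsigned char b)
{
    if (b < 0x80)
        return b;                       /* ASCII range is shared */
    if ((b >= 0xA1 && b <= 0xDA) || (b >= 0xDF && b <= 0xFB))
        return 0x0E00u + (b - 0xA0u);   /* Thai block */
    return 0xFFFDu;                     /* unassigned byte */
}

/* Hypothetical helper: decode a buffer as if it were TIS-620.
 * Writes one code point per input byte and returns the count. */
size_t decode_as_tis620(const unsigned char *in, size_t len, uint32_t *out)
{
    for (size_t i = 0; i < len; i++)
        out[i] = tis620_to_unicode(in[i]);
    return len;
}
```

In TIS-620, the single byte 0xA1 decodes to U+0E01 (THAI CHARACTER KO KAI). The same character in UTF-8 is the three bytes E0 B8 81; passing those through decode_as_tis620() yields U+0E40, U+0E18 and U+FFFD, so a consumer that assumes TIS-620 sees three wrong characters instead of one.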

Comment 1 Theppitak Karoonboonyanan 2007-10-08 09:56:58 UTC

Created attachment 96868
patch passing surrounding text as wide-char

Comment 2 Tor Lillqvist 2007-10-08 10:47:07 UTC

Does that patch assume that wchar_t is four bytes? Is that true for all compilers/platforms involved? (It isn't true on Windows, but that is of course not relevant for XIM stuff.)

Comment 3 Theppitak Karoonboonyanan 2007-10-08 11:07:58 UTC

(In reply to comment #2)
> Does that patch assume that wchar_t is four bytes?

I don't think so. For example, I've avoided memcpy() and used element-wise copy instead. And sizeof() is also used when calculating the buffer size. The rest is based on the X protocol headers.

Please correct me if I'm still missing something.
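The approach described in comment 3 can be sketched roughly as follows. This is not the attached patch; the function name is hypothetical, and plain uint32_t stands in for GLib's gunichar so that the sketch compiles without GLib.

```c
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: copy len UCS-4 code points into a newly allocated,
 * NUL-terminated wchar_t buffer. Returns NULL on allocation failure;
 * the caller frees the result with free(). */
wchar_t *ucs4_to_wchar_copy(const uint32_t *src, size_t len)
{
    /* Buffer size is derived from sizeof(wchar_t), not a hard-coded 4. */
    wchar_t *dst = malloc((len + 1) * sizeof(wchar_t));
    if (dst == NULL)
        return NULL;

    /* Element-wise assignment converts each value to wchar_t. A memcpy()
     * would only be correct if wchar_t and uint32_t had the same size. */
    for (size_t i = 0; i < len; i++)
        dst[i] = (wchar_t)src[i];

    dst[len] = L'\0';
    return dst;
}
```

Element-wise copy keeps the memory layout correct for any wchar_t size, but it does not by itself handle code points that do not fit into a 2-byte wchar_t.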

Comment 4 Tor Lillqvist 2007-10-08 20:47:46 UTC

I think the code won't work properly for non-BMP characters in case wchar_t is two bytes (and wchar_t strings in XIM are supposed to be in UTF-16).

Are there such platforms/XIM implementations? If yes, your code probably needs some ifdefs: in case sizeof(wchar_t)==2, you should call g_utf8_to_utf16() instead of g_utf8_to_ucs4(), and "text" should then be wchar_t* instead of gunichar*.
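The size-dependent conversion suggested in comment 4 might look roughly like the sketch below. It is a GLib-free illustration, not the attached patch: the hand-written decoder stands in for g_utf8_to_utf16() / g_utf8_to_ucs4(), both function names are hypothetical, and the input is assumed to be well-formed, NUL-terminated UTF-8.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s.
 * Assumes well-formed input. Stores the code point in *cp and
 * returns the number of bytes consumed. */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp)
{
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {
        *cp = ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *cp = ((uint32_t)(s[0] & 0x0F) << 12)
            | ((uint32_t)(s[1] & 0x3F) << 6)
            |  (uint32_t)(s[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t)(s[0] & 0x07) << 18)
        | ((uint32_t)(s[1] & 0x3F) << 12)
        | ((uint32_t)(s[2] & 0x3F) << 6)
        |  (uint32_t)(s[3] & 0x3F);
    return 4;
}

/* Hypothetical helper: convert UTF-8 to a NUL-terminated wchar_t string.
 * out must have room for strlen(in) + 1 elements. Returns the number of
 * wchar_t units written, not counting the terminator. */
size_t utf8_to_wchar(const char *in, wchar_t *out)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t n = 0;

    while (*s != 0) {
        uint32_t cp;
        s += utf8_decode(s, &cp);

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            /* 2-byte wchar_t: emit a UTF-16 surrogate pair. */
            cp -= 0x10000;
            out[n++] = (wchar_t)(0xD800 + (cp >> 10));
            out[n++] = (wchar_t)(0xDC00 + (cp & 0x3FF));
        } else {
            /* 4-byte wchar_t, or a BMP character: one unit suffices. */
            out[n++] = (wchar_t)cp;
        }
    }
    out[n] = L'\0';
    return n;
}
```

Thai text lies entirely in the BMP, so both branches produce the same units for it; the difference only shows for characters above U+FFFF.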

Comment 5 Theppitak Karoonboonyanan 2007-10-09 04:40:11 UTC

AFAIK, Thai XIM in libX11 is the only implementation that uses this callback. But there can certainly be other XIM servers implemented as separate processes.

So I now wonder which encoding an XIM server implementor should expect for the wide-char string from the client: UCS-4 or UTF-16. Note that the XIM server can be across the network, so its wchar_t can differ from that of the machine the client is running on.

For the particular case of Thai XIM, though, the server and client are fortunately in the same process, so a differing wchar_t size is not a problem.

Comment 6 Theppitak Karoonboonyanan 2007-10-10 02:43:13 UTC

Created attachment 96976
Patch using utf-16 for 2-byte wchar_t

Ignoring the protocol question, this should work for Thai XIM.

Comment 7 Theppitak Karoonboonyanan 2007-10-10 02:52:31 UTC

Alternatively, I have also filed a bug against xorg to make Thai XIM convert the multi-byte text based on the locale before using it:

  https://bugs.freedesktop.org/show_bug.cgi?id=12759

(Other toolkits besides GTK+ probably still pass multi-byte surrounding text.)

Comment 8 Theppitak Karoonboonyanan 2009-06-05 13:28:32 UTC

Resolving this bug as NOTGNOME. Given that the XIM properly converts the multi-byte surrounding text based on the current locale, this problem should be gone.