GNOME Bugzilla – Bug 721120
GtkIMContextSimple needs locale compose tables similar with X11.
Last modified: 2015-10-13 02:51:16 UTC
When user uses GTK+ applications with LC_CTYPE, they expects the locale compose tables are available in GTK+ too, which are defined in /usr/share/X11/locale/*/Compose. E.g. dead_grave + space gives U+1FEF on el_GR.UTF-8. I'd think to call gtk_im_context_simple_add_table by locale.
Created attachment 264925 [details] [review] Patch for gtkimcontextsimple.c The patch generates gtkimcontextcomposeloc.h by gencomposetable which converts /usr/share/X11/locale/$locale/Compose files to GtkComposeTable and calls gtk_im_context_simple_add_table() in gtk_im_context_simple_init() if LC_CTYPE matches el_gr, fi_fi or pt_br.
Created attachment 266426 [details] [review] Patch for gtkimcontextsimple.c A minor update is applied.
Created attachment 267917 [details] [review] Patch for gtkimcontextsimple.c Merged the recent update of libX11/fi_FI: http://cgit.freedesktop.org/xorg/lib/libX11/commit/?id=20fdccd8
Created attachment 278661 [details] [review] Patch for gtkimcontextsimple.c Merged the recent update of libX11/pt_BR: http://cgit.freedesktop.org/xorg/lib/libX11/commit/?id=e3dc0d1733
Created attachment 289628 [details] [review] Patch for gtkimcontextsimple.c Updated the patch with HEAD.
Review of attachment 289628 [details] [review]: I am principle ok with having some extra tables like this. But I think the locale matching must be a bit more flexible. Shouldn't the finnish additions used for any fi_xx locale, e.g ? Maybe use g_get_language_names ?
Created attachment 290007 [details] [review] Patch for gtkimcontextsimple.c I added g_get_language_names(). However the API uses LC_MESSAGES. I think LC_CTYPE is better for compose keys. Probably users could run 'env LC_ALL=fi_FI.UTF-8 LC_MESSAGES=en_US.UTF-8 gedit' until glib handles LC_CTYPE.
Created attachment 290399 [details] [review] Patch for gtkimcontextsimple.c Revised the patch.
(In reply to Takao Fujiwara from comment #7) > > However the API uses LC_MESSAGES. I think LC_CTYPE is better for compose > keys. Agreed, LC_CTYPE defines character classes and such which relate to compose keys more than LC_MESSAGES. > Probably users could run 'env LC_ALL=fi_FI.UTF-8 LC_MESSAGES=en_US.UTF-8 > gedit' until glib handles LC_CTYPE. I think you meant 'env LANG=fi_FI.UTF-8 LC_MESSAGES=en_US.UTF-8 gedit'. Looks like the patch is ready to be applied? Thanks.
Recently I created another patch to load /usr/share/X11/locale/$LOCALE/Compose file and $HOME/.XCompose dynamically instead of having the static data table so I will update the patch later.
Created attachment 308724 [details] [review] Patch for gtkimcontextsimple.c Updated the patch to load /usr/share/X11/locale/$LOCALE/Compose dynamically.
Review of attachment 308724 [details] [review]: Thanks for your patch. I've gone over it for a quick review, but I have yet to test it… ::: gtk/gtkcomposetable.c @@ +28,3 @@ +#include "gtkimcontextsimple.h" + +int MAX_COMPOSE_LEN = 0; Use static, here. Also, global variables do not need to be uppercased. @@ +29,3 @@ + +int MAX_COMPOSE_LEN = 0; +int N_INDEX_STRIDE = 0; Same as above: use static, and do not uppercase. As a general thing: do these need to be global variables at all? Does not seem like, to me. @@ +34,3 @@ +is_codepoint (const gchar *str) +{ + gboolean retval = (strlen (str) > 1 && *str == 'U'); This could really be turned into: if (str[0] == '\0' || str[0] != 'U') return FALSE; @@ +44,3 @@ + if (g_ascii_isxdigit (str[i])) + continue; + else Flip the condition check and drop the `else`, i.e.: if (!g_ascii_isxdigit (str[i])) return FALSE; @@ +66,3 @@ + +static gboolean +parse_compose_value (GArray *array, const gchar *val, GError **error) Coding style: line breaks after each argument. @@ +86,3 @@ + if (uch == 0) + { + g_set_error (error, Since all these functions are just used locally, there's really no need to employ GError here. Just use a `g_warning()` instead. @@ +109,3 @@ + if (g_utf8_get_char (g_utf8_next_char (words[1])) > 0) + { + g_set_error (error, Same as above: do not use GError. @@ +133,3 @@ + +static gboolean +parse_compose_sequence (GArray *array, const gchar *seq, GError **error) Coding style: line breaks after each argument. @@ +161,3 @@ + if (start == NULL || end == NULL || end <= start) + { + g_set_error (error, Same as above: use `g_warning()` instead of GError. @@ +189,3 @@ + if (0 == n || n >= GTK_MAX_COMPOSE_LEN) + { + g_set_error (error, Same as above: use `g_warning()` instead of GError. @@ +236,3 @@ + { + g_warning ("%s: %s", error->message, line); + g_clear_error (&error); Instead of duplicating the clean up path, you can use a `goto out` here and in the error path of `parse_compose_value()` as well. @@ +268,3 @@ + if (!g_file_get_contents (compose_file, &contents, &length, &error)) + { + g_error ("%s", error->message); g_error() will call abort(). Pretty sure we don't want that. @@ +358,3 @@ + sizeof (guint16) * row_stride, + compare_seq); + // g_debug ("seq = %p", seq); Coding style: stray debug line, and C99 comment. @@ +360,3 @@ + // g_debug ("seq = %p", seq); + + if (!seq) You should flip the if condition and drop the various else branches: if (seq) { gunichar value = … return compose_buffer[n_compose + 1] == value } return FALSE; @@ +362,3 @@ + if (!seq) + { + // g_debug ("no\n"); Coding style: stray debug line, and C99 comment. @@ +368,3 @@ + { + gunichar value = seq[row_stride - 1]; + // g_debug ("U+%04X\n", value); Coding style: stray debug line, and C99 comment. @@ +377,3 @@ + +static gboolean +check_normalize_nfc (gunichar* combination_buffer, gint n_compose) Coding style: line break after each argument. @@ +393,3 @@ + if (combination_buffer[0] >= 0x390 && combination_buffer[0] <= 0x3FF) + { + for (i = 1; i < n_compose; i++ ) There are a lot of magic constants here; can they be macroised? Or, at least, could you leave a comment as to what they mean or where they come from? @@ +413,3 @@ + memcpy (combination_buffer, + combination_buffer_temp, + GTK_MAX_COMPOSE_LEN * sizeof (gunichar) ); Coding style: stray space before the closing parenthesis. @@ +424,3 @@ + g_free (nfc_temp); + + if (n_compose > 2) Swap the condition and drop the else branch: if (n_compose <= 2) break; temp_swap = … @@ +428,3 @@ + temp_swap = combination_buffer_temp[i % (n_compose - 1) + 1]; + combination_buffer_temp[i % (n_compose - 1) + 1] = + combination_buffer_temp[(i+1) % (n_compose - 1) + 1]; Coding style: missing spaces between operands "(i + 1)". @@ +429,3 @@ + combination_buffer_temp[i % (n_compose - 1) + 1] = + combination_buffer_temp[(i+1) % (n_compose - 1) + 1]; + combination_buffer_temp[(i+1) % (n_compose - 1) + 1] = temp_swap; Coding style: missing spaced between operands "(i + 1)". @@ +568,3 @@ + >k_compose_table_compact, + n_compose)) + removed_list = g_list_append (removed_list, array); Coding style: use curly braces for the if and else if blocks. Also, g_list_append() is algorithmically expensive; since ordering in the removed_list is not influential, use g_list_prepend() instead. @@ +571,3 @@ + + else if (check_algorithmically (keysyms, n_compose)) + removed_list = g_list_append (removed_list, array); Same as above, use curly braces for the block, and use g_list_prepend() instead of g_list_append(). @@ +576,3 @@ + for (list = removed_list; list != NULL; list = list->next) + { + GArray *array = (GArray *) list->data; Coding style: no need to cast list->data. @@ +594,3 @@ + for (list = compose_list; list != NULL; list = list->next) + { + GArray *array = (GArray *) list->data; Coding style: no need to cast list->data. @@ +607,3 @@ + if (codepoint > 0xffff) + { + removed_list = g_list_append (removed_list, array); Use g_list_prepend() instead of g_list_append(). @@ +731,3 @@ + GList *list; + int i; + int TOTAL_SIZE = 0; Coding style: lower case TOTAL_SIZE. ::: gtk/gtkcomposetable.h @@ +20,3 @@ + + +#if !defined (__GTK_H_INSIDE__) && !defined (GTK_COMPILATION) This is a private header, so it does not need single inclusion guards. ::: gtk/gtkimcontextsimple.c @@ +40,1 @@ +#define X11_DATADIR "/usr/share/X11/locale" This should likely be: #define X11_DATADIR PREFIX "/share/X11/locale" @@ +158,3 @@ { im_context_simple->priv = gtk_im_context_simple_get_instance_private (im_context_simple); + g_idle_add (init_compose_table, im_context_simple); This requires the main loop to be spinning. I'm not entirely sure we want to do this. You should use a GTask, instead, and thread off the compose table loading. @@ +1260,3 @@ } + +/** Please, remove the double asterisk: this is not a public API. ::: gtk/gtkimcontextsimple.h @@ +70,3 @@ gint max_seq_len, gint n_seqs); +gboolean gtk_im_context_simple_add_compose_file Coding style: do not break lines like this. Please, place the arguments on the same line as the function name.
Created attachment 308778 [details] [review] Patch for gtkimcontextsimple.c I updated the patch.
*** Bug 155010 has been marked as a duplicate of this bug. ***
Some high-level comments: 1) Looking through /usr/share/X11/locale, I wonder if all this work is worth it - most of the Compose files are empty. And those which aren't are mostly small additions on top of the en_US one. 2) If we want to treat this as an X11 compat thing, and look in all the same locations as libX11 and even use XCOMPOSEFILE, it should only be done when we are running under X11, not under other platforms. If we want to treat this as a true GTK+ feature, we should have our own location for a file with user overrides, in ~/.config/gtk-3. 3) This is a lot of work for just a few compose sequences. We should do it only once, and reuse the table, not once per entry / im context. Looking at the code: 4) Storing codepoints as strings and repeatedly calling get_codepoint() seems really sad - I would recommend to do the parsing once, do it early, and store the codepoints 5) I don't like how this copies a bunch of functionality from gtkimcontextsimple.c (check_algorithmically, check_normalize_nfc, etc) - we should make this private api and only have it implemented once. 6) There is a bunch of debug code in there, which should be removed or at least put in #ifdef DEBUG. Here is an alternative idea: maybe we want to cache the translated tables on disk and just memmap the ready-made compact table, in the same format as the one we currently compile into libgtk ?
(In reply to Matthias Clasen from comment #15) > 1) Looking through /usr/share/X11/locale, I wonder if all this work > is worth it - most of the Compose files are empty. And those which aren't > are mostly small additions on top of the en_US one. I'm thinking the following change. const gchar * const sys_langs[] = { "el_gr", "fi_fi", "pt_br", NULL }; const gchar * const *sys_lang = NULL; for (lang = langs; *lang; lang++) { if (g_str_has_prefix (*lang, "en_US")) break; if (**lang == 'C') break; for (sys_lang = sys_langs; *sys_lang; sys_lang++) { if (g_ascii_strncasecmp (*lang, *sys_lang , strlen (*sys_lang)) == 0) { path = g_build_filename (X11_DATADIR, *lang, "Compose", NULL); break; } } if (path == NULL) continue; if (g_file_test (path, G_FILE_TEST_EXISTS)) break; g_free (path); path = NULL; } > > 2) If we want to treat this as an X11 compat thing, and look in all the > same locations as libX11 and even use XCOMPOSEFILE, it should only > be done when we are running under X11, not under other platforms. > > If we want to treat this as a true GTK+ feature, we should have our > own location for a file with user overrides, in ~/.config/gtk-3. I tested GNOME Wayland and it seems /usr/X11/locale/$locale/Compose and $HOME/.XCompose files are loaded. So if the GDK backend is X11 or Wayland, the X11 compat thing can be used? > > 3) This is a lot of work for just a few compose sequences. We should do it > only once, and reuse the table, not once per entry / im context. My understanding is you suggest to use gtk_im_context_simple_class_init() instead of gtk_im_context_simple_init(). > > Here is an alternative idea: maybe we want to cache the translated tables on > disk and just memmap the ready-made compact table, in the same format as the > one we currently compile into libgtk ? Thanks for the various suggestions. - If the cache does not exist, the cache is saved. - If the compose table is later updated than the cache, the cache is updated.
Created attachment 309455 [details] [review] Patch for gtkimcontextsimple.c Updated the patch.
Created attachment 309474 [details] [review] Patch for gtkimcontextsimple.c Fixed some typo.
Created attachment 309799 [details] [review] Patch for gtkimcontextsimple.c Added a magic number check.
This is getting closer. Here are a few more hi-level comments for the general direction of the work: 1) The static counter of im contexts looks ugly to me. I would suggest to just not bother with cleaning up the tables - once they are loaded, just keep them around forever. 2) I think it would be nicer to move the caching of tables to gtkcomposetable.c, and add a call like GtkComposeTable *gtk_compose_table_for_file (const char *compose_file) 3) For the caching, I think it would be nice to have a solution that can detect when the original changed. One approach would be content-addressing, using e.g. a sha128 of the content as filename for the cache. 4) It would be nice to break this up into multiple commits: - export functions you need from gtkimcontextsimple.c - add gtkcomposetable functionality - hook composetable stuff into gtkimcontextsimple.c
Review of attachment 309799 [details] [review]: ::: gtk/gtkcomposetable.c @@ +37,3 @@ + gboolean *compose_finish, + gboolean *compose_match, + gunichar *output_char); I would put those in a private header, say gtkimcontextsimple.h ::: gtk/gtkimcontextsimple.c @@ +217,1 @@ } Not the right ay to check for a display. better would be GDK_IS_WAYLAND_DISPLAY (display) || GDK_IS_X11_DISPLAY (display) But even so, we really should be checking the display of the widget that the im context will belong to. @@ +294,1 @@ const guint16 *seq = value; All the guint -> guint16 changes should probably be in a separate preparatory commit @@ +516,3 @@ + gboolean *compose_finish, + gboolean *compose_match, + gunichar *output_char) Looking at the function name, I would put the table first in the parameter list, and treat this more like a method of GtkComposeTableCompact
Created attachment 310171 [details] [review] Patch to export _gtk_check_compact_table() Added export functions in gtkimcontextsimple.c
Created attachment 310172 [details] [review] Change uint to uint16 Replaced uint with uint16.
Created attachment 310173 [details] [review] Patch to add gtkcomposetable.c Added gtk_compose_table_new_with_file() and gtk_compose_table_list_add_file().
Created attachment 310174 [details] [review] Patch to use GtkCompose APIs in GtkIMContextSimple Connect GtkIMContextSimple to GtkCompose.
(In reply to Matthias Clasen from comment #20) > 3) For the caching, I think it would be nice to have a solution that can > detect when the original changed. One approach would be content-addressing, > using e.g. a sha128 of the content as filename for the cache. I'm not sure if I understand the suggestion correctly. I use both the compose file name and the compose data as the cache file name. I found nettle-devel provides sha1.h and sha2.h I use g_str_hash() at the moment. Which option is better? I use GFileMonitor for the monitoring.
Review of attachment 310171 [details] [review]: ::: gtk/gtkimcontextsimpleprivate.h @@ +52,3 @@ + 30, + 6 +}; I don't think the entire struct should be defined in the header. This should just be extern const GtkComposeTableCompact gtk_compose_table_compact; With the actual definition staying in gtkimcontextsimple.c
Review of attachment 310172 [details] [review]: ok
Review of attachment 310173 [details] [review]: ::: gtk/gtkcomposetable.c @@ +953,3 @@ + +monitor: + compose_file_monitor (compose_file, compose_table); I don't like to introduce file monitoring for config files here. We don't do that for other config files, either. And having a directory monitor on $HOME in every gtk application is going to be a performance drag. Lets drop the file monitoring from this, at least for now.
Review of attachment 310174 [details] [review]: ::: gtk/gtkimcontextsimple.c @@ +19,3 @@ +#include <gdk/gdkx.h> +#include <wayland/gdkwayland.h> These need to be wrapped in #ifdef GDK_WINDOWING_X11 and #ifdef GDK_WINDOWING_WAYLAND, respectively. @@ +168,3 @@ + gtk_im_context_simple_add_compose_file (im_context_simple, path); + g_free (path); + path = NULL; I would prefer to not read an X environment variable here. Instead, we should maybe have a non-legacy location for a compose file in ~/.config/gtk-3.0/Compose. @@ +1330,3 @@ + if (GDK_IS_WAYLAND_DISPLAY (display) || GDK_IS_X11_DISPLAY (display)) + init_compose_table_async (im_context_simple, NULL, NULL, NULL); +} This needs to be wrapped in #ifdef GDK_WINDOWING_X11 and #ifdef GDK_WINDOWING_WAYLAND, respectively
Ideally, we would document the fact somewhere that these files are read by GTK+, but I don't want to drag this out too much more, lets worry about documentation later.
Created attachment 310564 [details] [review] Patch to export _gtk_check_compact_table() Added export functions in gtkimcontextsimple.c
Created attachment 310565 [details] [review] Change uint to uint16 Replaced uint with uint16.
Created attachment 310566 [details] [review] Patch to add gtkcomposetable.c Added gtk_compose_table_new_with_file() and gtk_compose_table_list_add_file().
Created attachment 310567 [details] [review] Patch to use GtkCompose APIs in GtkIMContextSimple Connect GtkIMContextSimple to GtkCompose.
Created attachment 310568 [details] [review] Patch to use GtkCompose APIs in GtkIMContextSimple Connect GtkIMContextSimple to GtkCompose.
I updated the patches.
Review of attachment 310564 [details] [review]: Looks generally good. ::: gtk/gtkimcontextsimpleprivate.h @@ +41,3 @@ + * In future versions it will be just the keysym (no +1). + */ +#define IS_DEAD_KEY(k) \ Should probably namespace this macro, e.g. GDK_KEY_IS_DEAD_KEY, to avoid potential collisions. @@ +44,3 @@ + ((k) >= GDK_KEY_dead_grave && (k) <= (GDK_KEY_dead_dasia+1)) + +gboolean _gtk_check_algorithmically (const guint *compose_buffer, There's no need to use an underscore to signal a private function: all symbols are automatically private, and public symbols must be annotated. @@ +47,3 @@ + gint n_compose, + gunichar *output); +gboolean _gtk_check_compact_table (const GtkComposeTableCompact *table, Same as above.
Created attachment 310571 [details] [review] Patch to export _gtk_check_compact_table() Added export functions in gtkimcontextsimple.c
Created attachment 310572 [details] [review] Change uint to uint16 Replaced uint with uint16.
Created attachment 310573 [details] [review] Patch to add gtkcomposetable.c Added gtk_compose_table_new_with_file() and gtk_compose_table_list_add_file().
Created attachment 310574 [details] [review] Patch to use GtkCompose APIs in GtkIMContextSimple Connect GtkIMContextSimple to GtkCompose.
(In reply to Emmanuele Bassi (:ebassi) from comment #38) > Review of attachment 310564 [details] [review] [review]: > ::: gtk/gtkimcontextsimpleprivate.h > @@ +41,3 @@ > + * In future versions it will be just the keysym (no +1). > + */ > +#define IS_DEAD_KEY(k) \ > > Should probably namespace this macro, e.g. GDK_KEY_IS_DEAD_KEY, to avoid > potential collisions. I noticed IS_DEAD_KEY() does not have to be exported now and deleted lines in my patch.
(In reply to Matthias Clasen from comment #29) > Review of attachment 310173 [details] [review] [review]: > > ::: gtk/gtkcomposetable.c > @@ +953,3 @@ > + > +monitor: > + compose_file_monitor (compose_file, compose_table); > > I don't like to introduce file monitoring for config files here. We don't do > that for other config files, either. And having a directory monitor on $HOME > in every gtk application is going to be a performance drag. Lets drop the > file monitoring from this, at least for now. Ah, and now I realize how this came to be. When I said I want to note changes to the cache files, I just meant that it would be nice to not get stuck with stale cache files, not runtime monitoring. The stale cache problem is already taken care of by your timestamp comparison.
(In reply to Matthias Clasen from comment #44) > Ah, and now I realize how this came to be. When I said I want to note > changes to the cache files, I just meant that it would be nice to not get > stuck with stale cache files, not runtime monitoring. The stale cache > problem is already taken care of by your timestamp comparison. Yeah, that's why I added GFileMonitor. Currently I prepend /usr/share/X11/$locale/Compose, $HOME/.XCompose and $HOME/.config/gtk-3.0/Compose. But it seems xterm loads only one of them when xterm detects the first one. I'm not clarified all or one from the man Compose(5). Probably I should follow X11.
Created attachment 310783 [details] [review] Patch to add gtkcomposetable.c Some static method names are changed.
Created attachment 310784 [details] [review] Patch to use GtkCompose APIs in GtkIMContextSimple g-i-c-s returns if user compose file is found.
Pushed with some small fixups and added documentation, thanks.
Thank you for your code reviews.