GNOME Bugzilla – Bug 120219
gnumeric --help shows a mixture of UTF-8 and non-UTF-8 text.
Last modified: 2004-12-22 21:47:04 UTC
With LANG=ja_JP.eucJP, 'gnumeric --help' prints UTF-8 strings for the Gnumeric part and EUC-JP strings for the Bonobo/GTK+/GNOME-lib part, so the output looks scrambled in any text viewer. With LANG=ja_JP.UTF-8, both the Gnumeric strings and the other parts are displayed correctly. So I assume Gnumeric needs some code conversion depending on the charset of the current locale. It looks like #120124, but I can't find where it should be fixed. Another GNOME app with translated --help option strings, if one exists, might help...
Could you tell me a sample English text that is, if I understand the issue correctly, first translated into Japanese/UTF-8 and then printed using printf (or equivalent)?
I don't see your point (maybe because my first explanation was not good... ;-( ). I'm attaching 2 screenshots for easier understanding. Both show the output of 'gnumeric --help', one with LANG=ja_JP.UTF-8 and the other with LANG=ja_JP.eucJP.
Created attachment 19375 [details] LANG=ja_JP.eucJP : Gnumeric part ('Application options') is unreadable while GTK+ part shows sane Japanese
Created attachment 19376 [details] LANG=ja_JP.UTF-8 : Gnumeric part shows sane Japanese texts too.
Created attachment 19377 [details] Sample test case for discussion
The small test case above (main.c) simulates the behavior of 'gnumeric --help'. I found an undocumented popt option, 'POPT_ARG_INTL_DOMAIN', in libgnome-2.2.0.1/libgnome/gnome-init.c. Adding a POPT_ARG_INTL_DOMAIN line fixes this test case, but not Gnumeric.
Ah, I've found that just deleting the bind_textdomain_codeset() line in libgnumeric.c solves this problem. POPT_ARG_INTL_DOMAIN doesn't seem to have any effect. But the bind_textdomain_codeset() line is said to be something we always need for GNOME2. Hmm...
I've found a fix. The diff is attached. gettext picks the charset for its output in this order (see loadmsgcat.c, localcharset.c):
1. the codeset specified with bind_textdomain_codeset()
2. the OUTPUT_CHARSET environment variable
3. nl_langinfo(CODESET), or a charset alias guessed from the locale name
We need bind_textdomain_codeset(PACKAGE, "UTF-8") to pin the gettext output encoding (which is the input encoding for GTK+2) to UTF-8. But that is only right for the GTK+2 GUI, not for terminal messages. We should not pin the output encoding to UTF-8 before the popt argument parsing, so that gettext can choose it naturally from nl_langinfo() or the current locale name, which is also the encoding the user's terminal can display correctly. The patch delays bind_textdomain_codeset() until argument parsing has finished. (setlocale(), bindtextdomain() and textdomain() should still be called before the argument parsing.)
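To make the ordering concrete, here is a minimal sketch of the two-phase setup described above. It is not the attached patch: the function names (init_terminal_i18n, codeset_is_pinned, force_utf8_for_gui) are hypothetical, and it assumes a libintl that provides bind_textdomain_codeset() (glibc or GNU gettext).

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Phase 1, before argument parsing: select the locale and the message
 * catalog, but leave the output codeset unset. gettext then converts
 * messages to the locale's charset (e.g. EUC-JP for ja_JP.eucJP), which
 * is what the terminal is expected to display. Hypothetical helper. */
int init_terminal_i18n(const char *package, const char *localedir)
{
    /* A NULL result means the requested locale is unsupported; the
     * process then simply stays in the "C" locale. */
    (void) setlocale(LC_ALL, "");

    if (bindtextdomain(package, localedir) == NULL)
        return -1;
    if (textdomain(package) == NULL)
        return -1;
    return 0;
}

/* Returns 1 if an output codeset has been pinned for the domain, else 0.
 * Passing NULL as the codeset only queries the current setting. */
int codeset_is_pinned(const char *package)
{
    return bind_textdomain_codeset(package, NULL) != NULL;
}

/* Phase 2, after argument parsing (and after any --help output):
 * pin the output codeset to UTF-8, which is what GTK+2 expects. */
int force_utf8_for_gui(const char *package)
{
    return bind_textdomain_codeset(package, "UTF-8") != NULL ? 0 : -1;
}
```

A caller would run init_terminal_i18n() first, then the popt argument parsing (which may print --help and exit), and only then force_utf8_for_gui() before any widgets are created.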
Created attachment 19401 [details] [review] The patch to delay bind_textdomain_codeset() until argument parsing has finished.
I had a feeling it would be something like this. I've committed a variant of the patch that will work for the bonobo component too.