GNOME Bugzilla – Bug 343415
GtkCalendar should use LC_MESSAGES for month/weekday labels
Last modified: 2014-12-24 00:27:57 UTC
Consider a case where LC_TIME is different from LC_MESSAGES. Most of the labels the user can see are following the preferred language, except for the month name and weekday abbreviations in GtkCalendar. This makes the UI inconsistent and might cause some confusion. As the calendar is not displaying an actual date in the "%d.%m.%y" sense, but rather the strings are used more as generic labels / column headers, they should be following the same localisation as any other label used in similar fashion, that is LC_MESSAGES.
Created attachment 66478 [details] [review] proposed patch
hm, the argumentation sounds somewhat reasonable to me. albeit, i'm wondering if it would make sense to still spare our translators the additional burden, i.e. by doing this in class_init: SAVED=LC_TIME; LC_TIME=LC_MESSAGES; get_month_names(); LC_TIME=SAVED; granted, it's a bit hackish and i think can't be made thread safe, but spares us 2088 translations (87 languages * 12 month names).
My initial thought was to get the localisation from iso-codes but it doesn't have month names etc. :-/ Rather than messing with catalogs, I suppose one could check whether gettext() gives back the same pointer and fall back to strftime if it does. That is, assuming that lacking the localisation gettext would return the same pointer and not strdup it.
I agree with Tim. Making our translators translate stuff thats already translated in libc is bad. How about using nl_langinfo to get the month and day names ?
Wrong catalog. strftime (at least in glibc) uses nl_langinfo which uses LC_TIME
but using nl_langinfo might still be less hacky than using strftime to obtain the names, even if we have to temporarily frob LC_TIME
Matthias, well the hackiness is the temporary tweaking of LC_TIME. and the reason that's hacky is because doing so is not thread-safe. if libs doesn't allow us to access LC_TIME month names by giving it the LC_MESSAGES locale (e.g. if it took the LC_TIME value as an argument to a function), we won't be able to get around that hackery. beyond that, the current code wouldn't have to change actually...
hmm, glibc has strftime_l, it seems, to work around threadsafety issues.
And there's also nl_langinfo_l (to skip the seconds magic and use ABDAY_1 and such)
hm, have no manual pages for those (yet). do you have online pointers?
google gives you a bunch of online man pages (mostly from darwin, curiously)
Created attachment 66620 [details] nl_langinfo_l example I suppose this should work, provided that using GNU and other extensions is acceptable.
(In reply to comment #12) > Created an attachment (id=66620) [edit] > nl_langinfo_l example ok, for the record, attachment #66620 [details] prints: $ LC_MESSAGES=C ./a.out Sun Mon Tue Wed Thu Fri Sat $ LC_MESSAGES=de_DE ./a.out So Mo Di Mi Do Fr Sa $ LC_TIME=C ./a.out Sun Mon Tue Wed Thu Fri Sat $ LC_TIME=de_DE ./a.out Sun Mon Tue Wed Thu Fri Sat i.e. what we need. > I suppose this should work, provided that using GNU and other extensions is > acceptable. sure, we just need a configure check for nl_langinfo_l() and can then use the normal LC_TIME dependent nl_langinfo() as a fallback.
Created attachment 66915 [details] [review] refactoring a bit
Created attachment 66916 [details] [review] Replace strftime() use with nl_langinfo()
Created attachment 66917 [details] [review] use nl_langinfo_l if available
I attached a set of patches that takes small steps towards the final goal. Some open issues: 1. Is langinfo.h standard enough to be #included unconditionally? 2. How to handle the necessary #define/#includes to get nl_langinfo_l()? - for glibc one needs #define _GNU_SOURCE (which also must be defined before any of the system includes - is it something that can be put in config.h?) and #include <langinfo.h> - but for darwin (I think) it was #include <xlocale.h>
(In reply to comment #15) > Created an attachment (id=66916) [edit] > Replace strftime() use with nl_langinfo() hm... Tommi i'm not sure nl_langinfo() is as portable as strftime, so it might be better to keep that as a nl_langinfo_l() replacement. and we should get Tor to review the changes as well to make sure the W32 stuff stays ok, i'll add him to CC:.
(In reply to comment #17) > I attached a set of patches that takes small steps towards the final goal. > > Some open issues: > > 1. Is langinfo.h standard enough to be #included unconditionally? we could simply do it and fix up things if reports come back in. we *could* also do this with using nl_langinfo() instead of strftime(). that might just mean a wee bit more work in the future for us to reintroduce compat code if it still is required. ;) > 2. How to handle the necessary #define/#includes to get nl_langinfo_l()? > - for glibc one needs #define _GNU_SOURCE (which also must be defined before > any of the system includes simply put #define _GNU_SOURCE // needed for nl_langinfo_l in the line preceeding the first #include statement of the .c file. > - is it something that can be put in config.h?) and > #include <langinfo.h> > - but for darwin (I think) it was #include <xlocale.h> ah! ok, then we shouldn't unconditionally include langinfo.h but also need header file detection with conditional inclusion. can you please attach a comprehensive patch once you're done for final review?
e.g. in gtkpapersize.c, we do #if defined(HAVE__NL_PAPER_HEIGHT) && defined(HAVE__NL_PAPER_WIDTH) #include <langinfo.h> #endif > #define _GNU_SOURCE // needed for nl_langinfo_l Make that #define _GNU_SOURCE /* needed for nl_langinfo_l */
Created attachment 66970 [details] [review] check and use nl_langinfo_l if available To be applied after attachment 66915 [details] [review]
g_locale_to_utf8 (nl_langinfo_l (ABDAY_1 + i, msglocale)... Is this going to work ? nl_langinfo_l probably returns in the encoding of msglocale, while g_locale_to_utf8 goes from the encoding of the current locale (may be != msglocale) to utf8.
It has the same problem as with strftime, it *might* be that charset is different between LC_CTYPE and LC_TIME but how likely is that?
(In reply to comment #23) [...] > it *might* be that charset is > different between LC_CTYPE and LC_TIME but how likely is that? about as likely that any of the LC_* variables differ. since this bug is about differing LC_MESSAGES and LC_TIME, you may just assume that LC_TIME vs. LC_CTYPE is likely to differ. (in particular, a user may choose to *just* alter LC_TIME and leave the rest at =C, because american date syntax sucks so hard)
Actually I think this patch is safer wrt character sets than the current use with strftime. With current code it's the potential difference between LC_CTYPE and LC_TIME, but with this patch it's between LC_CTYPE and LC_MESSAGES.
(In reply to comment #22) > g_locale_to_utf8 (nl_langinfo_l (ABDAY_1 + i, msglocale)... > > Is this going to work ? nl_langinfo_l probably returns in the > encoding of msglocale, while g_locale_to_utf8 goes from the > encoding of the current locale (may be != msglocale) to utf8. well, we could use g_convert() instead, however is there actually any way to find out about the locale that is used by msglocale? and, does nl_langinfo_l() really return its result in a charset other than the one of the current locale?
(In reply to comment #26) > (In reply to comment #22) > > g_locale_to_utf8 (nl_langinfo_l (ABDAY_1 + i, msglocale)... > > > > Is this going to work ? nl_langinfo_l probably returns in the > > encoding of msglocale, while g_locale_to_utf8 goes from the > > encoding of the current locale (may be != msglocale) to utf8. > > well, we could use g_convert() instead, however is there actually any way to > find out about the locale that is used by msglocale? In theory, I suppose, similar trick with newlocale might work; first assign the value of LC_MESSAGES to LC_CTYPE, and then get the CODESET. But that would also be needed for strftime! (If you really are concerned with such things.) > and, does nl_langinfo_l() really return its result in a charset other than the > one of the current locale? Yes, you could try for example fi_FI (latin1), fi_FI@euro (latin9 or 15 or something) and fi_FI.UTF-8
(In reply to comment #27) > (In reply to comment #26) > > (In reply to comment #22) > > > g_locale_to_utf8 (nl_langinfo_l (ABDAY_1 + i, msglocale)... > > > > > > Is this going to work ? nl_langinfo_l probably returns in the > > > encoding of msglocale, while g_locale_to_utf8 goes from the > > > encoding of the current locale (may be != msglocale) to utf8. > > > > well, we could use g_convert() instead, however is there actually any way to > > find out about the locale that is used by msglocale? > > In theory, I suppose, similar trick with newlocale might work; first assign the > value of LC_MESSAGES to LC_CTYPE, and then get the CODESET. But that would also > be needed for strftime! (If you really are concerned with such things.) that won't work, at least not with g_get_charset(). i've investigated the code now, and apart from caching (which would fight your temporary reset), it does: - allow the detected CODESET to be overriden by getenv $CHARSET - figure CODESET via nl_langinfo (CODESET) or from $LC_ALL, $LC_CTYPE, $LANG - unalias the detected CODESET (or locale name) via charset.alias so i think what's needed here is a variant of g_get_charset(), say: gboolean g_locale_get_charset (const char *localename, G_CONST_RETURN char **charset); which uses localename instead of calling _g_locale_charset_raw() and returns the results from g_utf8_get_charset_internal(localename). with the results from g_locale_get_charset ($LC_TIME), the calendar code just needs to call g_convert() on its nl_langinfo_l() results, the same way g_locale_to_utf8() does it for the default locale. i can go ahead and implement g_locale_get_charset() if the above sounds ok to you... Matthias? i'll add Owen to CC: since he might have input on the locale/charset fiddling.
I'd rather just ignore the charset issue for now, because 1) it's not introduced by this patch 2) it hasn't been causing problems with existing code (LC_CTYPE vs. LC_TIME) 3) with the patch it should be even less of a problem (LC_CTYPE vs. LC_MESSAGES) I'm not sure anything is prepared for LC_CTYPE to mismatch with LC_MESSAGES, I'd even go as far as arguing that *all* LC_* are expected to be in the encoding defined by LC_CTYPE Do note that it's - strftime -> g_convert($LC_TIME) - nl_langinfo_l -> g_convert($LC_MESSAGES) and not the other way round.
I think I would be ok with ignoring the possible charset issue for now (considering it was already there, as tko points out). We can do the extra work to implement g_locale_get_charset when the problem actually occurs in the wild. Maybe add a comment pointing out the potential problem.
Another issue was raised is that it should actually be possible to switch locales on runtime (think 770 startup wizard with language and date/time selection, in that order) and it should be possible to get the calendar use the new locale. That means getting the weekday names should be done in every instance_init or so, though that feels awkward. Should the localised messages be moved to instance variables?
This problem is not specific to the calendar widget. Whereever we are accessing translations, it breaks when you change locale at run time, so, getting weekday names in instance_init doesn't do that much.
(In reply to comment #32) > This problem is not specific to the calendar widget. Whereever we are > accessing translations, it breaks when you change locale at run time, so, > getting weekday names in instance_init doesn't do that much. Moving it from class_init to instance_init makes the difference that you're not forced to restart the whole application on locale change, recreating the (calendar) widget is enough. Granted it is virtually never needed (at least until someone thinks it would be a worthwhile addition desktop wide), the only existing application I know of is our startup wizard :)
What I was trying to say was that a whole lot of other things should change for runtime locale changes to take effect correctly.
The one way to possibly handle runtime locale changes is to treat them like screen changes, and reconstruct the entire ui. Of course, applications can keep translated strings in other places than widgets...
in the end, this didn't happen