GNOME Bugzilla – Bug 457863
zenity calendar shows garbled days in Arabic locale.
Last modified: 2014-05-23 10:50:48 UTC
zenity calendar shows "%Id" instead of the actual days in the Arabic locale because Solaris snprintf() does not support "%Id".
To reproduce:
1. Invoke zenity calendar in ar_EG.UTF-8:
   % zenity --calendar
Then all days are shown as "%Id". ar.po has the following message:
msgid "calendar:day:digits|%d"
msgstr "%Id"
I'm attaching the patch.
Created attachment 91918 [details] [review] Patch for gtk/gtkcalendar.c Attached the patch. Could you review the patch?
Created attachment 91921 [details] Snapshot for this problem
info gettext says the following:
  As a special feature for Farsi (Persian) and maybe Arabic, translators can insert an `I' flag into numeric format directives. For example, the translation of `"%d"' can be `"%Id"'. The effect of this flag, on systems with GNU `libc', is that in the output, the ASCII digits are replaced with the `outdigits' defined in the `LC_CTYPE' locale facet. On other systems, the `gettext' function removes this flag, so that it has no effect.
So I guess the answer is to either use GNU msgfmt or manually strip out the I modifier before building the translations. The attached patch is not right; we should not add code for this, and we should not fix the %Id occurrences one by one.
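For illustration, the flag stripping that the gettext manual describes for non-glibc systems could look roughly like the sketch below. This is not gettext's actual code: strip_i_flag is a hypothetical helper, and it only recognizes flags placed directly after the '%' (positional arguments such as "%1$Id" are not handled). It is only meaningful for printf-style formats, since in strftime() "%I" is the 12-hour clock hour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of gettext or GTK+): copy fmt into out,
 * dropping the glibc-specific 'I' flag from printf-style directives.
 * Returns 0 on success, -1 if out is too small. */
int strip_i_flag(const char *fmt, char *out, size_t outsz)
{
    size_t o = 0;

    assert(fmt != NULL && out != NULL);

    while (*fmt != '\0') {
        if (*fmt != '%') {
            /* Ordinary character (including width, precision, conversion). */
            if (o + 1 >= outsz) return -1;
            out[o++] = *fmt++;
            continue;
        }

        /* Copy the '%' that starts a directive. */
        if (o + 1 >= outsz) return -1;
        out[o++] = *fmt++;

        /* "%%" is a literal percent sign, not a directive. */
        if (*fmt == '%') {
            if (o + 1 >= outsz) return -1;
            out[o++] = *fmt++;
            continue;
        }

        /* Copy the flag characters, skipping only 'I'. */
        while (*fmt != '\0' && strchr("-+ #0'I", *fmt) != NULL) {
            if (*fmt != 'I') {
                if (o + 1 >= outsz) return -1;
                out[o++] = *fmt;
            }
            fmt++;
        }
    }

    if (outsz == 0) return -1;
    out[o] = '\0';
    return 0;
}
```

With this sketch, strip_i_flag("%Id", buf, sizeof buf) leaves "%d" in buf, which any C library's snprintf() can handle.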
> On other systems, the `gettext' function removes this flag, so that it has no effect.
Solaris gettext does not remove the "I" from "%Id", and we do not use GNU gettext. This flag is a GNU extension: http://gcc.gnu.org/ml/gcc-patches/2000-08/msg00881.html
Did you check my attached screenshot?
Manual stripping is not a solution. We build GNOME automatically while tracking the latest sources, and the string "%Id" is too short to rewrite the messages safely with an automated script. For example, some .po files could have a msgstr such as "%Id-foo" intended for strftime, so we cannot simply change "%Id" to "%d" in the builds.
NOTE: Linux translators won't change the GNU extensions. http://bugzilla.gnome.org/show_bug.cgi?id=404898
Do you have any solution for this? The problem has a significant negative impact for us, as shown in the screenshot I attached. How about an additional macro like Q_() in gi18n-lib.h to remove GNU extensions?
The right solution is to use tools that can strip out the extensions.
Which tool are you talking about? As I explained, Solaris gettext, libc, and the C compiler do not strip out the extensions. It seems the solution is to remove the "%Id" translation from ar.po in gtk SVN head. However, I think the best solution is to remove the Q_("calendar:week:digits|%d") code so that it does not depend on one locale and one platform.
I checked g_snprintf(). It seems the custom APIs are available.
glib/gprintfint.h:
  #ifdef HAVE_GOOD_PRINTF
  #define _g_snprintf snprintf
  #else
  #include "gnulib/printf.h"
  #define _g_snprintf _g_gnulib_snprintf
  #endif
For now I have added the following ugly fix:
http://src.opensolaris.org/source/xref/jds/spec-files/trunk/ext-sources/l10n-configure.sh
  env LANG=C LC_ALL=C \
  sed -e 's/^\(msgstr "%\)I\([doxXnfFeEgGaAcspCSm]"\)/\1\2/g' |
If you could provide a more generic fix, it would be great.
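To make explicit what the sed expression above does and does not touch, here is a rough C equivalent for a single .po line. strip_i_from_msgstr_line is a hypothetical name used only for this sketch; it rewrites a line only when the msgstr consists of exactly "%I" followed by one conversion character and the closing quote.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper mirroring the sed expression in this comment:
 * rewrite  msgstr "%I<conv>"  to  msgstr "%<conv>"  in place.
 * Returns 1 if the line was rewritten, 0 if it was left alone. */
int strip_i_from_msgstr_line(char *line)
{
    static const char prefix[] = "msgstr \"%";
    static const char convs[] = "doxXnfFeEgGaAcspCSm";
    size_t plen = sizeof prefix - 1;

    assert(line != NULL);

    /* The line must start with  msgstr "%  followed by the 'I' flag. */
    if (strncmp(line, prefix, plen) != 0) return 0;
    if (line[plen] != 'I') return 0;

    /* Exactly one conversion character must follow ... */
    if (line[plen + 1] == '\0' || strchr(convs, line[plen + 1]) == NULL)
        return 0;

    /* ... and then the closing quote, so "%Id-foo" is not matched. */
    if (line[plen + 2] != '"') return 0;

    /* Drop the 'I' by shifting the rest of the line left by one. */
    memmove(line + plen, line + plen + 1, strlen(line + plen + 1) + 1);
    return 1;
}
```

A line such as msgstr "%Id" becomes msgstr "%d", while msgstr "%Id-foo" is left unchanged, which is the reason the pattern is anchored so tightly.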
Created attachment 92111 [details] [review] Patch for gtk/gtkcalendar.c
Using printf for the date values does not work. The updated patch works on almost all platforms. You can use "%Od" with strftime(). http://www.gnu.org/software/libc/manual/html_node/Formatting-Calendar-Time.html
Could you review the patch?
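As a minimal sketch of the "%Od" idea (this is not the attached patch, and format_day_alt_digits is a hypothetical name), a day of the month can be formatted through strftime() like this. Whether a given platform's ar_EG locale actually defines alternative digits is an assumption that has to be checked per platform; when the locale has none, the C standard says "%Od" behaves like "%d".

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format a day of the month (1..31) with "%Od",
 * which uses the locale's alternative digits when they are defined.
 * Returns the number of characters written, or 0 on failure. */
size_t format_day_alt_digits(int mday, char *buf, size_t bufsz)
{
    struct tm tm;

    assert(buf != NULL && mday >= 1 && mday <= 31);

    /* Only tm_mday is consulted by "%Od"; zero the rest. */
    memset(&tm, 0, sizeof tm);
    tm.tm_mday = mday;

    return strftime(buf, bufsz, "%Od", &tm);
}
```

In the default "C" locale, format_day_alt_digits(7, buf, sizeof buf) yields "07". Note that strftime's "%d" is zero-padded to two digits, unlike printf's "%d", so the calendar cells would not look identical to the current output.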
Replacing printf formats with strftime formats is breaking all translations. That is not an option. If you can't handle %Id, and your msgfmt cannot ignore them, I recommend a patch that runs a sed script to remove them before running msgfmt.
This change affects ar.po only; the other .po files use "%d" instead of "%Id". Also, the change to ar.po was recent: http://svn.gnome.org/viewcvs/gtk%2B/trunk/po/ar.po?r1=16266&r2=17792
So I think this change doesn't break the translations in all .po files. If you don't want to reuse the same msgid, there are two options:
- Change the msgid too, e.g., from "calendar:day:digits|%d" to "calendar:day:digits2|%d".
- Also change "%Id" to "%Od" in ar.po with this patch.
I see three problems:
- GtkCalendar breaks the C standard since it uses GNU extensions. On the other hand, strftime() already provides localized digits with "%Od" in the C specification.
- The msgstr is used for a date format, so it would be better to use strftime() than printf().
- The msgstr is too short to change the strings simply with a script. It would be better to use "calendar:day:digits|%Id" than "%Id".
What do you think?
Invoking the C standard is pretty meaningless here. We have always used a number of extensions, and there is nothing illegal about that. I don't see why a simple sed -e s/%Id/%d/g ar.po does not solve your problem. Using %Id in msgids is not a good idea, since it means that it will get used on platforms not supporting it, if there are no translations.
I think that in GNOME modules all such extensions are used with strftime(3C), except in gtk/gtkcalendar.c, so only gtkcalendar.c breaks the printf format in the sense of the C standard. If gtkcalendar.c used strftime() instead of printf(), as in my patch, then GNU extensions would be applied only in strftime() across all GNOME modules and we could easily apply the simple sed.
> I don't see why a simple sed -e s/%Id/%d/g ar.po does not solve your problem.
I mean the simple sed does not work in general because "%I" can be a conversion for strftime(), and "%I" is not a GNU extension in strftime(). As I explained above, "%Id" could appear in a msgstr for strftime(), e.g. "%Id-foo". So I think it would be a little better if the ar.po entry were changed from 'msgstr "%Id"' to 'msgstr "calendar:day:digits|%Id"'.
> Using %Id in msgids is not a good idea
I am not talking about the msgid but about the msgstr.
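The ambiguity described in this comment can be seen in a small sketch. In strftime(), "%I" is the standard 12-hour clock hour, so in a format like "%Id-foo" the "d" is just literal text; a blind s/%Id/%d/g would silently turn the hour into the day of the month. format_hour_label is a hypothetical helper, and "%Id-foo" is the made-up example string from this thread, not a string known to exist in any .po file.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format an hour (0..23) with the strftime format
 * "%Id-foo". Here "%I" is the 12-hour clock hour and "d-foo" is literal.
 * Returns the number of characters written, or 0 on failure. */
size_t format_hour_label(int hour, char *buf, size_t bufsz)
{
    struct tm tm;

    assert(buf != NULL && hour >= 0 && hour <= 23);

    /* Only tm_hour is consulted by "%I"; zero the rest. */
    memset(&tm, 0, sizeof tm);
    tm.tm_hour = hour;

    return strftime(buf, bufsz, "%Id-foo", &tm);
}
```

For hour 15 this produces "03d-foo", which shows that the same three characters "%Id" mean different things depending on whether the msgstr is passed to printf() or to strftime().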
I'll chime in. %Id is needed because in the Arabic language, some locales use Arabic numerals and some use Hindu numerals. We can't force everyone to use %d and hence Arabic numerals. So changing everything to %d in msgstr is not an option. In fact, we're tempted to invite people to open a bug against every GNOME Arabic string that uses %d instead of %Id. An audit is certainly on the TODO list.
The solution we had in mind was that GNU gettext's msgfmt should strip the 'I' from "%Id" if the target libc doesn't support it.
> %Id is needed because in Arabic language, some locales use arabic numerals and
Yes, I understand this, but our problem is that we cannot rewrite a simple msgstr "%Id" with a simple script since it's used with both strftime() and GNU printf().
> The solution we had in mind was that GNU gettext's msgfmt should strip the 'I' from "%Id" if the target libc doesn't support it.
We use Solaris msgfmt. I'll check whether I can modify the Solaris code.
Recently we integrated GNU gettext into Solaris. GNU msgfmt 0.16.1 doesn't remove "%Id".
closing out old bugs