GNOME Bugzilla – Bug 749206
GDateTime: month names in the genitive case
Last modified: 2018-02-28 11:36:50 UTC
Slavic languages, like Polish, use different cases for month names when they are on their own (nominative, "styczeń") and when they are part of a date (genitive, "11 stycznia"). Currently we can only use nominative forms in GNOME, so calendars use ungrammatical forms. KDE (and Windows) seems to have it fixed, so their desktops look much more professional in Slavic locales. It would be great if gdate.c/gdatetime.c could provide support for genitive in dates. Unfortunately, glibc developers are not interested in solving this on their level [1][2]. [1] https://sourceware.org/bugzilla/show_bug.cgi?id=10871 [2] https://sourceware.org/bugzilla/show_bug.cgi?id=15606
CLDR provides information on standalone and format month names: http://cldr.unicode.org/translation/date-time#TOC-Stand-Alone-vs.-Format-Styles KDE seems to have implemented this back in 2011: https://community.kde.org/KDE_Core/QtMerge/QDateTime GNOME is the only big desktop without the proper genitive month names. I regularly get bug reports about this issue. I know it may not seem like it, but this has to be fixed somehow if we expect to be treated seriously in many countries in (mostly Central and Eastern) Europe.
Given that this is fixed in glibc for Ukranian, and that the glibc maintainers did not actually say "we are not interested in solving this" (but rather, merely failed to get around to dealing with it), it seems wrong to try to hack around this for other languages at the glib level. I think you should try to push this harder in glibc. (Having a single patch that covers all of the relevant countries might help get it in.)
The Ukrainian locale "fix" (and my proposition for Polish) are hacks, dirty workarounds really. I would gladly provide a patch adding universal support for standalone and format month names, but I'm not a programmer. And that seems to be the real blocker here: translators cannot fix it properly due to limited technical knowledge (for another example, see https://sourceware.org/bugzilla/show_bug.cgi?id=10871#c6), and developers are not interested in fixing it. Is there any way to encourage developers to look into this issue?
I'll be happy to provide a complete solution for this problem. Unfortunately, it must be provided both on the glibc and glib side because: * glib provides a function g_date_strftime() which just calls strftime() from glibc; * from this commit: https://git.gnome.org/browse/glib/commit/glib/gdate.c?id=13e2070 (year 2007, glib 2.12.10 and above) g_date_strftime() has a custom implementation for Windows; see also: bug 404832; * glib also provides GDateTime API (year 2010, glib 2.26.0 and above) and a function g_date_time_printf() which initially provided a fully custom implementation independent on glibc; see the commit: https://git.gnome.org/browse/glib/commit/glib/gdatetime.c?id=e1f13ee and bug 50076; * however, from this commit: https://git.gnome.org/browse/glib/commit/?id=527dc86 (2011, glib 2.30.0 and above) g_date_time_printf() retrieves the month names from nl_langinfo() when available, which means on glibc. If this bug is fixed in glibc only then it will work fine on Linux but not on other platforms. If it is fixed in glib only then it will work fine on other platforms but not on Linux. I'm going to try to provide a solution for glibc first and then when it is acceptable I'd like to port it to glib where needed.
(In reply to Rafal Luzynski from comment #4) > If this bug is fixed in glibc only then it will work fine on Linux but not > on other platforms. That's not necessarily a problem. Or at least, it's not necessarily *our* problem; there are lots of places where glib behaves better on Linux than on other platforms, due to having better APIs to work with on Linux. > If it is fixed in glib only then it will work fine on > other platforms but not on Linux. Well, GDateTime could be rewritten to not use nl_langinfo() for %c/%x if genitive month name translations exist.
Just a note, that this bug affects also Greek language.
Every affected language will be included, or at least will have a chance to be included. For now please check https://sourceware.org/bugzilla/show_bug.cgi?id=10871
It's misleading to speak of “month names in the genitive case”. It's about month names suitable for use in full date formats (with a day-in-month number). This change is not about adding support month names in the genitive case for languages which use an unmodified month name in full date formats. Romance languages with elision have a similar issue, e.g. it's “d'abril“ but “de novembre”. CLDR suggests to put the article in the month names. But if we do this for the existing month names (as has been suggested as POSIX change), a lot of date formatting code is broken as the result. (Existing locales usually use "de %B", without elision.) My preference (for glibc at least) is to leave the existing month names unchanged, add a set of new month names suitable for inclusion in full-format date strings, and adjust format strings where necessary.
Created attachment 348636 [details] [review] Fixes the problem (proposition) Please do not yet commit this patch unless it really fixes the problem. The patch introduces a support of %OB, %Ob, and %Oh format specifiers. It is related with https://sourceware.org/bugzilla/show_bug.cgi?id=10871 but does not depend on it strictly. Although feel free to test. Why not to commit: * In glibc bugzilla it has been accepted that %OB specifier should be added and it should return the same string as nl_langinfo(ALTMON_n). It has also been accepted that %Ob and %Oh should be introduced. However, it has not yet been decided if "alternative" means "standalone/nominative" or "in full date context/genitive". * Also it has not been decided what should be the names of nl_langinfo() arguments which return the locale data for %Ob and %Oh, my suggestion is _NL_ABALTMON_n. * For the systems which do not have the glibc bug fixed all translations must be updated (really all). * It may break some applications which display month names standalone (e.g., calendars), they will start displaying the month names incorrectly. The applications will need some minor fixes. * win32_strftime_helper () (strftime () implementation for Windows) should also be fixed. However, here is what is good: * Works perfectly on the Linux systems which have the glibc bug fixed. * On other systems provides the custom implementation so it works the same on all operating systems. * nl_langinfo (ALTMON_n) feature is supported by *BSD systems and it is used correctly (I hope so because I have not tested). * %OB format specifier is supported by *BSD family including OS X (and probably iOS), this implementation is consistent with BSD. However, AFAIK nl_langinfo(ALTMON_n) is not supported on OS X. * This patch automagically fixes the problem in all applications which use g_date_time_format() correctly, i.e., use "%B" or "%b" to display full date, or at least a date containing a day number and a month name.
(In reply to Rafal Luzynski from comment #9) > Please do not yet commit this patch unless it really fixes the problem. Sorry, a typo. I meant: "Please do not yet commit this patch ALTHOUGH it really fixes the problem." FWIW, here is a copr repo: https://copr.fedorainfracloud.org/coprs/rluzynski/genitive/ which contains prebuilt binaries with glibc patches applied and soon will also contain glib2 binaries with this patch applied so those brave enough will have a chance to test.
Created attachment 367206 [details] [review] Fixes the problem (working, rebased, v2) This is a rebased and working version. Please note that the patches for the relevant glibc bug have just been pushed so it is OK to commit this now. Please don't close this bug as fixed, the update for win32_strftime_helper (Windows implementation) is still needed and I'll provide it ASAP.
Created attachment 367260 [details] [review] Also fixes the problem for Windows This is the missing patch for Windows: win32_stftime_helper() function. I must admit I have not tested this thoroughly, my access to Windows is limited. Still I think that the functions g_date_prepare_to_parse() and g_date_set_parse() need the update so please review and commit if the patches are OK but do not yet close the bug.
Review of attachment 367206 [details] [review]: It would be good to add a test for this (probably requires you to add polish translations first, and then check that we get the expected strings in that locale, for the new format strings. ::: configure.ac @@ +1246,3 @@ + [glib_cv_langinfo_altmon=no])]) +if test x$glib_cv_langinfo_altmon = xyes; then + AC_DEFINE(HAVE_LANGINFO_ALTMON,1,[Have nl_langinfo (ALTMON_n)]) Not sure this is really needed - afaik, the ALTMON_x values are guaranteeed to be defines as well, so you could just check them directly with #ifdef. ::: glib/gdatetime.c @@ +216,3 @@ + * format forms but if nl_langinfo (ALTMON_n) is not supported then we will + * have use MONTH_FULL as standalone. The same if nl_langinfo () does not + * exist at all. MONTH_ABBR is similar. */ Pet peeve of mine: I prefer the */ on the next line, aligned with the other * @@ +235,2 @@ case 2: + return C_("full month name standalone", "February"); I think it would be better not to change the context strings here - this invalidates all existing translations, for not much gain at all. Not to mention that it blows up the size of the patch... @@ +398,3 @@ + * in a full date context. Here are full month names in a form + * appropriate when they are used in a full date context, with the + * day number. */ I think it would be really helpful to include an example or two in the comment here. "full month name with day" is not really clear to me, and having the default strings be just the same is not helpful for figuring out what is meant here.
Review of attachment 367206 [details] [review]: .
(In reply to Matthias Clasen from comment #13) > Review of attachment 367206 [details] [review] [review]: > > It would be good to add a test for this This makes sense. > (probably requires you to add polish > translations first, It's just been added to glibc upstream, just arriving to Fedora Rawhide. Hopefully glib2 will just call glibc API in the next Linux releases. > and then check that we get the expected strings in that > locale, for the new format strings. Also the test could detect some common bugs and display a message like "the case is incorrect, you have not updated your translation" instead of an enigmatic "the formatted strings do not match". > ::: configure.ac > @@ +1246,3 @@ > + [glib_cv_langinfo_altmon=no])]) > +if test x$glib_cv_langinfo_altmon = xyes; then > + AC_DEFINE(HAVE_LANGINFO_ALTMON,1,[Have nl_langinfo (ALTMON_n)]) > > Not sure this is really needed - afaik, the ALTMON_x values are guaranteeed > to be defines as well, so you could just > check them directly with #ifdef. I don't know but maybe. ALTMON_x has just been added to glibc two days ago and so far it has existed only in BSD family. I'm not even sure that it exists in OS X. Note that configure.ac checks for existence and support of many other constants used by nl_langinfo(). > > ::: glib/gdatetime.c > @@ +216,3 @@ > + * format forms but if nl_langinfo (ALTMON_n) is not supported then we will > + * have use MONTH_FULL as standalone. The same if nl_langinfo () does not > + * exist at all. MONTH_ABBR is similar. */ > > Pet peeve of mine: I prefer the */ on the next line, aligned with the other * OK > @@ +235,2 @@ > case 2: > + return C_("full month name standalone", "February"); > > I think it would be better not to change the context strings here - this > invalidates all existing translations, for not much gain at all. Not to > mention that it blows up the size of the patch... I'm afraid that it may be ambiguous. Should "full month name" be "standalone" (nominative) or "with date" (genitive)? Seems like so far translators have been using the nominative (standalone) case but I'm not sure this is true for all languages. I must rethink this. > @@ +398,3 @@ > + * in a full date context. Here are full month names in a form > + * appropriate when they are used in a full date context, with the > + * day number. */ > > I think it would be really helpful to include an example or two in the > comment here. "full month name with day" > is not really clear to me, and having the default strings be just the same > is not helpful for figuring out what > is meant here. It would require an example in a language which actually requires different cases. Unfortunately, in English (and same in German, French, Spanish, and in about 90% of all languages) there is no difference in the month name when it is used in a date and when it is used standalone.
Review of attachment 367260 [details] [review]: looks ok to me
Created attachment 367935 [details] [review] Fixes the problem (v3) (In reply to Matthias Clasen from comment #13) > Review of attachment 367206 [details] [review] [review]: > > It would be good to add a test for this (probably requires you to add polish > translations first, and then check that we get the expected strings in that > locale, for the new format strings. Done. Is this OK to use the UTF-8 characters in the source code directly, including Cyrillic and Greek? Also I've decided to let the test fail while the translations are not yet complete. Is this a problem? Is this test ran automatically somewhere and prevents building and/or installing? OTOH, I'm afraid we will have to provide the translations very soon, reasons explained below. > ::: configure.ac > @@ +1246,3 @@ > + [glib_cv_langinfo_altmon=no])]) > +if test x$glib_cv_langinfo_altmon = xyes; then > + AC_DEFINE(HAVE_LANGINFO_ALTMON,1,[Have nl_langinfo (ALTMON_n)]) > > Not sure this is really needed - afaik, the ALTMON_x values are guaranteeed > to be defines as well, so you could just > check them directly with #ifdef. Not done, we have agreed offline to leave it as it is. > ::: glib/gdatetime.c > @@ +216,3 @@ > + * format forms but if nl_langinfo (ALTMON_n) is not supported then we will > + * have use MONTH_FULL as standalone. The same if nl_langinfo () does not > + * exist at all. MONTH_ABBR is similar. */ > > Pet peeve of mine: I prefer the */ on the next line, aligned with the other * Done. Also I'm going to apply the same change to the second patch. > @@ +235,2 @@ > case 2: > + return C_("full month name standalone", "February"); > > I think it would be better not to change the context strings here - this > invalidates all existing translations, for not much gain at all. Not to > mention that it blows up the size of the patch... Done. Please verify if this is really what you wanted. Personally I think it's not much useful to preserve the existing translations. First of all, they will be used only for "%OB" and "%Ob" (alternative) formats. The translators will have to provide the new translations for "%B" and "%b" even if in 90% of languages the new translations will be the same as the old ones. Second, if the translation software is smart enough it will mark all month names as fuzzy but hopefully will provide the existing translations as suggestions (for both standalone and full-date forms). All the translators will have to do is to verify and confirm. Also, I'm afraid that we will have to provide the translations for those languages where translators are less active or have left the project because otherwise the most common date formats using "%B" and "%b" will appear in English. Does anyone here have such a superpower? Good news is that these custom translations will be used only on the systems where nl_langinfo() does not provide all forms of month names. They will be completely skipped if compiled with glibc >= 2.27. > @@ +398,3 @@ > + * in a full date context. Here are full month names in a form > + * appropriate when they are used in a full date context, with the > + * day number. */ > > I think it would be really helpful to include an example or two in the > comment here. "full month name with day" > is not really clear to me, and having the default strings be just the same > is not helpful for figuring out what > is meant here. Instead of the examples I have provided the comment that the translators should refer to their date and locale command line utilities. (Hm… does date command line utility reflect this change already?)
Created attachment 367936 [details] [review] Also fixes the problem for Windows (v3) Thank you for accepting the previous version of this patch, Matthias. Despite of this, here is the new version applying the comment style fixes, according to what you have suggested before.
Review of attachment 367935 [details] [review]: ::: configure.ac @@ +1277,3 @@ + +dnl Check for nl_langinfo and ALTMON_n +AC_CACHE_CHECK([for nl_langinfo (ALTMON_n)], glib_cv_langinfo_altmon,[ You need to add some form of this check to the meson build as well....
Created attachment 368283 [details] [review] Fixes the problem (v4) (In reply to Michael Catanzaro from comment #19) > [...] > You need to add some form of this check to the meson build as well.... Done. Is this correct?
Patch "Fixes the problem (v4)" works as expected for Catalan on Fedora Rawhide. glibc & glib2 rpm files: https://copr.fedorainfracloud.org/coprs/rbuj/updates/
Add get_month_name_abbr(gint) & get_month_name(gint) function prototypes in glib/gdatetime.h. Next, define function bodies in glib/gdatetime.c using #ifdef... then you can use it in outer functions like "gchar* gcal_get_month_name (gint i)" [1], or remove them. [1] https://github.com/GNOME/gnome-calendar/blob/master/src/gcal-utils.c
gnome-calendar and panel displaying an incorrect standalone month "d'abril" instead of the correct one "abril" for Catalan. https://www.dropbox.com/s/0871wmjz4hldjdz/GNOME%20classic.png
(In reply to Robert Buj from comment #22) > Add get_month_name_abbr(gint) & get_month_name(gint) function prototypes in > glib/gdatetime.h. [...] We have already ~3 ways to provide month names. Do we need another set of functions to provide the same facility? Even if yes this is beyond the scope of this bug report. > #ifdef... then you can use it in outer functions like "gchar* > gcal_get_month_name (gint i)" [1], or remove them. > > [1] https://github.com/GNOME/gnome-calendar/blob/master/src/gcal-utils.c (In reply to Robert Buj from comment #23) > gnome-calendar Bug 780745, now migrated to https://gitlab.gnome.org/GNOME/gnome-calendar/issues/125, waiting for merge. > and panel displaying an incorrect standalone month "d'abril" > instead of the correct one "abril" for Catalan. > > https://www.dropbox.com/s/0871wmjz4hldjdz/GNOME%20classic.png Bug 780957, now fixed upstream as https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/21, waiting for translation updates. :-)
Review of attachment 368283 [details] [review]: Ok
Review of attachment 368283 [details] [review]: Looks great to me too. I’ve got a couple of nitpicks, but I will fix them before pushing. ::: glib/tests/gdatetime.c @@ +1555,3 @@ + gchar *oldlocale; + + oldlocale = g_strdup (setlocale (LC_ALL, NULL)); Add `g_test_bug ("749206");` before the setlocale() call. @@ +1689,3 @@ + g_test_incomplete ("locale ru_RU not available, skipping Russian month names test"); + + Spurious double empty line.
Comment on attachment 368283 [details] [review] Fixes the problem (v4) I have pushed the g_date_time_format() patch to git master for the 2.56 release, thanks. What is the status of the Windows patch?
(In reply to Philip Withnall from comment #27) > [...] > I have pushed the g_date_time_format() patch to git master for the 2.56 > release, thanks. Thank you. What about the release team approval? Well, I assume they don't mind. IMO this change was absolutely necessary because of the recent changes in glibc. Without this we would be unable to display month names in nominative case (in those languages which need two grammatical cases). > What is the status of the Windows patch? It was reviewed previously (comment 16) but I have provided a new version with minor improvements (comment 18) so it has lost its "accepted-commit_now" status. Please review again, hopefully you can also commit. This will close this bug report. Thank you for your cooperation.
(In reply to Rafal Luzynski from comment #28) > (In reply to Philip Withnall from comment #27) > > [...] > > I have pushed the g_date_time_format() patch to git master for the 2.56 > > release, thanks. > > Thank you. What about the release team approval? String freeze is open until 2018-02-19: https://wiki.gnome.org/Schedule. > > What is the status of the Windows patch? > > It was reviewed previously (comment 16) but I have provided a new version > with minor improvements (comment 18) so it has lost its > "accepted-commit_now" status. Please review again, hopefully you can also > commit. This will close this bug report. Thank you for your cooperation. I’ll review it shortly, thanks.
Review of attachment 367936 [details] [review]: Could you add some tests to glib/tests/date.c please? Let them run on every platform — on Linux this will just test strftime(), but we want to know that it has the same behaviour as win32_strftime_helper(). ::: glib/gdate.c @@ +2178,3 @@ + memmove (((wchar_t *) result->data) + result->len - n, + ((wchar_t *) result->data) + result->len - n + 2, + (n - 2) * sizeof (wchar_t)); What’s the memmove() needed for? That definitely needs an explanatory comment. @@ +2204,3 @@ + (n - 2) * sizeof (wchar_t)); + g_array_set_size (result, result->len - 3); + } The code for 'B' and 'b'/'h' is very similar. Could you factor it out into a helper function? Probably best done as two commits: one which factors out the existing duplicated code, and a second which adds the nominative/genitive support.
Created attachment 368447 [details] [review] g_date_set_parse: Add nominative/genitive month names I've just realized that we need this as well. This patch tries to find both nominative ("%OB", "%Ob") and genitive ("%B", "%b") month names in the parsed string.
Created attachment 368451 [details] [review] win32_strftime_helper: Factor out the common code (In reply to Philip Withnall from comment #30) > [...] > The code for 'B' and 'b'/'h' is very similar. Could you factor it out into a > helper function? Probably best done as two commits: one which factors out > the existing duplicated code, and a second which adds the > nominative/genitive support. Done (part 1: factor out the common code)
Created attachment 368452 [details] [review] Also fixes the problem for Windows (v4) New version, applies the suggestions.
(In reply to Philip Withnall from comment #30) > Review of attachment 367936 [details] [review] [review]: > > Could you add some tests to glib/tests/date.c please? Thanks for the suggestion. I wrote a test and I've just discovered a bug which is not a part of this bug, it has existed since forever. In g_date_fill_parse_tokens() there is a loop over all months, for each month it checks if the full name of this month is a substring of the parsed string, if not checks the short name of the month in the same iteration. Now I'm also adding alternative month names but it is unrelated. As a test case I set up Polish locale and parse the date containing "wrzesień" (September). The loop iterates over all months: January, February,... and for each checks if the month name (full or abbreviated) is a substring of this string. And here we have found it: "sie" (abbreviated "sierpień" - August) is a substring of "wrze_sie_ń". So, the month "wrzesień" (September) has been recognized as "sie" (Aug).
Created attachment 368487 [details] [review] Add tests for formatting and parsing month names (In reply to Philip Withnall from comment #30) > Review of attachment 367936 [details] [review] [review]: > > Could you add some tests to glib/tests/date.c please? Let them run on every > platform — on Linux this will just test strftime(), but we want to know that > it has the same behaviour as win32_strftime_helper(). Done. As you will see it fails parsing September in Polish ("wrzesień") but this is not a new bug, it has been around for ~20 years.
Review of attachment 368451 [details] [review]: Lovely, thanks.
Review of attachment 368452 [details] [review]: Looks good, thanks.
Review of attachment 368487 [details] [review]: Great, thanks.
Review of attachment 368447 [details] [review]: ::: glib/gdate.c @@ +1001,3 @@ } + + /* Here month names may be in a genitive case. Should that be s/genitive/nominative/?
(In reply to Philip Withnall from comment #39) > Review of attachment 368447 [details] [review] [review]: > > ::: glib/gdate.c > @@ +1001,3 @@ > } > + > + /* Here month names may be in a genitive case. > > Should that be s/genitive/nominative/? Yes, you are absolutely right: nominative. Also not "may be" but "will be": there is no other choice for this place than a nominative case. Please reword this for me, my ability to contribute to open source projects is temporarily limited. Also please note the bug 793550. I thought that other bug would be fixed first and only then this patch would be reworked and applied. But TBH there is nothing wrong with this patch, it does not cause the bug 793550 which can be also fixed later.
(In reply to Rafal Luzynski from comment #40) > (In reply to Philip Withnall from comment #39) > > Review of attachment 368447 [details] [review] [review] [review]: > > > > ::: glib/gdate.c > > @@ +1001,3 @@ > > } > > + > > + /* Here month names may be in a genitive case. > > > > Should that be s/genitive/nominative/? > > Yes, you are absolutely right: nominative. Also not "may be" but "will be": > there is no other choice for this place than a nominative case. Please > reword this for me, my ability to contribute to open source projects is > temporarily limited. Thanks, I’ll modify the patch and push it (and the others). > Also please note the bug 793550. I thought that other bug would be fixed > first and only then this patch would be reworked and applied. But TBH there > is nothing wrong with this patch, it does not cause the bug 793550 which can > be also fixed later. I’m unlikely to find time to fix that one at the moment, but since there’s no regression I don’t count it as a high priority.
(In reply to Philip Withnall from comment #41) > [...] > Thanks, I’ll modify the patch and push it (and the others). Yes, please, thank you. > (In reply to Rafal Luzynski from comment #40) > > Also please note the bug 793550. I thought that other bug would be fixed > > first and only then this patch would be reworked and applied. But TBH there > > is nothing wrong with this patch, it does not cause the bug 793550 which can > > be also fixed later. > > I’m unlikely to find time to fix that one at the moment, but since there’s > no regression I don’t count it as a high priority. True, no regression, these patches are urgent because they align glib2 with glibc and there is no rush with bug 793550, its impact is really low and I will fix it later.
Comment on attachment 368487 [details] [review] Add tests for formatting and parsing month names All pushed to master, including the change to the comments mentioned in comment #41, and a follow-up commit which disables the /date/month_names test for older libc versions which don’t support %OB. 493d3438e (HEAD -> master, origin/master, origin/HEAD) tests: Disable /date/month_names test with older libc versions 998bda1b5 po: Update en_GB translation 359768369 Add tests for formatting and parsing month names fa1265880 win32_strftime_helper: Support nominative/genitive month names 5520a8793 win32_strftime_helper: Factor out the common code 915224482 g_date_set_parse: Also support nominative/genitive month names
I think this is all done then. Let’s follow up in bug #793578 and bug #793550.
I reopen the bug for some minor tweaks, as we have agreed off list.
Created attachment 369093 [details] [review] date: Amend some comments This patch amends some comments: explains that the month names are in a genitive case only if the language rules require this. Also it explains that although the test for _NL_ABALTMON_* being supported is not the same as the test for %OB/%Ob/%Oh, it is sufficient for our needs.
Review of attachment 369093 [details] [review]: Sure, thanks.
Pushed to master.