GNOME Bugzilla – Bug 793645
test_month_names: Updated translations needed for el_GR, hr_HR, ru_RU
Last modified: 2018-03-16 23:31:59 UTC
ERROR: gdatetime - Bail out! GLib:ERROR:../glib/glib/tests/gdatetime.c:1596:test_month_names: assertion failed (p == ("enero")): ("Enero" == "enero")

Failing code:

  setlocale (LC_ALL, "es_ES.utf-8");
  if (strstr (setlocale (LC_ALL, NULL), "es_ES") != NULL)
    {
      TEST_PRINTF_DATE (2018, 1, 1, "%B", "enero");
      TEST_PRINTF_DATE (2018, 2, 1, "%OB", "febrero");
      TEST_PRINTF_DATE (2018, 3, 1, "%b", "mar");
      TEST_PRINTF_DATE (2018, 4, 1, "%Ob", "abr");
      TEST_PRINTF_DATE (2018, 5, 1, "%h", "may");
      TEST_PRINTF_DATE (2018, 6, 1, "%Oh", "jun");
    }
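The failing test switches locale, checks that the switch took effect, and only then formats a month name. That pattern can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the GLib test: it uses the standard strftime() instead of GLib's formatter, and the helper name month_name_for_locale is hypothetical.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format the full name of `month` (1-12) with
 * strftime("%B") under `locale`.  Returns the number of bytes written,
 * or 0 if the locale is not installed, so that callers can skip the
 * check instead of failing. */
static size_t
month_name_for_locale (const char *locale, int month, char *buf, size_t size)
{
  struct tm tm;
  size_t len;

  assert (locale != NULL && buf != NULL && size > 0);
  assert (month >= 1 && month <= 12);

  buf[0] = '\0';
  if (setlocale (LC_ALL, locale) == NULL)
    return 0;  /* locale unavailable on this system */

  memset (&tm, 0, sizeof tm);
  tm.tm_year = 2018 - 1900;
  tm.tm_mon = month - 1;
  tm.tm_mday = 1;

  len = strftime (buf, size, "%B", &tm);

  /* For simplicity this resets to the "C" locale rather than restoring
   * whatever locale was active before the call. */
  setlocale (LC_ALL, "C");
  return len;
}
```

Called with "es_ES.utf-8" and month 1, the result depends on the locale data installed on the system, which is exactly why the expected string in the test and the translation have to agree on capitalisation.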
Daniel, is there any reason the es_ES translations of month names in GLib have different capitalisation to what’s in glibc? https://lh.2xlibre.net/locale/es_ES/#value-56 glibc has them all in lower case (’enero’ rather than ‘Enero’). I thought Spanish didn’t have proper nouns like English, so lower case would be more correct anyway? These month names are used when building dates, so won’t necessarily appear at the start of a sentence.
Daniel replied by e-mail due to Bugzilla problems:

> I translated month names with that capitalization because that is how the original string was, and we usually respect the original format. If there is no need to do so, maybe a translator's comment would be useful, as other languages may run into the same issue.

He’s pushed a fix as commit d1a080baa5f4f94fc802fce611445c2558ac0a18. This should be fixed now.

There is already a translator comment in place for the nominative and genitive long and short forms, although xgettext will associate it only with the string for January. Daniel, if that comment is not sufficient, please re-open this bug and let me know what you think could be clarified/improved about it.

Thanks!
Mmmm, now it works... maybe it was an isolated problem. Well, I don't see any comment indicating the month name should be in lowercase, so it would be useful to specify it if needed. Thanks!
(In reply to Daniel Mustieles from comment #3) > Well, I don't see any comment indicating the month name should be in > lowercase, so it would be useful to specify it if needed. From https://gitlab.gnome.org/GNOME/glib/blob/master/glib/gdatetime.c#L235: > Some other languages may prefer starting with uppercase when > they are standalone and with lowercase when they are in a complete > date context. and > you can refer to the date command > line utility and see what the command `date +%OB' produces. I think that’s fairly prescriptive about case and examples.
Thanks for looking into this! Now I have failure for: TEST_PRINTF_DATE (2018, 3, 1, "%b", "mar"); which fails with: ERROR: gdatetime - Bail out! GLib:ERROR:../glib/glib/tests/gdatetime.c:1598:test_month_names: assertion failed (p == ("mar")): ("Mar" == "mar")
Ok, but note that date command is wrong... it should be 'date +%0B' (zero instead of O)
Also fixed short month names in git ;-)
Still the same problem as in https://bugzilla.gnome.org/show_bug.cgi?id=793645#c5
It seems month names are duplicated in the PO file (different context) and I fixed only the first ones... fixed (again) ;-)
(In reply to Daniel Mustieles from comment #6) > Ok, but note that date command is wrong... it should be 'date +%0B' (zero > instead of O) No, that command is right. The ‘O’ (not ‘0’) modifier is a new one introduced in the February release of glibc, which allows you to get a nominative form of the month. It’s needed for certain languages (generally not Western European ones though). See the translator comments. In commit c26c7e47e60eec986fb3cddc03d35a646eaee744, did you mean to change ‘Abr’ to ‘mbr’? Shouldn’t it be ‘abr’? (In reply to Daniel Mustieles from comment #9) > It seem month names are duplicated in the PO file (different context) and I > fixed the first ones... Yes, that’s the whole point here. The month names now have two contexts: one for the nominative form, and one for the genitive form. For most Western European languages, the translations will be the same. For Russian (for example) they will differ.
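The two contexts can be pictured as two strings per month, with the format specifier deciding which one is used. The sketch below is a simplified model, not GLib code: the struct, the helper pick_month_form and the data tables are hypothetical, and the sample strings are taken from examples given in this bug (Spanish ‘enero’, Lithuanian ‘balandis’/‘balandžio’).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one translated month: one string per context. */
struct month_forms
{
  const char *genitive;    /* used by %B, inside a full date */
  const char *nominative;  /* used by %OB, standalone month name */
};

/* Illustrative data only.  Spanish uses the same form in both contexts;
 * Lithuanian does not. */
static const struct month_forms es_january = { "enero", "enero" };
static const struct month_forms lt_april = { "balandžio", "balandis" };

/* Return the form selected by `spec`, or NULL for an unsupported
 * specifier.  Only "%B" and "%OB" are modelled here. */
static const char *
pick_month_form (const struct month_forms *m, const char *spec)
{
  assert (m != NULL && spec != NULL);

  if (strcmp (spec, "%OB") == 0)
    return m->nominative;
  if (strcmp (spec, "%B") == 0)
    return m->genitive;
  return NULL;
}
```

With this model, fixing only the first set of strings in a PO file corrects one context and leaves the other one failing, which matches what happened with es.po.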
(In reply to Philip Withnall from comment #10)
> (In reply to Daniel Mustieles from comment #6)
> > Ok, but note that date command is wrong... it should be 'date +%0B' (zero
> > instead of O)
>
> No, that command is right. The ‘O’ (not ‘0’) modifier is a new one
> introduced in the February release of glibc, which allows you to get a
> nominative form of the month. It’s needed for certain languages (generally
> not Western European ones though). See the translator comments.

Ah ok... I tried it and got an error. I didn't know it was a new option (Debian stable still doesn't have it).

> In commit c26c7e47e60eec986fb3cddc03d35a646eaee744, did you mean to change
> ‘Abr’ to ‘mbr’? Shouldn’t it be ‘abr’?

Sure, fixed.

> (In reply to Daniel Mustieles from comment #9)
> > It seem month names are duplicated in the PO file (different context) and I
> > fixed the first ones...
>
> Yes, that’s the whole point here. The month names now have two contexts: one
> for the nominative form, and one for the genitive form. For most Western
> European languages, the translations will be the same. For Russian (for
> example) they will differ.
The problem from Comment 5 still reproduces.
I’ve pushed a complete fix for es.po, and verified that the unit test now passes for that locale. However, it still fails for the other locales which haven’t been updated yet (fr_FR, el_GR, hr_HR, lt_LT, ru_RU).

I’m not going to update those locales myself, since I’m not familiar with the languages (especially those, like el_GR and ru_RU, where there *are* nominative/genitive differences). I’ll e-mail the gnome-i18n list shortly and ask people to update the translations. For the moment, I’ll leave the test failing and this bug open, to remind us about what’s left to do.

If we get towards hard code freeze and the test is still failing, I’ll copy the translations across for the remaining locales, and users of that locale will have to live with potentially-incorrect nominative/genitive differences.

commit 6f16176462a3721ba738f6441d7276202692f13b (HEAD -> master, origin/master, origin/HEAD)
Author: Philip Withnall <withnall@endlessm.com>
Date:   Wed Feb 21 11:32:49 2018 +0000

    po: Fix Spanish abbreviated month names

    The case was wrong, and ‘mayo’ had not been abbreviated to ‘may’. The
    new translated strings here have all been copied from the abbreviated
    month names (without days).

    Signed-off-by: Philip Withnall <withnall@endlessm.com>

    https://bugzilla.gnome.org/show_bug.cgi?id=793645#c5

 po/es.po | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)
Daniel, if you could review my changes to es.po (commit https://git.gnome.org/browse/glib/commit/?id=6f16176462a3721ba738f6441d7276202692f13b) and let me know if anything’s incorrect, that would be great. Otherwise, I think everything is done for Spanish here, thanks.
It's ok, thanks for taking care of this! :-)
(In reply to Philip Withnall from comment #1)
> Daniel, is there any reason the es_ES translations of month names in GLib
> have different capitalisation to what’s in glibc?
>
> https://lh.2xlibre.net/locale/es_ES/#value-56
>
> glibc has them all in lower case (’enero’ rather than ‘Enero’). I thought
> Spanish didn’t have proper nouns like English, so lower case would be more
> correct anyway?
>
> These month names are used when building dates, so won’t necessarily appear
> at the start of a sentence.

CLDR (Unicode Common Locale Data Repository) suggests that, for languages which don't need the nominative/genitive case difference in dates, the %B/%OB feature can be used to provide locale data for %OB starting with uppercase and for %B starting with lowercase. That's because the nominative case (as generated by %OB) is mainly used to list the month names standalone, e.g., in a calendar header. Usually in this case the month name must/should/could start with an uppercase even if normally it starts with a lowercase (in the middle of a sentence, a date, etc.)

There is nothing wrong with providing different capitalisation than glibc, although this must be explained. Maybe glibc is wrong?

- CLDR actually provides lowercase/uppercase for Spanish but AFAIR only for Spanish/Uruguay and Spanish/Peru. Should this be a rule for all variants of Spanish or just those two?
- Do Spanish people want the same in glibc as well? I have patches ready and waiting for a review, I just haven't had enough time to approach every local community and ask for their opinion.
- If the tests fail because I made wrong assumptions about the contents of the locale data then please fix the tests.

(In reply to Philip Withnall from comment #13)
> [...]
> I’m not going to update those locales myself, since I’m not familiar with
> the languages (especially those, like el_GR and ru_RU, where there *are*
> nominative/genitive differences). I’ll e-mail the gnome-i18n list shortly
> and ask people to update the translations.

You can:

- retrieve the data from the latest glibc,
- retrieve them from CLDR,
- ask me (although I'd really rather the native translators provide their fixes instead).
I’ve prepared a branch that updates month name translations: https://gitlab.gnome.org/GNOME/glib/commits/wip/piotrdrag/missing-months I skipped languages that were already missing some month translations, as they were incomplete anyway. I’m sure I made some mistakes during the extensive copy-pasting, so review will be needed. We can mix’n’match, squash and stretch commits as needed, and commit what we want before the final 2.56 release. String freeze has just started, so I expect we will see more translation updates from translators in the coming weeks.
(In reply to Piotr Drąg from comment #17) > I’ve prepared a branch that updates month name translations: > > https://gitlab.gnome.org/GNOME/glib/commits/wip/piotrdrag/missing-months Thanks Piotr. I’ll push that between 2018-03-05 and 2018-03-12 if any languages remain un-updated.
(In reply to Piotr Drąg from comment #17) > I’ve prepared a branch that updates month name translations: > > https://gitlab.gnome.org/GNOME/glib/commits/wip/piotrdrag/missing-months I think that in case of Czech and Slovak language we should ask the language community about their opinion. I asked Czech people and their answers were mixed but most of them said that although the genitive case exists in their language it is wrong to use it in dates (or maybe I misunderstood, maybe they meant it's not wrong to use the nominative case). Czech Wikipedia uses the nominative case in dates. In case of Slovak I did not ask anybody but looking at Wikipedia it seems to follow the same rule. Another language with the same rule is Serbian, they have already provided their translation and use only the nominative case. Scottish Gaelic translation is most probably wrong. According to CLDR, the genitive case of January is "dhen Fhaoilleach" (not "dhen Faoilleach"), February is "dhen Ghearran" (not "dhen Gearran") and so on. But, again, I'd like the Scottish Gaelic people to review this. For two reasons: to proofread and to tell if they really want the genitive case. There are two more languages which *probably* need a genitive case but I will not dare to touch them: Armenian and Farsi (Persian). For these reasons glibc provides the genitive case data for only 7 languages now while about 20 were prepared and ready to commit.
(In reply to Rafal Luzynski from comment #19) > I think that in case of Czech and Slovak language we should ask the language > community about their opinion. I asked Czech people and their answers were > mixed but most of them said that although the genitive case exists in their > language it is wrong to use it in dates (or maybe I misunderstood, maybe > they meant it's not wrong to use the nominative case). Czech Wikipedia uses > the nominative case in dates. In case of Slovak I did not ask anybody but > looking at Wikipedia it seems to follow the same rule. Another language with > the same rule is Serbian, they have already provided their translation and > use only the nominative case. > The Czech team seems to agree with CLDR: https://gitlab.gnome.org/GNOME/glib/commit/4e8a4d0d572fa37866f9e7b486b01acf0849acc0 Serbian, unlike Czech and Slovak, has the same case for both formatting and standalone months in CLDR. > Scottish Gaelic translation is most probably wrong. According to CLDR, the > genitive case of January is "dhen Fhaoilleach" (not "dhen Faoilleach"), > February is "dhen Ghearran" (not "dhen Gearran") and so on. But, again, I'd > like the Scottish Gaelic people to review this. For two reasons: to > proofread and to tell if they really want the genitive case. > I made a mistake and just replaced pronouns. Should be fixed now, and conform to CLDR. > There are two more languages which *probably* need a genitive case but I > will not dare to touch them: Armenian and Farsi (Persian). > Farsi in CLDR doesn’t use different cases. Armenian does, but its GLib translation w.r.t. month names is incomplete anyway. > For these reasons glibc provides the genitive case data for only 7 languages > now while about 20 were prepared and ready to commit. Changing GLib translations is an order of magnitude easier than changing glibc locales :)
(In reply to Philip Withnall from comment #18) > (In reply to Piotr Drąg from comment #17) > > I’ve prepared a branch that updates month name translations: > > > > https://gitlab.gnome.org/GNOME/glib/commits/wip/piotrdrag/missing-months > > Thanks Piotr. I’ll push that between 2018-03-05 and 2018-03-12 if any > languages remain un-updated. I’ve just merged and pushed the branch, and will e-mail the gnome-i18n list about it shortly. Piotr, the tests still fail because the el.po changes on your branch don’t match what Rafal put into the tests for July and August: the tests expect the abbreviated translations to be Ιούλ (%b) and Αύγ (%Ob) respectively. Should we be updating the tests, or el.po, here?
Similarly for hr.po: the tests say the %b translation for November should be ‘Stu’, but hr.po has ‘stu’. And for lt.po, the tests say the %OB translation for April should be ‘balandis’, but lt.po has ‘balandžio’. Similarly for several other of the lt.po translations. Aside from those issues, the tests now pass (including ja.po, bug #793578).
(In reply to Philip Withnall from comment #21) > Piotr, the tests still fail because the el.po changes on your branch don’t > match what Rafal put into the tests for July and August: the tests expect > the abbreviated translations to be Ιούλ (%b) and Αύγ (%Ob) respectively. > > Should we be updating the tests, or el.po, here? Some %Ob translations were wrong, which I fixed: https://gitlab.gnome.org/GNOME/glib/commit/b03c37cf3bec82a0164ab1e0b84adac3cf05c1c1 “16 Ιούλ 2018” in the tests seems wrong to me (“Ιούλ” is %Ob according to CLDR), but I’ll leave that to Rafal and you. (In reply to Philip Withnall from comment #22) > Similarly for hr.po: the tests say the %b translation for November should be > ‘Stu’, but hr.po has ‘stu’. > IMHO the test for %b should be changed. > And for lt.po, the tests say the %OB translation for April should be > ‘balandis’, but lt.po has ‘balandžio’. Similarly for several other of the > lt.po translations. > Lithuanian was done by the Lithuanian translator. :) %OB is “Balandis”, not “balandžio”. The difference from %B is capitalization. I’m adding Aurimas to CC to help us with this one.
In Lithuanian we do not capitalize when in full date context and we have a different grammatical form. In case of April we use:

- 'Balandis' for standalone
- 'balandžio' in full context (2015 balandžio 5).

Do I read the comments correctly? %OB is for standalone, while %B is in full date context? If so, then translations look correct to me.
Yes, but %OB is used also for month + year, e.g. in these three lines: https://gitlab.gnome.org/GNOME/glib/blob/master/glib/tests/date.c#L339 The tests have “2018 m. balandis”, but the current translation gives us “2018 m. Balandis”. glibc, on the other hand, has both in lower case (https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/locales/lt_LT#l210), which is consistent with CLDR (http://www.unicode.org/cldr/charts/31/summary/lt.html#1872).
(In reply to Piotr Drąg from comment #25)
> Yes, but %OB is used also for month + year, e.g. in these three lines:
> https://gitlab.gnome.org/GNOME/glib/blob/master/glib/tests/date.c#L339

I see. Pushed changes with all of those lowercased, plus some more corrections for the short versions.
I believe now these two fail:

  TEST_DATE (17, 7, 2018, "%Y m. %b %e d.", "2018 m. Lie 17 d.");
  TEST_DATE ( 1, 8, 2018, "%Y m. %Ob", "2018 m. Rgp");

should be 'liep.' and 'rugp.' (with dots). Should I push a change for these?
(In reply to Piotr Drąg from comment #23) > (In reply to Philip Withnall from comment #21) > > Piotr, the tests still fail because the el.po changes on your branch don’t > > match what Rafal put into the tests for July and August: the tests expect > > the abbreviated translations to be Ιούλ (%b) and Αύγ (%Ob) respectively. > > > > Should we be updating the tests, or el.po, here? > > Some %Ob translations were wrong, which I fixed: > https://gitlab.gnome.org/GNOME/glib/commit/ > b03c37cf3bec82a0164ab1e0b84adac3cf05c1c1 > > “16 Ιούλ 2018” in the tests seems wrong to me (“Ιούλ” is %Ob according to > CLDR), but I’ll leave that to Rafal and you. The authoritative source is here: http://st.unicode.org/cldr-apps/v#/el/Gregorian/ and indeed, there is a difference between the abbreviated nominative and the abbreviated genitive form. I really don't know why and I don't know why I did not notice it before. Possibly the contents of CLDR has been updated after I imported it to glibc. > > (In reply to Philip Withnall from comment #22) > > Similarly for hr.po: the tests say the %b translation for November should be > > ‘Stu’, but hr.po has ‘stu’. > > > > IMHO the test for %b should be changed. I'd like to hear this from a Croatian translator.
(In reply to Aurimas Černius from comment #27) > I believe now these two fail: > TEST_DATE (17, 7, 2018, "%Y m. %b %e d.", "2018 m. Lie 17 d."); > TEST_DATE ( 1, 8, 2018, "%Y m. %Ob", "2018 m. Rgp"); > > should be 'liep.' and 'rugp.' (with dots). Aurimas, please verify if these resources and commits are correct: http://st.unicode.org/cldr-apps/v#/lt/Gregorian/ https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/locales/lt_LT;hb=HEAD https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=8b406f8 > Should I push a change for these? I suggest not to push changes which are different than glibc. That's because GLib falls back to the underlying libc whenever it is available and whenever it provides the correct data. The translations which you are providing now will be used only on the systems which do not provide the locale data (or provide them incompletely). If you fix the data here in GLib then the tests will pass on some systems and will fail on other systems. If you think that "Lie" and "Rgp" are wrong and should be replaced with "liep." and "rugp." respectively then file a bug report in glibc: https://sourceware.org/bugzilla/enter_bug.cgi?product=glibc and request the change. Or just simply let me know. :) Note that if a change is added to glibc then it is visible everywhere, not just in GNOME.
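The fallback order described here explains why one set of expected strings cannot pass everywhere. A minimal sketch of that precedence follows. It is a model of the behaviour described in this comment, not the actual GLib logic; the helper name choose_month_name is hypothetical, and ‘no libc data’ is represented simply as NULL or an empty string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: prefer the month name supplied by the C library
 * and use the GLib (.po) translation only when libc has nothing.
 * `libc_name` may be NULL or "" to mean "libc provides no data". */
static const char *
choose_month_name (const char *libc_name, const char *po_name)
{
  assert (po_name != NULL);

  if (libc_name != NULL && libc_name[0] != '\0')
    return libc_name;  /* libc data wins whenever it is present */
  return po_name;      /* otherwise fall back to the translation */
}
```

On a system whose glibc still has "Lie", the test would see "Lie" regardless of what lt.po says; on a system without libc locale data it would see the lt.po value. Changing the .po file alone therefore makes the tests pass on some systems and fail on others.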
(In reply to Rafal Luzynski from comment #29) > (In reply to Aurimas Černius from comment #27) > Aurimas, please verify if these resources and commits are correct: > > http://st.unicode.org/cldr-apps/v#/lt/Gregorian/ > > https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/locales/lt_LT; > hb=HEAD > > https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=8b406f8 The second one uses the 3-letter abbreviations, which are OK-ish, but the ones from CLDR ('saus.', 'vas.', ...) are actually the proper ones. > If you think that "Lie" and "Rgp" are wrong and should be replaced with > "liep." and "rugp." respectively then file a bug report in glibc: > > https://sourceware.org/bugzilla/enter_bug.cgi?product=glibc > > and request the change. Or just simply let me know. :) Note that if a change > is added to glibc then it is visible everywhere, not just in GNOME. Yup, doing such change would be nice, if possible.
Filed upstream: https://sourceware.org/bugzilla/show_bug.cgi?id=22932 (Lithuanian)
Summarizing again: (In reply to Rafal Luzynski from comment #28) > (In reply to Piotr Drąg from comment #23) > > [...] > > Some %Ob translations were wrong, which I fixed: > > https://gitlab.gnome.org/GNOME/glib/commit/ > > b03c37cf3bec82a0164ab1e0b84adac3cf05c1c1 > > > > “16 Ιούλ 2018” in the tests seems wrong to me (“Ιούλ” is %Ob according to > > CLDR), but I’ll leave that to Rafal and you. > > The authoritative source is here: > http://st.unicode.org/cldr-apps/v#/el/Gregorian/ and indeed, there is a > difference between the abbreviated nominative and the abbreviated genitive > form. I really don't know why and I don't know why I did not notice it > before. Possibly the contents of CLDR has been updated after I imported it > to glibc. Filed against glibc, consulted with a native speaker and I will prepare a patch shortly: https://sourceware.org/bugzilla/show_bug.cgi?id=22937 > > (In reply to Philip Withnall from comment #22) > > > Similarly for hr.po: the tests say the %b translation for November should be > > > ‘Stu’, but hr.po has ‘stu’. > > > > > > > IMHO the test for %b should be changed. > > I'd like to hear this from a Croatian translator. I have consulted this with a Croatian translator and indeed, they basically need all month and weekday names to start with a lowercase. But this is what glibc currently provides. What needs to be fixed is a GLib translation. However, as translators may have good reason to start the month names with an uppercase I'd like to give them more freedom instead: relax the test criteria and compare the month names case insensitively. Thank you, Piotr. Please leave your recent changes to *.po files, glibc will be adapted to them. :-) Also the tests will be updated.
Created attachment 369594
Update month names (Greek and Lithuanian)

This patch updates the month names according to the problems spotted by Philip Withnall in comment 21 (Greek) and by Aurimas Černius in comment 27 (Lithuanian). The glibc locale data still contain the old values so the tests may fail with glibc 2.27 but I hope to push the changes this week.
Created attachment 369595
Compare the month names case-insensitively

This patch directly addresses the problems spotted by Tomasz Miąsko in comment 0 and in comment 5 and by Philip Withnall in comment 22.

I wish those two patches had landed in the 2.56 branch, which has just been released.
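The idea of comparing month names case-insensitively can be sketched as below. This is not the attached patch, whose contents are not shown in this bug; the helper name month_names_equal_nocase is hypothetical and the comparison only folds ASCII letters (in the default "C" locale), so it covers cases like "Enero"/"enero" and "Stu"/"stu" but not Greek. Real code would need Unicode-aware case folding, for example GLib's g_utf8_casefold().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if `a` and `b` are equal when ASCII
 * letter case is ignored, 0 otherwise.  Bytes outside ASCII are
 * compared as-is in the "C" locale. */
static int
month_names_equal_nocase (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);

  while (*a != '\0' && *b != '\0')
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }

  /* Equal only if both strings ended at the same position. */
  return *a == *b;
}
```

A relaxed check like this lets translators choose the capitalisation while still catching real mistakes such as ‘mbr’ for ‘abr’.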
Review of attachment 369594: OK.
Review of attachment 369595: Fine.
I’ve pushed those two patches to master, so they should be in the 2.56.1 release. I can’t quite get the tests working (still having some problems with lt.po), but the translations should match what’s in the tests, so I’m not quite sure where the problem is coming from, and am willing to put it down to a stale file somewhere on my machine. Rafal, is there anything more you think needs doing here?
(In reply to Philip Withnall from comment #37) > I’ve pushed those two patches to master, so they should be in the 2.56.1 > release. Thank you. > I can’t quite get the tests working (still having some problems > with lt.po), but the translations should match what’s in the tests, so I’m > not quite sure where the problem is coming from, and am willing to put it > down to a stale file somewhere on my machine. I guess the lt_LT test now says that it should be "rugp." but the result is "Rgp", is that true? The failure may depend on the version of glibc you are using and whether you are on Linux or Windows or another system. The translations from lt.po are used only if libc does not provide its own translations. So probably your system needs a fix of glibc which I have mentioned in comment 31 but it is not yet fixed. > Rafal, is there anything more you think needs doing here? Nothing more can be done here so you can either close it now or (my suggestion) keep it open as a tracker for glibc bugs mentioned here.
Upstream glibc bugs have been fixed, including those fixing lt_LT month names [1] and adding abbreviated nominative/genitive month names to el_* [2], both in master (future 2.28) and in the stable 2.27 branch. Distributions are free to apply the upstream patches or wait for a 2.27.1 release, if there is one. There is nothing more we can do here. Happy testing and happy using!

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=22932
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=22937