GNOME Bugzilla – Bug 619899
Use normal gettext or intltool toolchain also for scm files
Last modified: 2018-06-29 22:39:54 UTC
xgettext.scm should also collect comments for translators. While in other languages the translator comments work fine, in Scheme they are ignored. See e.g. taxtxf.scm, which has a bunch of comments, but none of them arrives in the .po files.

intltool/NEWS says:
Version 0.32
:
* Fix Scheme string extraction, add support for translators' comments
  -- bug #137029 (Danilo Segan)

Eventually we should test that?
xgettext --add-comments works fine for me on .scm files, so I would suggest replacing xgettext.scm with it. The question is: should we check for intltool version >= 0.32? Could somebody help me here, as I am not familiar with all these automagic things?
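To make the suggestion above concrete, here is a minimal sketch of how such a call could be assembled. The marker names (_ and N_), the output file name and the helper build_xgettext_command are assumptions for illustration and are not taken from the GnuCash build files; only the use of --add-comments on .scm input comes from the comment above.

```python
import os
import shutil
import subprocess


def build_xgettext_command(scm_files, output="messages.po"):
    """Assemble an xgettext call that keeps translator comments (sketch)."""
    options = [
        "xgettext",
        "--language=Scheme",   # parse the inputs as Scheme
        "--add-comments",      # copy comments that precede a marked string
        "--keyword=_",         # assumed translation markers
        "--keyword=N_",
        "--output=" + output,  # assumed output file name
    ]
    return options + list(scm_files)


scm_files = ["taxtxf.scm"]
command = build_xgettext_command(scm_files)
print(" ".join(command))

# Run the tool only if it is installed and every input file exists.
if shutil.which("xgettext") and all(os.path.exists(f) for f in scm_files):
    subprocess.run(command, check=True)
```

Running the printed command by hand on a single .scm file and inspecting the resulting .po file for "#." lines is probably the quickest way to confirm that the comments survive.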
Errr... which one is your request? Replacing our hand-written intl-scm/xgettext.scm tool with the "normal" gettext or intltool toolchain? Or extending our hand-written tool?

As for replacing: We have never been able to extract translation comments from scm files, so this is not and never has been the issue here. Instead, there was some issue that xgettext and intltool didn't extract all that we wanted from the scm files, and because of this we stuck to our hand-written extraction tool. I checked some time ago (probably when someone from the i18n-gnome team asked on gnucash-devel) whether intltool could extract all that we wanted from scm, and at that time it did *not* fulfil our requirements, but I forgot the details (maybe it lost the reference to the source code file?). That was probably 1-2 years ago. This might have changed, though. If you can confirm that intltool extracts all current strings from scm and still has the reference to the source code file, I would completely agree to a switch-over. The source code *line* and the translator comments would both be nice add-ons. We currently have neither in gnucash for the scm files.

As for extending our hand-written tool: That's very difficult. Due to the nature of the Scheme programming language, it is rather easy to write the extraction of the string literals, but it is very hard to additionally read all comments. I wouldn't recommend anyone trying this.
My research result:

1. In the makefile,

- SCMFILES = $(shell find ${abs_top_srcdir}/src -name test -prune -o -name '*.scm' -print )
+ SCMFILES = $(shell find ${abs_top_srcdir}/src -type l -prune -o -name test -prune -o -name '*.scm' -print )

would avoid doubled entries caused by symlinked directories ./gnucash[/report/..]/..

2. Running make with a few adjustments to produce a po file comparable with guile-strings.c:

SCMFILES = $(shell find ../src -name test -prune -o -name '*.scm' -print )
:
guile-strings.c: $(SCMFILES)
:
&& $(XGETTEXT) --output-dir=$(abs_builddir) --add-comments --keyword=Q_ --keyword=N_ --keyword=_ --flag=_:1:pass-scheme-format $(SCMFILES)

These options are a combination of intltool's example, the man page and our old settings.

I ran make pot, which aborted after producing intl-scm/messages.po. Then I compared guile-strings.c with intl-scm/messages.po:

> grep -c "/* src" intl-scm/guile-strings.c
4899
> grep -c "#: ../src/" 'intl-scm/messages.po'
4851

Δ = 48, but there are two simple explanations:

a. In long lists xgettext writes 2 references per line:

#: ../src/app-utils/prefs.scm:68 ../src/app-utils/prefs.scm:76
#: ../src/app-utils/prefs.scm:69 ../src/app-utils/prefs.scm:87
#: ../src/app-utils/prefs.scm:95 ../src/app-utils/prefs.scm:96
#: ../src/app-utils/prefs.scm:70 ../src/app-utils/prefs.scm:78
#: ../src/app-utils/prefs.scm:79 ../src/app-utils/prefs.scm:86
#: ../src/app-utils/prefs.scm:71 ../src/app-utils/prefs.scm:72
#: ../src/app-utils/prefs.scm:74 ../src/app-utils/prefs.scm:85
#: ../src/app-utils/prefs.scm:88 ../src/app-utils/prefs.scm:89
#: ../src/scm/price-quotes.scm:628 ../src/scm/price-quotes.scm:629
#: ../src/scm/price-quotes.scm:633 ../src/scm/price-quotes.scm:634
#: ../src/scm/price-quotes.scm:639 ../src/scm/price-quotes.scm:641
#: ../src/scm/price-quotes.scm:647 ../src/scm/price-quotes.scm:648
#: ../src/scm/price-quotes.scm:653 ../src/scm/price-quotes.scm:654
#: ../src/scm/price-quotes.scm:657 ../src/scm/price-quotes.scm:660
#: ../src/scm/price-quotes.scm:670 ../src/scm/price-quotes.scm:681
#: ../src/scm/price-quotes.scm:675 ../src/scm/gnucash/price-quotes.scm:675
#: ../src/scm/price-quotes.scm:689 ../src/scm/gnucash/price-quotes.scm:670
#: ../src/scm/price-quotes.scm:694 ../src/scm/gnucash/price-quotes.scm:694
#: ../src/scm/price-quotes.scm:709 ../src/scm/price-quotes.scm:718
#: ../src/scm/price-quotes.scm:714 ../src/scm/gnucash/price-quotes.scm:714
#: ../src/scm/price-quotes.scm:723 ../src/scm/gnucash/price-quotes.scm:723
#: ../src/tax/us/de_DE.scm:27 ../src/tax/us/gnucash/tax/de_DE.scm:27
#: ../src/tax/us/de_DE.scm:28 ../src/tax/us/gnucash/tax/de_DE.scm:28
#: ../src/tax/us/txf.scm:75 ../src/tax/us/txf-de_DE.scm:75
-- 8 + 16 = 24

b. Found differences: duplicate entries on one line. For these, xgettext makes 1 entry where xgettext.scm makes 2.

#: ../src/report/locale-specific/us/[gnucash/report/]taxtxf.scm:152
#: ../src/report/locale-specific/us/[gnucash/report/]taxtxf-de_DE.scm:146
146: (list 'last-year (N_ "Last Year") (N_ "Last Year")))
-- 2*2 = 4

#: ../src/report/report-system/options-utilities.scm
50: (list (vector 'DayDelta (N_ "Day") (N_ "Day"))
51: (vector 'WeekDelta (N_ "Week") (N_ "Week"))
53: (vector 'MonthDelta (N_ "Month") (N_ "Month"))
54: (vector 'QuarterDelta (N_ "Quarter") (N_ "Quarter"))
55: (vector 'HalfYearDelta (N_ "Half Year") (N_ "Half Year"))
56: (vector 'YearDelta (N_ "Year") (N_ "Year"))
220: (vector 'circle (N_ "Circle") (N_ "Circle"))
221: (vector 'cross (N_ "Cross") (N_ "Cross"))
222: (vector 'square (N_ "Square") (N_ "Square"))
223: (vector 'asterisk (N_ "Asterisk") (N_ "Asterisk"))
#: ../src/report/standard-reports/gnucash/[report/standard-reports/]transaction.scm
741: (vector 'none (N_ "None") (N_ "None"))
742: (vector 'weekly (N_ "Weekly") (N_ "Weekly"))
743: (vector 'monthly (N_ "Monthly") (N_ "Monthly"))
744: (vector 'quarterly (N_ "Quarterly") (N_ "Quarterly"))
745: (vector 'yearly (N_ "Yearly") (N_ "Yearly")))))
the same
-- 10 + 2*5 = 20

Total: 24 + 4 + 20 = 48

So the difference is resolved. All strings are there, and the information content is higher because we get line numbers and comments.

So I would suggest putting

$(XGETTEXT) --join-existing --add-comments --keyword=Q_ --keyword=N_ --keyword=_ --flag=_:1:pass-scheme-format $(SCMFILES)

after the xgettext call for the C files, with SCMFILES defined as

SCMFILES = $(shell find ${abs_top_srcdir}/src -type l -prune -o -name test -prune -o -name '*.scm' -print )

Perhaps there is a more adequate expression for ${abs_top_srcdir}/src?

Deactivate xgettext.scm and remove from POTFILES.in:
- intl-scm/guile-strings.c
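The comparison above counts matching lines with grep -c, while a single "#:" line may carry two references. A minimal sketch of the difference between counting lines and counting references follows; the helper names and the three-line sample are made up for illustration, and only the numbers in the final check come from the comment above.

```python
import re

# One "file:line" reference inside a "#:" comment line of a .po file.
REFERENCE = re.compile(r"\S+:\d+")


def count_reference_lines(po_text):
    """Count lines starting with '#:' (roughly what grep -c reports)."""
    return sum(1 for line in po_text.splitlines() if line.startswith("#:"))


def count_references(po_text):
    """Count individual file:line references across all '#:' lines."""
    return sum(
        len(REFERENCE.findall(line[2:]))
        for line in po_text.splitlines()
        if line.startswith("#:")
    )


sample = """\
#: ../src/app-utils/prefs.scm:68 ../src/app-utils/prefs.scm:76
#: ../src/tax/us/txf.scm:75 ../src/tax/us/txf-de_DE.scm:75
#: ../src/scm/price-quotes.scm:628
msgid "example"
"""

print(count_reference_lines(sample))  # 3
print(count_references(sample))       # 5

# Reconciliation of the reported gap, using the figures from the comment.
double_reference_lines = 8 + 16            # case a
same_line_duplicates = 2 * 2 + 10 + 2 * 5  # case b
assert 4899 - 4851 == double_reference_lines + same_line_duplicates == 48
```

Counting references instead of lines removes case a from the comparison; case b remains, because there the two tools genuinely emit a different number of entries.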
Frank, I can't follow everything you say in detail, but do I understand you correctly that you believe a more recent intltool, together with some fixes to the makefiles, allows us to extract all translatable strings from both C files and SCM files using the standard xgettext tool? And hence we can remove our custom xgettext.scm?

If that is what you say, I would certainly encourage you to do so in trunk, or at least create a patch I can apply locally to test this. If at all possible I'm very much in favour of obsoleting the custom script. If I understand correctly, this script prevents GnuCash translations from being handled via http://l10n.gnome.org/module/gnucash/
Created attachment 188813
Basic changes required to use normal gettext for scm files as well.

This request continued to intrigue me, so I have played around a bit with your results.

I have attached a minimal patch for the GnuCash source. This patch will add all scm files to POTFILES.in and remove guile-strings.c from the same file. In theory that should be sufficient to let gettext handle the scm files as well.

Unfortunately it doesn't work due to a bug in intltool-update :( My version of intltool (0.41.1-1.fc14.noarch) seems to think scm files are some XML dialect. This is wrong: scm files are supported directly by xgettext.

But the good news is: if I use a corrected version of intltool-update, everything works fine. I have compared the output of make pot when using the old xgettext-scm based flow with the intltool based flow, and the only differences are in the file comments.
* old style has a lot of plain comments (#.) to indicate the original file, but in new style these are written as proper file name comments (#:)
* new style includes more context comments from scm files
* a number of "#, c-format" comments are removed, but that's to be expected: guile-strings.c was interpreted as a C file, while the scm files are not.

My patch still needs some more cleanup work, like removing all the code related to guile-strings.c, but we're on the right track here.

As for the problem with intltool-update, here's what I had to change:

$ diff -u /usr/bin/intltool-update my-intltool-update
--- /usr/bin/intltool-update 2010-03-29 02:52:06.000000000 +0200
+++ my-intltool-update 2011-05-28 20:23:08.000000000 +0200
@@ -64,7 +64,6 @@
 "ui|". # Bonobo specific - User Interface desc. files
 "lang|". # ?
 "glade2?(?:\\.in)*|". # Glade specific - User Interface desc. files (Note: .in is not required)
-"scm(?:\\.in)*|". # ? (Note: .in is not required)
 "oaf(?:\\.in)+|". # DEPRECATED: Replaces by Bonobo .server files
 "etspec|". # ?
 "server(?:\\.in)+|". # Bonobo specific
@@ -88,7 +87,7 @@
 "tlk(?:\\.in)+"; # Bioware Aurora Talk Table Format

 my $buildin_gettext_support =
-"c|y|cs|cc|cpp|c\\+\\+|h|hh|gob|py(?:\\.in)*";
+"c|y|cs|cc|cpp|c\\+\\+|h|hh|gob|py|scm(?:\\.in)*";

 ## Always flush buffer when printing
 $| = 1;

I'll file a bug report on intltool later.
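The effect of the diff above can be illustrated with a small sketch. It assumes that intltool-update decides how to treat a file by matching its extension against these alternations, anchored as \.(...)$; that anchoring, the shortened XML list and the classify helper are assumptions for illustration, not the script's real code.

```python
import re

# Shortened copies of the alternations from the diff (Perl "\\." becomes "\.").
XML_SUPPORT_OLD = r"ui|lang|glade2?(?:\.in)*|scm(?:\.in)*|oaf(?:\.in)+"
XML_SUPPORT_NEW = r"ui|lang|glade2?(?:\.in)*|oaf(?:\.in)+"
GETTEXT_OLD = r"c|y|cs|cc|cpp|c\+\+|h|hh|gob|py(?:\.in)*"
GETTEXT_NEW = r"c|y|cs|cc|cpp|c\+\+|h|hh|gob|py|scm(?:\.in)*"


def classify(filename, xml_support, gettext_support):
    """Return the extraction path a file would take (hypothetical helper)."""
    # Assumption: the extension is matched as \.(<alternation>)$
    if re.search(r"\.(?:%s)$" % gettext_support, filename):
        return "xgettext"
    if re.search(r"\.(?:%s)$" % xml_support, filename):
        return "intltool-extract"
    return "unsupported"


for name in ("src/report/taxtxf.scm", "src/engine/Account.c"):
    before = classify(name, XML_SUPPORT_OLD, GETTEXT_OLD)
    after = classify(name, XML_SUPPORT_NEW, GETTEXT_NEW)
    print(name, before, "->", after)
```

Under the same anchoring assumption, the trailing (?:\.in)* in the patched pattern binds only to scm, so a name such as foo.py.in would match the old built-in list but not the new one. Whether that matters depends on how the real script applies the pattern.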
Oh, and note that I didn't have to change any xgettext parameters. That is all handled properly by intltool-update (except for the bug of course).
Ok, I have created a bug for intltool on Launchpad: https://bugs.launchpad.net/intltool/+bug/790574

My request there is to alter intltool-update to use xgettext directly instead of intltool-extract when parsing scm files. xgettext supports it, and there are a host of disadvantages to intltool-extract:
- "N_ " isn't recognized, "N_" is though. intltool-extract is very sensitive to whitespace. Note that GnuCash mostly uses the form with a whitespace, so most strings wouldn't be detected currently.
- use of intltool-extract loses all source references to the original string location in the original source file. So it's difficult to check a string in its original context.

While we wait for intltool to release a new version, is there a way we can keep an internal copy of a patched intltool-update file to use? If so, we can already proceed to eliminate xgettext-scm now instead of having to wait for intltool to fix this bug and for the fix to trickle down into most distributions.
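The whitespace sensitivity described above can be pictured with two regular expressions. Both patterns are hypothetical and are not intltool-extract's actual code; they only show how a matcher that requires the quote directly after N_ misses the spaced form that GnuCash mostly uses.

```python
import re

# Hypothetical patterns for illustration only.
STRICT = re.compile(r'\(N_"((?:[^"\\]|\\.)*)"')       # quote must follow N_ directly
TOLERANT = re.compile(r'\(N_\s*"((?:[^"\\]|\\.)*)"')  # optional whitespace allowed

SAMPLE = '(list (N_ "Week") (N_"Month"))'

print(STRICT.findall(SAMPLE))    # ['Month']
print(TOLERANT.findall(SAMPLE))  # ['Week', 'Month']
```

A regex-based matcher of either kind also has no notion of the line it is on unless that is tracked separately, which is one way to picture the second disadvantage in the list.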
(In reply to comment #0)
> xgettext.scm should also collect comments for translators. While in other
> languages the translator comments work fine in scheme they are ignored. See
> e.g. taxtxf.scm, which has a bunch of comments - but none arrives in the .po
> files.
>
> intltool/NEWS says:
> Version 0.32
> :
> * Fix Scheme string extraction, add support for translators' comments
> -- bug #137029 (Danilo Segan)
>
> Eventually we should test that?

Just to clarify: the bug referred to here is related to a fix in intltool-extract. This makes intltool-extract work properly with scm files, provided there's no whitespace between _ and the string. Since GnuCash almost always uses whitespace, the patch doesn't fix things for GnuCash.
Created attachment 188933
More complete patch to use intltool for scm files

This patch is an improved version of my earlier patch. Here's what it does:
* removes the code that generates guile-strings.c, including xgettext-scm and all makefile rules depending on it.
* adds a local copy of intltool-update, patched to use xgettext directly for scm files (just as it does for C files).
* makes sure this local file, gnc-intltool-update, is used instead of the system one (in configure.ac).
* updates the POTFILES.in file, so it contains proper references to the string locations in scm files.

Please test it and give feedback.
Upstream [1] has applied my patch (together with some additional fixes) for the upcoming 0.42 version. Yay!

Now we will have to wait until this version is available in most distributions before I can apply my patch. This will likely only be in the next unstable series (2.7) :(

[1] https://bugs.launchpad.net/bugs/790574
Apparently the new release happened today, but it became version 0.50 instead of 0.42. Just a matter of time now...
Hi Geert, the first result of intltool 0.50 is Bug 680402. ;-)
Intltool 0.50 is in Debian Testing, and since Stable is on Gtk-2.22, trunk already doesn't build there. Intltool 0.50 is also in Fedora 17, and no doubt in Ubuntu 12.04.

Unless there's some significant distro that has Gtk-2.24 but not Intltool 0.50, I think you can go ahead and require Intltool 0.50 in configure and make your changes. We'll have to leave the 2.24 branch with my work-around for Bug 680402.
F15 (which is my desktop at the moment) has Gtk 2.24 but intltool 0.41.1. F17 was only just released, which means we should wait at least 6 months before adjusting.
(In reply to comment #13)
> Intltool 0.50 is in Debian Testing, and since Stable is on Gtk-2.22, trunk
> already doesn't build there. Intltool 0.50 is also in Fedora 17, and no doubt
> in Ubuntu 12_04.
>
> Unless there's some significant distro that has Gtk-2.24 but not Intltool 0.50,
> I think you can go ahead and require Intltool 0.50 in configure and make your
> changes. We'll have to leave the 2.24 branch with my work-around for Bug
> 680402.

I didn't consider checking Debian stable for gtk. But I agree with Derek here that it's better to wait a couple of months still before we apply the intltool patch. The distros that currently support intltool 0.50 are still very recent; we can't expect everybody to have upgraded already. My own main desktop is still at Fedora 16 currently, though I have some other machines running Fedora 17 as well. I don't want to scare people away from building trunk by making the requirements too recent.

I do expect that our final 2.6 release can safely depend on intltool 0.50 though, since I don't expect that release in less than half a year.
openSUSE 12.1 still has 0.41.1, but 12.2 (targeted for mid-September) will have 0.50.2.
Forgot gtk:
gtk+-2.24.7 and gtk+-3.2.1 in openSUSE 12.1
gtk+-2.24.10 and gtk+-3.4.4 for openSUSE 12.2
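The availability reports in the comments above boil down to a version comparison against the proposed minimum of intltool 0.50. A minimal sketch, using only the versions mentioned in this thread; the helper names are made up, and this is not how configure performs the check.

```python
def parse_version(text):
    """Turn a dotted version such as '0.41.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


MINIMUM = parse_version("0.50")

# intltool versions as reported in the comments above.
REPORTED = {
    "Fedora 15": "0.41.1",
    "Fedora 17": "0.50",
    "Debian Testing": "0.50",
    "openSUSE 12.1": "0.41.1",
    "openSUSE 12.2": "0.50.2",
}


def meets_minimum(version, minimum=MINIMUM):
    """True when the given version is at least the required minimum."""
    return parse_version(version) >= minimum


for distro, version in REPORTED.items():
    status = "ok" if meets_minimum(version) else "too old"
    print(distro, version, status)
```

Comparing integer tuples avoids the trap of comparing the strings directly, which would break on a hypothetical version such as 0.100.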
Comment on attachment 188933
More complete patch to use intltool for scm files

A revised version of this patch has been pushed to maint.
Will appear in gnucash 2.6.6
GnuCash bug tracking has moved to a new Bugzilla host. This bug has been copied to https://bugs.gnucash.org/show_bug.cgi?id=619899. Please update any external references or bookmarks.