GNOME Bugzilla – Bug 459509
MO files not optimised enough for en_GB, en_CA, etc, partially for others
Last modified: 2007-07-25 00:49:33 UTC
When a message does not require translation, translation teams commonly copy the "msgid" content into "msgstr". For example:
    msgid "GConf"
    msgstr "GConf"
This is especially common for locales such as en_GB and en_CA, where the vast majority of messages remain the same.
The problem is that when those PO files are compiled into MO files (with "msgfmt"), the copied messages are included, which increases the size of the file on disk and the memory used once the application is loaded. Those copied messages could have been omitted entirely from the MO file, because the running application does not need them (it can use the message text that is already included in the executable). The optimal solution is to fix msgfmt so that it omits those unneeded messages. For a detailed description of the case see http://blogs.gnome.org/simos/2007/07/23/important-mo-file-optimisation-for-en_-locales-and-partly-others/
I am not sure whether the gettext developers would agree to change the default behaviour of msgfmt. If not (so that an extra parameter has to be specified), the build system should accommodate that parameter. An alternative would be to leave it to the distributions to optimise the MO files.
In any case, this issue affects GNOME as a platform, because fixing it would reduce memory and disk usage, especially on mobile platforms. I think it would be good to keep this bug report in GNOME Bugzilla to follow the case.
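The size effect and the fallback behaviour described above can be sketched in Python. This is only an illustration under stated assumptions: it uses Python's standard `gettext` module rather than the C libintl that GNOME applications use, and `build_mo`, `drop_identical`, `load` and `CATALOG` are hypothetical names invented for the example, not part of msgfmt or any gettext tool.

```python
import gettext
import io
import struct

MO_MAGIC = 0x950412DE  # little-endian GNU MO magic number


def build_mo(catalog):
    """Serialise a {msgid: msgstr} dict into GNU MO bytes (no hash table)."""
    pairs = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in catalog.items())
    n = len(pairs)
    ids_table = 7 * 4               # the header is seven 32-bit words
    strs_table = ids_table + 8 * n  # each table entry is (length, offset)
    offset = strs_table + 8 * n     # string data starts after both tables

    id_entries, str_entries, blob = [], [], b""
    for msgid, _ in pairs:
        id_entries.append((len(msgid), offset))
        blob += msgid + b"\0"
        offset += len(msgid) + 1
    for _, msgstr in pairs:
        str_entries.append((len(msgstr), offset))
        blob += msgstr + b"\0"
        offset += len(msgstr) + 1

    out = struct.pack("<7I", MO_MAGIC, 0, n, ids_table, strs_table, 0, 0)
    for length, off in id_entries + str_entries:
        out += struct.pack("<2I", length, off)
    return out + blob


def drop_identical(catalog):
    """Remove entries whose translation equals the original; keep the header ""."""
    return {k: v for k, v in catalog.items() if k == "" or k != v}


def load(mo_bytes):
    """Parse MO bytes with the standard library's GNU MO reader."""
    return gettext.GNUTranslations(io.BytesIO(mo_bytes))


# Hypothetical en_GB-style catalogue: one copied message, one real translation.
CATALOG = {
    "": "Content-Type: text/plain; charset=UTF-8\n",
    "GConf": "GConf",
    "Color": "Colour",
}

if __name__ == "__main__":
    full = build_mo(CATALOG)
    small = build_mo(drop_identical(CATALOG))
    print("bytes:", len(full), "vs", len(small))
    for msg in ("GConf", "Color"):
        print(msg, "->", load(full).gettext(msg), "/", load(small).gettext(msg))
```

Both catalogues yield the same text for every lookup, because a missing entry makes the lookup return the msgid itself. The sketch compares returned text only; it says nothing about whether the returned string is the same object in memory.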
*** Bug 459508 has been marked as a duplicate of this bug. ***
This sounds like an i18n problem rather than an l10n bug. Reassigning to intltool.
Fixing this issue can also reduce the size of the Locations.xml (gweather) file. This file includes the translations for all languages in gnome-applets/po-locations/. By optimising the translation files, the size of Locations.xml can be reduced from 15.2MB to 7.6MB. This file is included in all GNOME desktop installations.
Everybody wants this to solve the Locations.xml problem. That's the wrong solution to the problem. Just use "-m" to generate per-locale Locations.xml, and provide it using language packs (that way, you can get even better memory, disk space and speed improvements).
*** This bug has been marked as a duplicate of 421155 ***
Danilo, this report is not about Locations.xml; Locations.xml would simply benefit from it as a side effect. It is about optimising MO files so that they occupy less space in distributions and use less memory.
Ok Simos, then sorry. Anyway, embedded developers can easily solve the problem by stripping the messages from the PO files themselves, so it's definitely not something that belongs in intltool. As mentioned in the bug report which I marked this one as a duplicate of, gettext("something") where "something" is translated to "something" (same string) in the current gettext implementation is not equivalent to gettext("something") with the proposed changes: the first returns a pointer to a new string "something", while the second returns the pointer to the very string you passed to it. I am not going to risk introducing bugs in programs because of that. If anyone investigates thoroughly enough to show that it doesn't break any programs, I am sure Bruno would be glad to have that in the GNU gettext msg* tools instead (like the changes you made in your blog).
Danilo, I'll pursue this issue at gettext-tools first. Chusslove mentioned some valid concerns, which mean that msgfmt will probably not change its default behaviour, but it makes sense to add support to msgattrib. As far as I know, there is no other tool that can easily filter out messages where msgid == msgstr.
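Until such support exists, the filtering step itself can be sketched as a small script. This is a rough illustration, not an existing gettext-tools feature: `strip_identical` and `SAMPLE_PO` are hypothetical names, and the parser assumes simple entries separated by blank lines. The header (empty msgid) and plural entries are left untouched.

```python
import re

# Matches one double-quoted PO string, keeping backslash escapes as written.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def _field(block, keyword):
    """Return the raw (still escaped) text of `keyword` in a PO entry, or None."""
    value, collecting = None, False
    for line in block.splitlines():
        line = line.strip()
        if line.startswith(keyword + " "):
            value = "".join(_QUOTED.findall(line))
            collecting = True
        elif collecting and line.startswith('"'):
            # Continuation line of a multi-line string.
            value += "".join(_QUOTED.findall(line))
        else:
            collecting = False
    return value


def strip_identical(po_text):
    """Drop entries where msgstr is a plain copy of msgid."""
    kept = []
    for block in re.split(r"\n\s*\n", po_text.strip()):
        msgid = _field(block, "msgid")
        msgstr = _field(block, "msgstr")  # None for plural entries (msgstr[0], ...)
        if msgid and msgid == msgstr:
            continue  # copied message: the program already carries this text
        kept.append(block)
    return "\n\n".join(kept) + "\n"


# Hypothetical en_GB-style input with one copied message.
SAMPLE_PO = r'''msgid ""
msgstr "Content-Type: text/plain; charset=UTF-8\n"

#: gconf.c:1
msgid "GConf"
msgstr "GConf"

msgid "Color"
msgstr "Colour"
'''

if __name__ == "__main__":
    print(strip_identical(SAMPLE_PO), end="")
```

The filtered text can then be passed to msgfmt as usual. Whether dropping these entries is safe for every program is the open question raised in the comments above, so the sketch is best treated as an opt-in step for a distribution or embedded build.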