Bug 459509 - MO files not optimised enough for en_GB, en_CA, etc, partially for others
Status: RESOLVED DUPLICATE of bug 421155
Product: intltool
Classification: Deprecated
Component: general
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: intltool maintainers
QA Contact: intltool maintainers
Duplicates: 459508
Depends on:
Blocks:
Reported: 2007-07-23 11:40 UTC by Simos Xenitellis
Modified: 2007-07-25 00:49 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Simos Xenitellis 2007-07-23 11:40:27 UTC
It is common in translation teams, when a message does not require
translation, to simply copy the "msgid" content to "msgstr". For example,

msgid "GConf"
msgstr "GConf"

This is quite common for the locales en_GB, en_CA where the vast
majority of the messages remain the same.

The problem arises from the fact that when compiling those PO files into
MO files (with "msgfmt"), the copied messages are included, contributing
to an increased size of the file on disk, and also in memory when the
application is loaded. Those copied messages could have
been omitted entirely from the MO file, as the running application does not
need them (it can use the message text that is already included in the
executable).

The optimal solution is to fix msgfmt so that it omits those unneeded messages.
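
To illustrate why the copied entries are redundant, here is a minimal Python sketch. It models a catalogue as a plain dict and mimics the gettext fallback of returning the msgid when no translation is found. The helper names strip_identical and lookup, and the sample en_GB entries, are made up for illustration and are not part of gettext.

```python
def strip_identical(catalog):
    """Return a copy of the catalogue without entries whose msgstr equals the msgid."""
    return {msgid: msgstr for msgid, msgstr in catalog.items() if msgid != msgstr}


def lookup(catalog, msgid):
    """Mimic the gettext fallback: use the msgid itself when no translation exists."""
    return catalog.get(msgid, msgid)


# Hypothetical en_GB catalogue: most entries are plain copies of the msgid.
en_gb = {
    "GConf": "GConf",
    "Open": "Open",
    "Color": "Colour",
}

stripped = strip_identical(en_gb)

# Every lookup gives the same text with or without the copied entries.
same_text = all(lookup(stripped, m) == lookup(en_gb, m) for m in en_gb)
print(len(en_gb), "->", len(stripped), "entries; same text:", same_text)
```

Only the entry that actually differs survives, yet every lookup still produces the same text.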

For a detailed description of the case see
http://blogs.gnome.org/simos/2007/07/23/important-mo-file-optimisation-for-en_-locales-and-partly-others/

I am not sure whether the gettext developers would be willing to change the default behaviour of msgfmt. If not (so that some extra parameter has to be specified), then the build system should accommodate this parameter.

An alternative solution would be to assign the task to the distributions to optimise the MO files. In any case, this issue influences GNOME as a platform because less memory and disk space will be used, especially for the mobile platforms.

I think it would be good to keep this bug report on GNOME bugzilla to follow the case.
Comment 1 Simos Xenitellis 2007-07-23 11:48:01 UTC
*** Bug 459508 has been marked as a duplicate of this bug. ***
Comment 2 Christian Rose 2007-07-23 12:01:55 UTC
This sounds like an i18n problem rather than an l10n bug.
Reassigning to intltool.
Comment 3 Simos Xenitellis 2007-07-23 13:21:27 UTC
Fixing this issue would also reduce the size of the Locations.xml (gweather) file, which includes all translations for all languages in gnome-applets/po-locations/.
By optimising the translation files, the Locations.xml size can be reduced from 15.2MB down to 7.6MB.

This file is included in all GNOME desktop installations.
Comment 4 Danilo Segan 2007-07-23 17:08:45 UTC
Everybody wants this to solve the Locations.xml problem. That's a wrong solution to the problem. Just use "-m" to generate per-locale Locations.xml, and provide it using language packs (that way, you can get even better memory, disk space and speed improvements).


*** This bug has been marked as a duplicate of 421155 ***
Comment 5 Simos Xenitellis 2007-07-23 18:27:59 UTC
Danilo, this report is not about Locations.xml. As a side-effect it would benefit Locations.xml in an easy way.

It is about optimising MO files so that they occupy less space in distributions and use up less memory. 
Comment 6 Danilo Segan 2007-07-24 23:34:23 UTC
Ok Simos, then sorry. Anyway, embedded developers can easily solve the problem by stripping the messages in the PO files themselves; for that reason, it's definitely not something that belongs in intltool.

As mentioned in the bug report of which I marked this one a duplicate, gettext("something") where "something" is translated to "something" (the same string) in the current gettext implementation is not equivalent to gettext("something") with the proposed changes: the first returns a pointer to a new string "something", while the second returns the pointer to the very string you passed to it.

I am not going to risk introducing bugs in programs because of that. If anyone does an investigation thorough enough to show that it doesn't break any programs, I am sure Bruno would be glad to have that in the GNU gettext msg* tools instead (like the changes you made in your blog).
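
The equal-text-but-different-pointer concern above can be sketched in Python, using object identity ("is") as a rough analogue of C pointer comparison. This is only an analogy under CPython; the lookup helper and both catalogues are hypothetical stand-ins, not gettext itself.

```python
def lookup(catalog, msgid):
    # Stand-in for gettext(): return the catalogue's own string if present,
    # otherwise hand back the caller's object unchanged.
    return catalog.get(msgid, msgid)


# Build two equal but distinct string objects at run time, so that the
# interpreter does not share a single object between them.
caller_msgid = "".join(["some", "thing"])
catalog_copy = "".join(["some", "thing"])

current = {catalog_copy: catalog_copy}  # msgid copied into msgstr
proposed = {}                           # identical entry stripped out

from_current = lookup(current, caller_msgid)
from_proposed = lookup(proposed, caller_msgid)

print("same text:", from_current == from_proposed)
print("current returns caller's object:", from_current is caller_msgid)
print("proposed returns caller's object:", from_proposed is caller_msgid)
```

A program that compares the returned pointer with its argument to decide whether a string was translated would therefore behave differently after the change, even though the text is identical.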
Comment 7 Simos Xenitellis 2007-07-25 00:49:33 UTC
Danilo, I'll pursue this issue at gettext-tools first. Chusslove mentioned some valid concerns, which means that the default behaviour of msgfmt will probably not change, but it makes sense to add support to msgattrib.
As far as I know, there is no other tool that can easily filter out messages where msgid == msgstr.
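
As a rough idea of what such a filter could look like, here is a small Python sketch that works directly on PO text. It is not msgattrib and not a full PO parser: it assumes entries are separated by blank lines, keeps the header and any plural entries untouched, and ignores fuzzy flags. The function names and the sample entries are invented for illustration.

```python
import re


def _field(lines, keyword):
    """Join the quoted pieces of one PO field, including continuation lines."""
    parts, active = [], False
    for line in lines:
        text = line.strip()
        if text.startswith(keyword + " "):
            active = True
            text = text[len(keyword):].strip()
        elif not text.startswith('"'):
            active = False
        if active and len(text) >= 2 and text.endswith('"'):
            parts.append(text[1:-1])
    return "".join(parts)


def drop_identical(po_text):
    """Remove entries whose msgstr is a plain copy of the msgid."""
    kept = []
    for block in re.split(r"\n\s*\n", po_text.strip()):
        lines = block.splitlines()
        msgid = _field(lines, "msgid")
        has_plural = any(l.strip().startswith("msgid_plural") for l in lines)
        # An empty msgid is the header entry and must always be kept.
        if msgid and not has_plural and msgid == _field(lines, "msgstr"):
            continue
        kept.append(block)
    return "\n\n".join(kept) + "\n"


sample = (
    'msgid ""\n'
    'msgstr "Content-Type: text/plain; charset=UTF-8\\n"\n'
    '\n'
    'msgid "GConf"\n'
    'msgstr "GConf"\n'
    '\n'
    'msgid "Color"\n'
    'msgstr "Colour"\n'
)

result = drop_identical(sample)
print(result)
```

The filtered text could then be compiled with msgfmt as usual; whether that is safe for a given program depends on the pointer concern raised in comment 6.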