Bug 568845 - Please support calling gettext() at runtime instead of shipping static translations
Status: RESOLVED OBSOLETE
Product: GConf
Classification: Deprecated
Component: gconf
Version: 2.25.x
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: GConf Maintainers
QA Contact: GConf Maintainers
Depends on:
Blocks:
 
 
Reported: 2009-01-23 14:11 UTC by Martin Pitt
Modified: 2011-02-14 09:02 UTC
See Also:
GNOME target: ---
GNOME version: 2.25/2.26


Attachments
Ubuntu patch (10.01 KB, patch)
2009-01-23 14:15 UTC, Martin Pitt
Corresponding DTD update (596 bytes, patch)
2009-01-26 08:51 UTC, Martin Pitt
patch with the proposed improvements (11.80 KB, patch)
2009-02-15 04:38 UTC, Matthias Clasen
handle non-local sources too (1.32 KB, patch)
2009-02-15 06:17 UTC, Matthias Clasen
include bind_textdomain_codeset calls (13.23 KB, patch)
2009-04-27 04:48 UTC, Matthias Clasen
intltool-merge patch (3.52 KB, patch)
2009-04-27 04:49 UTC, Matthias Clasen
pick GETTEXT_PACKAGE out of Makefile (4.83 KB, patch)
2009-04-27 05:44 UTC, Matthias Clasen
don't clean the gettext_domain attributes for keys not being registered (13.29 KB, patch)
2009-10-07 21:19 UTC, Sebastien Bacher

Description Martin Pitt 2009-01-23 14:11:14 UTC
gconf translations are currently stored three times over: first, the .mo files already have them (primary source); then they are in the .schemas files in /usr/share/gconf/schemas/ (24 MB on my system); and finally they are in the gconf defaults tree (/var/lib/gconf/defaults/%gconf-tree-$LANG.xml, together ~ 90 MB on my system).

This is a pretty drastic redundancy that wastes hard disk space as well as space on distribution CDs. It also imposes a gconfd startup penalty, because gconfd has to read and parse all those translated XML files.

I want to propose support for changing this to use gettext() at runtime:

 * Drop the translations in .schemas and instead put the translation domain into it; this needs to happen in packages' build system (see below).

 * Drop %gconf-tree-<lang>.xml entirely and instead have gconftool copy the translation domain into %gconf-tree.xml.

 * Have libgconf's get_{short,long}_description() functions use g_dgettext() at runtime, using the translation domain given in %gconf-tree.xml.
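
As a rough illustration of the last point (not taken from the attached patch), the lookup could be sketched as follows. The sketch is in Python rather than the C used by libgconf: `gettext.dgettext()` from the standard library stands in for `g_dgettext()`, and the `Schema` class and its field names are hypothetical. A schema without a domain simply returns its stored description.

```python
import gettext
from dataclasses import dataclass
from typing import Optional


@dataclass
class Schema:
    """Hypothetical stand-in for a GConf schema entry."""
    short_desc: str
    long_desc: str
    gettext_domain: Optional[str] = None  # as recorded in %gconf-tree.xml, if any


def translate(schema: Schema, text: str) -> str:
    """Translate a description at lookup time when a domain is known."""
    if schema.gettext_domain and text:
        # Returns `text` unchanged when no catalog for the domain is installed.
        return gettext.dgettext(schema.gettext_domain, text)
    return text


def get_short_desc(schema: Schema) -> str:
    return translate(schema, schema.short_desc)


def get_long_desc(schema: Schema) -> str:
    return translate(schema, schema.long_desc)
```

Routing both getters through one helper keeps the fallback rule in a single place.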

Getting descriptions is a relatively rare operation and not needed at all for startup. If an application needs translations of its own gconf keys, its .mo file should already be loaded, so the performance impact should be negligible.

The only real performance impact will probably be in gconf-editor, which needs lots of translations for many different keys. But first, I don't expect this to actually be noticeable by users, and second, it is well worth paying this price in exchange for the general space and time savings.

This is also tracked at https://launchpad.net/bugs/123025 .
Comment 1 Martin Pitt 2009-01-23 14:15:21 UTC
Created attachment 127095
Ubuntu patch

This is the patch I have just uploaded to Ubuntu's gconf package. I tested it with different packages (nautilus, baobab, gnome-mount), and it works well.

Please note that this does *not* change the default behaviour, it is fully backwards compatible. If translations are shipped in %gconf-tree-$LANG.xml and do not have a gettext domain, everything works as before.

Thus the distribution's package build scripts can control whether or not they want to use static or dynamic translations.
Comment 2 Martin Pitt 2009-01-23 14:17:03 UTC
Example: nautilus with static translations:

/usr/share/gconf/schemas/apps_nautilus_preferences.schemas size:
original: 1,957,761 bytes
with patched cdbs: 43,430 bytes

(Plus a similar amount of savings in /var/lib/gconf/defaults/*.xml)

Size of nautilus-data.deb:
original: 978,728 bytes
with patched cdbs: 505,334 bytes
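
To put the byte counts above into proportion, the relative savings can be worked out with a few lines of Python. This only restates the figures quoted in this comment; the helper name is illustrative.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


# Byte counts quoted above.
schemas_before, schemas_after = 1_957_761, 43_430
deb_before, deb_after = 978_728, 505_334

print(f".schemas file: {reduction_percent(schemas_before, schemas_after):.1f}% smaller")
print(f"nautilus-data.deb: {reduction_percent(deb_before, deb_after):.1f}% smaller")
```

That comes to roughly 97.8% for the .schemas file and 48.4% for the package.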


I checked that the gconf patch does not have a measurable performance impact by measuring the time of GNOME startup (from pressing Enter in gdm until the panel menu works).

cold cache:
 - upstream gconf, per-lang tree: 76.7 s 75.5 s 75.0 s
 - patched gconf, per-lang tree: 77.1 s 74.8 s 74.9 s

hot cache:
 - upstream gconf, per-lang tree: 28.0 s 28.1 s 28.1 s
 - patched gconf, per-lang tree: 28.0 s 28.3 s 28.1 s
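
A small Python sketch that averages the runs listed above and compares the difference of the means with the spread between individual runs. It uses only the numbers from this comment.

```python
from statistics import mean

# Startup times in seconds, as listed above (three runs each).
runs = {
    ("cold", "upstream"): [76.7, 75.5, 75.0],
    ("cold", "patched"): [77.1, 74.8, 74.9],
    ("hot", "upstream"): [28.0, 28.1, 28.1],
    ("hot", "patched"): [28.0, 28.3, 28.1],
}

for cache in ("cold", "hot"):
    upstream = runs[(cache, "upstream")]
    patched = runs[(cache, "patched")]
    diff = mean(patched) - mean(upstream)
    # Largest gap between any two runs of the same configuration.
    spread = max(max(upstream) - min(upstream), max(patched) - min(patched))
    print(f"{cache}: upstream {mean(upstream):.2f} s, patched {mean(patched):.2f} s, "
          f"difference {diff:+.2f} s, run-to-run spread up to {spread:.1f} s")
```

In both cases the difference between the means is smaller than the variation between runs of the same configuration, which is consistent with "no measurable impact".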

In theory, with a completely converted *.schemas tree, startup time should be faster, since gconfd-2 does not have to load and parse all the translated xml files any more.
Comment 3 Martin Pitt 2009-01-23 14:29:29 UTC
As for the build system changes, I discussed them with the intltool maintainers (Danilo Scheran and Rodney Dawes) as well as Vincent Untz, who ported our patches for using gettext() dynamically for .desktop/.server/.directory files to OpenSUSE.

In an ideal world, it would look like this:

intltool-merge would grow a mode to only add the gettext domain at the appropriate place instead of merging actual translations. 
 * If all distros/upstream want to use that, it should just do that by default.
 * If that is disputed, we could keep the current default behaviour and instead provide an option for that alternate mode, like --only-add-domain.
 * Introduce a new environment variable INTLTOOL_EXTRA_ARGS where distros/builders could specify --only-add-domain and thus build all GNOME packages that way wholesale without modifying the GNOME package's build systems.
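
How the last point might work can be sketched like this. It is an illustration in Python only (intltool itself is not written in Python), and both `INTLTOOL_EXTRA_ARGS` and `--only-add-domain` are the names proposed above, not existing intltool features. The positional arguments are placeholders.

```python
import os
import shlex
from typing import List, Mapping


def merge_command(args: List[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Build an intltool-merge command line with builder-supplied options.

    INTLTOOL_EXTRA_ARGS and --only-add-domain are proposals from this
    comment and are assumptions here, not shipped intltool behaviour.
    """
    extra = shlex.split(env.get("INTLTOOL_EXTRA_ARGS", ""))
    return ["intltool-merge", *extra, *args]


# A distro build environment would export the variable once for all packages.
print(merge_command(["po", "foo.schemas.in", "foo.schemas"],
                    {"INTLTOOL_EXTRA_ARGS": "--only-add-domain"}))
```

With the variable unset, the command line is unchanged, so upstream builds would behave as before.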

However, that will take a while, since right now, GNOME tarballs have the nasty habit of shipping their own copy of intltool-merge (and others). Not that I blame you for that, it's just what intltoolize does, but that's the way it is.

Thus, until that gets fixed properly, distros like Ubuntu and OpenSUSE use build scripts to transform .desktop/.schemas/.server etc. files accordingly, i.e. throw out translations and add the gettext domain. For .schemas in particular, I use

  sed -ri "s/^([[:space:]]*)(<locale name=\"C\">)/\1<gettext_domain>$DOMAIN<\/gettext_domain>\n\1\2/; /^[[:space:]]*<locale name=\"[^C]/,/^[[:space:]]*<\/locale>[[:space:]]*\$/ d; /^\$/d; s/<\/schema>\$/&\n/" file.schemas

(Yes, it's a crude hack, but works for most packages, including those not using intltool-merge).
Comment 4 Ray Strode [halfline] 2009-01-23 15:53:19 UTC
This change makes a lot of sense to me (I haven't really looked at the patch, though).

If you're going to update the XML format make sure you update the dtd as well.
Comment 5 Martin Pitt 2009-01-26 08:51:14 UTC
Created attachment 127238
Corresponding DTD update

DTD> Good point. That's the patch.
Comment 6 Michael Meeks 2009-01-31 12:05:04 UTC
Interesting patch; I'm surprised it doesn't improve warm login time quite substantially though :-)
a few comments:

if gettext_domain is going to be common across lots of schemas (?) might be nice to use g_intern_string for it instead of g_strdup [ as it might for 'owner' I guess ] - save a few bytes.

Might it be an idea to share this code: 

+  if (REAL_SCHEMA (schema)->gettext_domain)
+    return g_dgettext(REAL_SCHEMA (schema)->gettext_domain, REAL_SCHEMA (schema)->short_desc);
+  else
+    return REAL_SCHEMA (schema)->short_desc;

with a simple inline static 'schema_translate' method [ I notice one uses g_dgettext and another dgettext directly - presumably not a feature ? ].

Looks really good (to me) otherwise.
Comment 7 Havoc Pennington 2009-02-01 16:11:40 UTC
Looks plausible to me
Comment 8 Martin Pitt 2009-02-01 18:31:55 UTC
> g_intern_string

That's not very well documented, but if it allows the function to avoid creating a zillion copies of the same domain name, it sounds great. I simply didn't know about it. As you say, a lot of schemas usually share the same domain name.

> I notice one uses g_dgettext and another dgettext directly

Whoops, that's indeed a bug. Both should use g_dgettext(), of course.

Comment 9 Michael Meeks 2009-02-04 10:12:17 UTC
> That's not very well documented, but if it allows the function to avoid
> creating a zillion copies of the same domain name, it sounds great.

Sure; you just have to 'know' it's there. Note that an interned string is never freed, so the g_free() calls on it need to be dropped.
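
The effect of interning can be shown with a short Python sketch, where `sys.intern()` plays the role that `g_intern_string()` would play in the C code. The `Schema` class is hypothetical.

```python
import sys


class Schema:
    """Hypothetical schema record that stores only the translation domain."""

    def __init__(self, domain: str) -> None:
        # sys.intern() hands back one shared string object per distinct value.
        self.gettext_domain = sys.intern(domain)


# Build the names at runtime so the two inputs are separate string objects.
first = Schema("".join(["nau", "tilus"]))
second = Schema("".join(["nauti", "lus"]))

# Both schemas now reference the very same object instead of two copies.
print(first.gettext_domain is second.gettext_domain)
```

As in the C case, the shared string is owned by the intern table, so individual schemas must not free it.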

Another thing that would be nice here is to crunch some of the attributes by inheriting them from the parent by default; eg.

               <dir name="gnobots2">
                        <dir name="preferences">
                                <entry name="key11" mtime="1233594282" schema="/schemas/apps/gnobots2/preferences/key11"/>
                                <entry name="key10" mtime="1233594282" schema="/schemas/apps/gnobots2/preferences/key10"/>

seems a little silly - wrt. the schema name; if we could have a default-schema-path="/schemas/apps" on the <dir name="gnobots2"> node - presumably we could infer all the schema paths extremely easily, and avoid having to parse them, and/or store them in memory [ugh!]. - it still takes a chunk of CPU time to parse the whole thing on login.

Ditto mtime, and of course having a duplicate gettext_domain tag on each entry would be rather sad :-)
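
A minimal sketch of how such inheritance could be resolved when reading the tree. The `default-schema-path` attribute is the hypothetical name suggested above, and the resolution rule (explicit `schema` wins, otherwise the inherited prefix plus the directory and entry names) is an assumption of this sketch, not the behaviour of any existing gconf code.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

SAMPLE = """
<gconf>
  <dir name="gnobots2" default-schema-path="/schemas/apps">
    <dir name="preferences">
      <entry name="key11" mtime="1233594282"/>
      <entry name="key10" mtime="1233594282"/>
    </dir>
  </dir>
</gconf>
"""


def schema_paths(node: ET.Element, prefix: Optional[str] = None,
                 found: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Map entry names to schema paths (names are unique in this small sample)."""
    if found is None:
        found = {}
    for child in node:
        name = child.get("name")
        if child.tag == "dir":
            # A dir either sets a new base or extends the inherited prefix.
            base = child.get("default-schema-path", prefix)
            schema_paths(child, f"{base}/{name}" if base is not None else None, found)
        elif child.tag == "entry":
            explicit = child.get("schema")
            if explicit is not None:
                found[name] = explicit            # explicit attribute wins
            elif prefix is not None:
                found[name] = f"{prefix}/{name}"  # inferred from parent dirs
    return found


print(schema_paths(ET.fromstring(SAMPLE)))
```

The same approach would apply to mtime and the gettext domain: store the value once on a parent node and only override it where an entry differs.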

Anyhow - enough un-wanted bad advice from me :-)
Comment 10 Matthias Clasen 2009-02-10 17:18:37 UTC
I agree this looks like the right thing to do. 
It always saddens me to consistently see schemas files show up as the biggest files in all the gnome packages on our live cd.
Comment 11 Michael Meeks 2009-02-10 17:51:50 UTC
Agreed - my concern is only that gconfd-2 also shows up as one of the top CPU burners on login - so the concept of bubbling the domain up the tree as far as possible would be good ;-) - less XML, less parsing pain. I'm currently working on something like that for schema paths and mtime [ with some success ].
Comment 12 Michael Meeks 2009-02-12 17:51:55 UTC
the inherited attributes for mtime / schema are at:
http://bugzilla.gnome.org/show_bug.cgi?id=571449

Martin - what did you do about keys with no default - which previously used to inline all the locales in %gconf-tree.xml ? if we could get rid of those too life would be great :-)
Comment 13 Matthias Clasen 2009-02-15 04:07:58 UTC
Martin, what does your patch do with language-specific defaults ?
I don't see how that is handled.

For an example, see /apps/epiphany/dialogs/preferences_font_language
Comment 14 Matthias Clasen 2009-02-15 04:26:51 UTC
Of course, intltool does a hackjob of them in the first place, by simply using the value as a msgid:

#: ../data/epiphany.schemas.in.h:65
msgid "x-western"
msgstr "ar"


Now imagine you have several language-dependent boolean keys...
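
To make the problem concrete, here is a small Python sketch with made-up key names. It only models the fact that a catalog maps each distinct msgid to one msgstr; it does not use intltool.

```python
# Hypothetical defaults for three keys in one schemas file. The default
# *value* itself serves as the msgid, as in the example above.
defaults = {
    "/apps/example/use_rtl_layout": "false",
    "/apps/example/show_lunar_calendar": "false",
    "/apps/example/font_language": "x-western",
}

# Each distinct msgid appears only once in a catalog.
msgids = sorted(set(defaults.values()))
print(msgids)

# Both boolean keys share the msgid "false", so a translation cannot flip
# one of them for a locale without flipping the other.
catalog_ar = {"x-western": "ar", "false": "true"}
localized = {key: catalog_ar.get(value, value) for key, value in defaults.items()}
print(localized)
```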

I guess that is the reason why epiphany-pango.schemas is not intltool-munged...

Anyway, your sed command above will simply throw those language-dependent defaults away. So, one answer might be: if your schema contains language-dependent defaults, don't use this hack.
Comment 15 Matthias Clasen 2009-02-15 04:38:02 UTC
Created attachment 128749
patch with the proposed improvements

This patch includes the dtd change and has the improvements that Michael proposed.
Comment 16 Matthias Clasen 2009-02-15 06:17:57 UTC
Created attachment 128756
handle non-local sources too

Another important omission in the patch is that it doesn't even attempt to transfer the gettext domain via CORBA when we are getting the schema from the server.

This patch fills that gap.
Comment 17 Matthias Clasen 2009-02-16 07:02:52 UTC
Some more issues when playing with this patch:

LANG=de_DE gconf-editor 

--> segfault 

because you need to call bind_textdomain_codeset for all the domains you are using
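
The thread does not spell out the exact crash path, but the charset mismatch that `bind_textdomain_codeset()` avoids can be sketched in Python. Without a bound codeset, gettext returns messages in the locale's charset; the sketch assumes that a plain `de_DE` locale uses ISO-8859-1, and the sample string is made up.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True when `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


translated = "Schriftgröße"

# What a caller would receive in the locale charset (assumed ISO-8859-1).
as_locale_charset = translated.encode("iso-8859-1")
# What it would receive once the domain's codeset is bound to UTF-8.
as_utf8 = translated.encode("utf-8")

print(is_valid_utf8(as_locale_charset), is_valid_utf8(as_utf8))
```

Code that expects UTF-8, as GLib and GTK+ do, can only rely on the second form.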


LANG=de_DE.utf8 gconf-editor

--> works for short descriptions, but not long descriptions

because long descriptions come out of gconf with mangled whitespace for some reason
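
Since gettext looks messages up by exact string match, any whitespace difference between the stored description and the extracted msgid means the lookup misses. A Python sketch with a made-up catalog entry follows; the `squeeze` normalisation is illustrative only and not what intltool or gconf actually do.

```python
import re

# Hypothetical catalog entry whose msgid kept its original line break
# and indentation when it was extracted.
catalog = {
    "If set to true, hidden files\n        are shown in the file manager.":
        "Falls wahr, werden verborgene Dateien im Dateimanager angezeigt.",
}


def lookup(msgid: str) -> str:
    """Exact-match lookup that falls back to the msgid, like gettext."""
    return catalog.get(msgid, msgid)


def squeeze(text: str) -> str:
    """Collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


stored = "If set to true, hidden files are shown in the file manager."
print(lookup(stored) == stored)  # the lookup misses and returns the input

# Normalising both sides the same way makes the keys line up again.
normalised = {squeeze(msgid): msgstr for msgid, msgstr in catalog.items()}
print(normalised.get(squeeze(stored), stored))
```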

Comment 18 Michael Meeks 2009-02-16 10:22:34 UTC
Wow - nice to see the fetish of utf-8 validation alive and well ;-)

Reading the code, I couldn't see why in particular we wanted to in-line the language-specific defaults in the main file - indeed, I can certainly see why we do not want to; ~30% of the file is those nasty keys - and of course, their non-ascii-ness makes them harder than the extra 30% of CPU we would expect to parse them (I suspect).

Is there really a fundamental reason why we can't move these lang-specific defaults into the relevant PO file ? it would rock if we could do that: or do applications want to be able to query the whole set of per-lang defaults in some way that would be ultimately self-defeating [ and if so why ? ;-].

Having said all this, the regression tests in HEAD are currently pretty lame at catching this sort of corner case; it'd be great to see some that cover this - whether we get 571449 in or not, it'd be nice to see the XML regression test pieces hooked up from there [ and preferably the old libxml code finally removed at the same time - it won't match the new DTD anyway ;-].
Comment 19 Martin Pitt 2009-02-16 11:24:46 UTC
Michael Meeks:
> Martin - what did you do about keys with no default - which previously used to
> inline all the locales in %gconf-tree.xml ? if we could get rid of those too
> life would be great :-)

I admit that I wasn't even aware of this case. Do you happen to have an example gconf key where this is the case? Why are they significantly different? If they are in the .schema but just lack a <default> tag, the i18n'ing should still work as before.
Comment 20 Martin Pitt 2009-02-16 11:40:51 UTC
Matthias Clasen:
> Martin, what does your patch do with language-specific defaults ?
> I don't see how that is handled.
> For an example, see /apps/epiphany/dialogs/preferences_font_language

The patch itself doesn't do anything with it; if the <locale> tag is still there, it will just use it:

$ gconftool -g /apps/epiphany/dialogs/preferences_font_language
x-western
$ LANG=ar_AE.UTF-8 gconftool -g /apps/epiphany/dialogs/preferences_font_language
ar

This is both with the original (fully l10n'ized) .schema, as well as with the reduced .schema which looks like

        <locale name="ar">
          <default>ar</default>
        </locale>

(i.e. with <short> and <long> removed).

That's mostly because the default value lands in %gconf-tree.xml, not in %gconf-tree-$LANG.xml:

                                        <entry name="preferences_font_language" mtime="1234784232" type="schema" stype="string" owner="epiphany">
                                                <local_schema locale="ar">
                                                        <default type="string">
                                                                <stringvalue>ar</stringvalue>
                                                        </default>
                                                </local_schema>
                                                <local_schema locale="C" short_desc="The currently selected fonts language">
                                                        <default type="string">
                                                                <stringvalue>x-western</stringvalue>
                                                        </default>
[...]


However, of course I should enhance that seddery in our build system to not throw those out for now, i.e. move from "drop <locale> entirely" to:

 * drop <short> and <long> from <locale>
 * If there's anything left, leave it alone
 * If <locale> became empty, drop it entirely
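
The three rules above could be sketched with an XML parser instead of sed. This is an illustration in Python, not the script used in any distribution; the sample schema, the domain name `example-app`, and the function name are made up, and the `<gettext_domain>` element is placed right before `<locale name="C">`, as the sed command earlier in this bug does.

```python
import xml.etree.ElementTree as ET


def reduce_schema(schema: ET.Element, domain: str) -> None:
    """Apply the three rules above to one <schema> element, in place."""
    locales = schema.findall("locale")
    c_locale = next((loc for loc in locales if loc.get("name") == "C"), None)
    if c_locale is not None and schema.find("gettext_domain") is None:
        marker = ET.Element("gettext_domain")
        marker.text = domain
        schema.insert(list(schema).index(c_locale), marker)
    for loc in locales:
        if loc.get("name") == "C":
            continue  # the untranslated strings double as msgids
        for child in list(loc):
            if child.tag in ("short", "long"):
                loc.remove(child)  # rule 1: drop translated descriptions
        if len(loc) == 0:
            schema.remove(loc)     # rule 3: nothing else was in there
        # rule 2: anything left (e.g. <default>) stays untouched


SAMPLE = """<schema>
  <key>/schemas/apps/example/font_language</key>
  <locale name="C">
    <default>x-western</default>
    <short>Font language</short>
  </locale>
  <locale name="ar">
    <default>ar</default>
    <short>translated text</short>
  </locale>
  <locale name="de">
    <short>translated text</short>
    <long>translated text</long>
  </locale>
</schema>"""

schema = ET.fromstring(SAMPLE)
reduce_schema(schema, "example-app")
print(ET.tostring(schema, encoding="unicode"))
```

In this sample the "ar" locale keeps its language-specific default, while the "de" locale, which held only descriptions, disappears.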

Thanks for pointing this out! 
Comment 21 Matthias Clasen 2009-02-23 05:51:48 UTC
> Is there really a fundamental reason why we can't move these lang-specific
> defaults into the relevant PO file ? 

Default values are definitely handled on the server side, so if we move them to .po files, we'll have gconfd open tons of .mo files.
Comment 22 Matthias Clasen 2009-04-27 04:48:02 UTC
Created attachment 133388
include bind_textdomain_codeset calls
Comment 23 Matthias Clasen 2009-04-27 04:49:05 UTC
Created attachment 133389
intltool-merge patch
Comment 24 Matthias Clasen 2009-04-27 04:53:02 UTC
The last patch changes intltool-merge to accept a --schemas-domain=FOO argument, which modifies the way in which schemas files are merged:

- Emit a <gettext_domain> element

- Make sure that the C strings in the merged .schemas file are munged in the same way as the extracted strings in the .pot file, avoiding the roundtrip issues I mentioned earlier

- Only emit default values for other locales


Maybe it would be better to do this as a different merge target, and get the domain by parsing Makevars, etc, like intltool-update does...
Comment 25 Matthias Clasen 2009-04-27 05:44:50 UTC
Created attachment 133394
pick GETTEXT_PACKAGE out of Makefile
Comment 26 Juanje Ojeda 2009-06-01 03:25:21 UTC
Hi, I'm not 100% sure, but I think that something you've changed for this bug could affect the GConf translations.

There is a bug #584407 which is related to this. As far as I know, before the change to the .po files the GConf descriptions were shown in the right language; now they are not translated.

I've tested it on Ubuntu Jaunty with GConf version 2.26.0-0ubuntu1.

If I'm wrong, please point me in the right direction.
Thanks
Comment 27 Matthias Clasen 2009-06-01 05:18:37 UTC
I've told pitti that his patch doesn't quite work...
Comment 28 Vincent Untz 2009-07-17 12:10:58 UTC
Is there anything blocking this?
Comment 29 Matthias Clasen 2009-07-17 13:02:09 UTC
Someone needs to work on getting the corresponding intltool patch in. Unfortunately, intltool no longer lives here, so I am not going to do it...
Comment 30 Vincent Untz 2009-07-17 13:45:33 UTC
(In reply to comment #29)
> Someone needs to work on getting the corresponding intltool patch in.
> Unfortunately, intltool no longer lives here, so I am not going to do it...

Talked to Rodney and filed https://bugs.launchpad.net/intltool/+bug/400679.
Comment 31 Sebastien Bacher 2009-10-07 21:19:15 UTC
Created attachment 144992
don't clean the gettext_domain attributes for keys not being registered

The current version has an issue: it cleans all the gettext_domain attributes for keys which are not in the schemas currently being registered. The updated version should fix that bug.
Comment 32 Martin Pitt 2011-02-14 09:02:21 UTC
As the reporter of the bug I close this, as the new API du jour is gsettings, which has a sensible gettext integration.