Bug 46650 - add feature which outputs multiple localized files instead of one merged file
Status: RESOLVED FIXED
Product: intltool
Classification: Deprecated
Component: general
Version: unspecified
Hardware: Other Linux
Importance: High enhancement
Target Milestone: ---
Assigned To: intltool maintainers
intltool maintainers
AP2
Depends on:
Blocks: 90500 138512
 
 
Reported: 2001-02-14 22:45 UTC by dan
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: ---
GNOME version: Unversioned Enhancement


Attachments
Fixes problem with multiple-file output (2.21 KB, patch)
2003-11-10 17:59 UTC, Brian Cameron
patch to write files in locale-specific directories, plus C locale in '.' (3.54 KB, patch)
2003-11-10 22:27 UTC, bill.haneman
improved patch against HEAD which outputs files to <locale> directories without error messages (4.10 KB, patch)
2003-11-11 02:01 UTC, bill.haneman
Output only translations when doing --multiple-output (717 bytes, patch)
2004-06-15 17:45 UTC, Danilo Segan

Description dan 2001-09-10 01:00:37 UTC
Add support so that one starts with one XML data file, and then produces one XML
data file per locale.

Example XML file from scrollkeeper:
<?xml version="1.0"?>
<ScrollKeeperContentsList>
  <sect>
    <title>Applications</title>
    <sect>
      <title>Amusement</title>
    </sect>
    <sect>
       <title>Games</title>
    </sect>
  </sect>
</ScrollKeeperContentsList>

Then we produce similar files in each locale.



------- Additional Comments From darin@bentspoon.com 2001-02-15 09:28:47 ----

No idea whether this is required or where it's useful. Dan, you need to put a
bit more information about motivation in the bug reports, rather than just "do
this please".



------- Additional Comments From dan@eazel.com 2001-02-15 17:21:15 ----

We can treat this as a wish item with no particular deadline.  Maciej and
Kenneth said it sounded like a reasonable thing to do which wouldn't be hard and
that they would probably do in the not-too-distant future.  However, it isn't a
priority for Nautilus 1.0 or even GNOME 1.0.  It just makes the lives of
translators of the document categories somewhat easier.  I'm changing the target
milestone to 1.2 so it doesn't look like a 1.0 priority.



------- Bug moved to this database by unknown@bugzilla.gnome.org 2001-09-09 21:00 -------
Comment 1 Kenneth Rohde Christiansen 2002-05-02 01:02:09 UTC
This seems to be very important for Sun Microsystems, and not only for
XML files but for all our supported file formats:

<quote>
I'm asking because Sun's packaging policy for translated content
dictates that language content be split out into packages on a per
language basis and separate from the applications that the content is
for. So I need to create packages containing, for example, French UI
Messages, Chinese UI messages. 

Being able to do what I am asking about above would make things a hell
of a lot easier for me.
</quote>
Comment 2 Kenneth Rohde Christiansen 2002-05-02 01:13:58 UTC
This requires a lot of thinking. Is this what we want to do? And how
should we do it so that we avoid much of the pain of moving to a
different GNOME i18n setup?
Comment 3 Carlos Perelló Marín 2002-05-02 09:56:25 UTC
This requirement also includes .desktop files and any other file that
we use with all translations inside?
Comment 4 Damien Donlon 2002-05-02 12:01:01 UTC
>This requirement also includes .desktop files and any other file that
>we use with all translations inside?

It does yes - .desktop, .xml, .server, .keys, .directory files
all contain multiple language content which makes packaging
language content separately pretty tricky. Are there any more
I don't know about?

Are you currently in a position to bundle community language content
in separate RPMs on a per language basis for the community Gnome
releases?

Sun have the policy of separating non-English content from the base
applications for a number of reasons. One was that the language
requirements of the Solaris OS in particular were different depending
on the OS version. Another was that it allows greater flexibility in
what the user chose to install if the linguistic content is
separated. How does this currently work in the community builds?

Anyway, the files with shared content are a bit perplexing as
I cannot recall ever having encountered i18n message files with
multiple language content and building Gnome 11 times with
LINGUAS set to each language and then packaging isn't really
an option! 

Regards,
Damien


Comment 5 Kenneth Rohde Christiansen 2003-01-07 14:30:27 UTC
Doesn't seem so important anymore. Set to Low Priority.
Comment 6 bill.haneman 2003-07-02 12:48:22 UTC
This is actually very important to GOK.  There are lots of other use
cases where it's important - could you please upgrade the priority?

In particular, the current approach makes parser logic much more
complex from the point of view of clients, it has bad performance
implications since we're potentially talking about >100
languages/localizations, and it reduces human-readable or
user-editable XML files to unmanageability.

GOK also has a number of XML files which are inherently
locale-specific, so per-locale/lang directories are already required.
 It makes sense to output split XML to them as well.

This is a high priority/blocker for GOK, which is now part of the
gnome-2.4 desktop.  In order to localize GOK properly, this bug needs
attention.
Comment 7 Damien Donlon 2003-07-02 14:55:37 UTC
I have to agree with Bill - it is still important.
Admittedly though, to get complete language separation
of application from language content is a big job given the number of
different file types involved. I'd settle for this happening for XML
as a good starting point and we are only talking about .xml files
being affected for the GOK.

I discussed the whole multilanguage content issue with a few people at
GUADEC the other week and Christian Rose and Kjarten to name just 2
could see the logic in wanting the separation. At that point we were
exploring the possibility of separating out the linguistic content
building in specific Makefile targets so that one would be able to
build and merge all linguistic content without having to build the
whole module like you do at the moment (at least as far as I am
aware). (I know you can do it with the intltools standalone but need
to know what is actually in the module first). This would be useful
for test and it would even be useful for bundling single languages.
However, because the single language .desktop files, .xml files etc.
would overwrite each other in separate rpm or SUNW package installs it
wouldn't _really_ get around the problem of being able to give users
multiple language packages and letting them install MORE than one.

Another option that was discussed for the XML (and I'm sure Dan will give
a million reasons why this is not a good idea) was gettext
fallback for XML localised content. Well, it would make for more
efficient parsing of the XML! 

Anyhow, I am all for allowing the creation of separate XML files
without much inclination on how to go about it! It should definitely
be optional.
Comment 8 bill.haneman 2003-07-02 15:06:37 UTC
Well, for GOK what we'd want is *.xml.in => <locale>/*.xml

i.e. per-locale data directories for these XML files.  Dunno how many
projects this would be useful for but certainly for us it makes sense
as we'd point to the locale-appropriate directory for our data files.

Question remains whether we'd implement the fallback behavior on a
per-file basis in the client (i.e. if <locale>/foo not found, use
C/foo) ; probably the answer would be "yes".

The assumption is that if a string is marked up for translation, the
client would pull strings from the <locale>/foo.xml files. 
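
The per-file fallback described in this comment could look roughly like the sketch below. It is only an illustration: the function name find_localized_file and the C/ subdirectory layout are assumptions made for the example, not GOK's actual code.

```python
from pathlib import Path


def find_localized_file(data_dir, locale, filename):
    """Return <data_dir>/<locale>/<filename> if it exists, else the C copy.

    Assumption: the untranslated file is stored under <data_dir>/C/.
    """
    base = Path(data_dir)
    localized = base / locale / filename
    if localized.is_file():
        return localized
    # No file for the requested locale: fall back to the C locale.
    return base / "C" / filename
```

A client would call find_localized_file(data_dir, "fr", "foo.xml") and parse whichever path comes back.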

Comment 9 Carlos Perelló Marín 2003-07-02 15:17:10 UTC
Perhaps the best option would be to put them in /usr/share/locale/ like
the .mo files, or perhaps to install just the English version without any
other translation and get the translations directly from the .mo file
(as is done for .glade strings); we already have those translations inside the
.po/.mo file...

Comment 10 Calum Benson 2003-08-01 14:13:52 UTC
Marking AP2 for now to reflect accessibility team's assessment, don't
think it's a GOK release stopper just yet.
Comment 11 Calum Benson 2003-08-07 16:18:50 UTC
Apologies for spam... marking as GNOMEVER2.3 so it appears on the official GNOME
bug list :)
Comment 12 Abel Cheung 2003-08-08 00:04:06 UTC
Raising priority, since gok seems to depend on this feature very badly.
Comment 13 bill.haneman 2003-09-09 13:20:53 UTC
Any progress here? It's blocking GOK localization, and GOK is part of
the GNOME 2.4 desktop.  Too late for 2.4.0 I'm afraid, but we could
get GOK localized for 2.4.1 if we get a fix for this bug in the next
couple of weeks.

Comment 14 Christian Rose 2003-09-09 20:57:51 UTC
I don't think there will be much progress on this front unless someone
(maybe Sun *hint**hint*) starts contributing patches.
intltool is in my experience very much only emergency maintained right
now, if at all maintained, so requests for new exciting development,
however necessary those areas may really be, seem extremely
unrealistic at this point unless they are also followed by patches.
Just my observation.
Comment 15 Kenneth Rohde Christiansen 2003-09-09 22:04:46 UTC
Yes I have been and still am very busy - which is very unfortunate. I
should be able to look at this in the weekend - is that too late?

Kenneth
Comment 16 bill.haneman 2003-09-10 10:33:46 UTC
Noted.  I had a look yesterday, and passed it to Brian Cameron who is
more perl-savvy than I. He's looking at this now.
Comment 17 bill.haneman 2003-09-10 10:41:13 UTC
correction, brian is looking at the other intltool bug (regarding
attribute localization), bug 116526.

Kenneth, nice to see you're on the air.  Michael Twomey at sun
suggested that the scrollkeeper post-processing stuff for splitting
files could be adapted to do this during install, instead of hacking
intltool-merge to do it.  Can you evaluate the two basic approaches
this weekend to see which seems most feasible and expedient?

IMO the window for fixing this for 2.4.1 is 2-3 weeks or so, provided
release-team accepts adding this capability in 2.4.1.  Thanks for
looking at this Kenneth.  Also, maybe you can review what Brian comes
up with for 116526 or give a hint; I personally am not a perl regex
expert, not sure how familiar Brian is with perl regex either.

- Bill
Comment 18 Kenneth Rohde Christiansen 2003-09-10 13:02:00 UTC
Yes I will evaluate these two basic approaches this weekend. I can
review the patch as well.

Kenneth
Comment 19 Kenneth Rohde Christiansen 2003-09-13 15:37:41 UTC
I added a feature to output multiple files (--multiple-output). I
don't know if this works exactly like you want it to, but it should be
pretty easy to modify it to serve your purposes.

Please test it and add a test to intltool. Just modify the last test.

Cheers, Kenneth
Comment 20 Brian Cameron 2003-10-28 17:02:27 UTC
Note that I wrote a patch to fix bug #116526 that affected this
logic.  My new logic in that patch retains the --multiple-output
functionality, though I had to rewrite it to work with an XML
parser.
Comment 21 Kenneth Rohde Christiansen 2003-10-29 00:27:24 UTC
Thanks for doing the work!
Comment 22 bill.haneman 2003-11-10 16:49:44 UTC
Brian, Kenneth:

I have some issues/questions regarding the way this is currently
implemented.

#1:
  the names seem to be <filename>.<extension>-<locale>, which means
  that the file extension is locale-dependent.  That seems broken; the 
  expected output (I thought) was one of the following:

<filename>-<locale>.<extension>

OR

<locale>/<filename>.<extension>  [i.e. the file is placed in a
locale-dependent subdirectory]

I prefer the latter solution, i.e. creating subdirectories for the
existing locales and creating files there.  It may be a little harder
to create a Makefile.am entry for that rule however.

The easiest implementation would be a third possibility:

<filename>.<locale>.<extension> which would at least keep the same
extension for purposes of identifying file types, _and_ also keep the
same filename for purposes of identifying the file itself.

Example: main.kbd.in gets translated to main.kbd-fr

Desired output: either fr/main.kbd, main-fr.kbd, or main.fr.kbd
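
To make the four naming schemes concrete, here is a small sketch that derives each one from a template name and a locale. The function output_names and its dictionary keys are invented for this illustration, and it assumes the template name ends in ".in"; intltool-merge itself is a Perl script and does not work this way.

```python
from pathlib import PurePosixPath


def output_names(template, locale):
    """Return the candidate output names for a '.in' template and a locale."""
    merged = PurePosixPath(template).with_suffix("")  # main.kbd.in -> main.kbd
    stem, ext = merged.stem, merged.suffix            # "main", ".kbd"
    return {
        "current": f"{merged.name}-{locale}",       # <filename>.<extension>-<locale>
        "subdirectory": f"{locale}/{merged.name}",  # <locale>/<filename>.<extension>
        "dash": f"{stem}-{locale}{ext}",            # <filename>-<locale>.<extension>
        "dot": f"{stem}.{locale}{ext}",             # <filename>.<locale>.<extension>
    }
```

Only the "current" scheme changes the file extension, which is the problem raised in #1.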

#2: The output files appear to contain translations for locales other
than the target (not just the C locale stuff).  That seems really
broken, i.e. main.kbd-fr should not contain any elements in locale
pt_BR etc (only C locale elements, in the event that they have no
translations available).

Brian, could you look into this?  I believe that the output files
should contain either:

* only locale-specific elements, whose content falls back to C locale
if no translation is available; OR
* locale-specific elements _and_ C locale elements for untranslated
content [better].

this would mean that if a parent element had no translation in the
target locale, the parent C locale element would be used (unless there
was data matching the 'lang' but not the minor-variant, for instance
if there was pt but not pt_BR and the target was pt_BR).  At most then
there would be three of each element (C, LANG, and target locale)
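
The lookup order implied here (target locale, then its language, then C) can be sketched as follows. The names locale_fallbacks and pick_translation are hypothetical, and the sketch ignores locale modifiers and encodings such as "@euro" or ".UTF-8".

```python
def locale_fallbacks(target):
    """List the locales to try, most specific first, ending with C."""
    chain = [target]
    lang = target.split("_", 1)[0]  # "pt_BR" -> "pt"
    if lang != target:
        chain.append(lang)
    if "C" not in chain:
        chain.append("C")
    return chain


def pick_translation(translations, target, original):
    """Return (locale, text) for the best available translation.

    `translations` maps locale names to translated strings; `original`
    is the untranslated C-locale string.
    """
    for loc in locale_fallbacks(target):
        if loc in translations:
            return loc, translations[loc]
    return "C", original
```

With this order, strings from unrelated locales are never considered for a given output file.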

In any case I don't see why totally unrelated locale elements appear
in a locale-specific file, i.e. why Armenian (am) strings are
appearing in zh_TW (traditional Chinese) XML files.
Comment 23 Brian Cameron 2003-11-10 17:58:17 UTC
In regards to #1, the way the filename is built is very trivial to
change.  I think, though, that an intltool maintainer needs to comment
about what the ideal name should be.  With my patch, I left the naming
convention for output files the same as before.

In regards to #2, the fix for this bug is in the attached patch.
Can I commit?
Comment 24 Brian Cameron 2003-11-10 17:59:05 UTC
Created attachment 21341
Fixes problem with multiple-file output
Comment 25 Kenneth Rohde Christiansen 2003-11-10 19:42:39 UTC
#1: fr/main.kbd is the preferred dir + filename; but the others are
also fine.

#2: Feel free to commit the patch.

Kenneth
Comment 26 bill.haneman 2003-11-10 20:15:05 UTC
I like fr/main.kbd too.  Brian, can you have a go at what an
appropriate Makefile.am entry would look like?  I don't know offhand
how to patch gok/Makefile.am so that it creates gok/<locales>/main.kbd

Comment 27 Brian Cameron 2003-11-10 22:00:32 UTC
patch that fixes issue #2 above has been committed.

Regarding issue #1, wouldn't it make more sense for the 
intltool-merge script to create the various subdirectories and place 
the various output files in the appropriate subdirectory?  This 
seems easier than having Makefile.am create the 
subdirectories before calling intltool-merge.

The only thing that would change in Makefile.am would be the various
install rules.  They would obviously need to be updated to install
the various <lang>/file.<extension> files to their appropriate 
location.
Comment 28 bill.haneman 2003-11-10 22:14:14 UTC
Hi Brian: I agree, and attach (FWIW) a patch that creates the
directories and writes the files to it, i.e. creates 'fr' and writes
'fr/main.kbd' etc.  It also writes ./main.kbd for 'C' locale.  However
that last part isn't implemented correctly since the patch causes a
number of 'uninitialized variable' warnings from perl.  Since I know
virtually nothing about perl, perhaps you can fix that last part.  I
think that even if --multiple-output is used, we should write a
./<foo> file so that code which searches for <locale>/foo can fall
back to C locale.  The other option would be to write C/foo instead,
i.e. include 'C' in the list of locales.  

thanks!
Comment 29 bill.haneman 2003-11-10 22:27:43 UTC
Created attachment 21343
patch to write files in locale-specific directories, plus C locale in '.'
Comment 30 bill.haneman 2003-11-10 22:38:56 UTC
Brian, if you can fix my patch so it doesn't throw those errors I'd be
grateful.  Otherwise it seems to do what we want... seems we're nearly
ready to declare this bug fixed? :-)

Yes, the make install rules for the *.kbd files would have to change;
I haven't figured out how yet.  
Comment 31 bill.haneman 2003-11-11 00:55:04 UTC
I've fixed the gok Makefile.am rules, so we're ready to go when the
issue with my above patch is fixed and committed.  Ought to be a very
simple fix but you (brian) will have a better idea I think than I.
Comment 32 bill.haneman 2003-11-11 02:01:52 UTC
Created attachment 21346
improved patch against HEAD which outputs files to <locale> directories without error messages
Comment 33 Brian Cameron 2003-11-11 16:26:43 UTC
Bill, it sounds like you fixed the problem with the Perl.  Do you
still need me to do anything, or is this now fixed?
Comment 34 bill.haneman 2003-11-11 17:13:30 UTC
Brian: if you think my patch is OK I will commit (based on Kenneth's
consent which I think he's already given in principle).
Comment 35 Brian Cameron 2003-11-11 18:25:17 UTC
Your patch looks good.  A couple of nit-picks.

I think you are needlessly setting the $lang variable in
the "else" case.  Also, it looks like you needlessly changed
the indentation of certain lines.

Also, you might want to check the return code from mkdir and
display a nice error message upon failure.  This way we can 
make sure that the directory was successfully created before
trying to create the output file.
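
The check suggested here amounts to the pattern below, shown in Python purely as an illustration (the actual patch is Perl, and the helper name ensure_locale_dir is made up for this sketch).

```python
import os
import sys


def ensure_locale_dir(lang):
    """Create the <lang> directory if needed; return True on success."""
    try:
        os.makedirs(lang, exist_ok=True)
    except OSError as err:
        # Report the failure instead of failing later when opening the file.
        print(f"Cannot create directory '{lang}': {err}", file=sys.stderr)
        return False
    return True
```

The caller would skip writing <lang>/<file> whenever this returns False.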
Comment 36 bill.haneman 2003-11-11 18:42:39 UTC
I thought I might need to set $lang in order to suppress the error
messages.  Probably sufficient to do the "| 0" thing instead. thanks

formatting: well, there seemed to be weird tabs in the source.  I
think this is an editor weirdness since AFAIK there shouldn't be tabs
in the code, only spaces.
Comment 37 bill.haneman 2003-11-11 19:18:40 UTC
ok, it's my editor that was acting up.
I've added a check for existence of the $lang subdir and an error
message if it can't be created. Also removed the needless assignment.

thanks!
committing to cvs.
Comment 38 Danilo Segan 2004-06-15 17:42:14 UTC
There were slight incompatibilities introduced with changes to XML merging code
in intltool-merge (I broke it, so I know :).  I'll fix them right away, but I
want to do a slight improvement (IMHO) at the same time, and I want your opinion
on that.

So, now intltool (CVS HEAD) produces .kbd files with something like:
  <tag>Original string</tag>
  <tag xml:lang="lang">Translated string</tag>
where it previously produced:
  <tag>Translated string</tag>
  <tag xml:lang="lang">Translated string</tag>

I have a simple patch that makes it output only:
  <tag>Translated string</tag>

Please test current intltool CVS, and along with the following patch (will be
attached).  I'm reopening the bug until this is resolved again.

(As a sidenote, produced .kbd files should be much easier to edit this way, and
you seem to have aimed for readability, so you win more this way)
Comment 39 Danilo Segan 2004-06-15 17:45:34 UTC
Created attachment 28737
Output only translations when doing --multiple-output

If GOK guys agree with this change, I guess it's fine to commit this even
though Kenneth will be unreachable for a couple of days.
Comment 40 bill.haneman 2004-06-15 17:58:33 UTC
Danilo: your patch seems to be against intltool-merge, not intltool-merge.in.in.
Did you intend that?
Comment 41 Danilo Segan 2004-06-15 18:06:51 UTC
Bill, sorry about that, it was easier for me to do it that way (I just worked on
a copy created in gok/intltool-update after ./autogen.sh).  I hoped it wouldn't
cause a problem, but it obviously does: you may apply it against
intltool-merge.in.in, since it differs from intltool-merge only in a couple of
@MACROS@ which are replaced by sed (and not touched by the patch, of course).

Again, sorry about that.
Comment 42 bill.haneman 2004-06-15 18:10:55 UTC
Danilo: I applied your patch and confirmed that the code changes were in my
intltool-merge.  I installed intltool-merge, and reconfigured/built/installed
GOK; however your patch was NOT working as you described it (I am still seeing
an untagged, untranslated string, plus a tagged, translated string, in my
intltool-merge output).
Comment 43 bill.haneman 2004-06-15 18:16:46 UTC
Danilo: I guess for some reason my intltool-merge in gok wasn't getting replaced
when I re-gen'ed.  I applied your patch directly (as you suggest) and it did work.

However I have a bit of a concern.  If we get the strings from tagged XML, then
we can be sure of the locale that the strings are actually in; this is useful
if, for instance, we need to speak the label when activated or focussed (since
the locale/lang will determine which text-to-speech voice we can use).

Perhaps it would be better to include the tags (i.e. put tags in the translated
elements, when available).  So each element would only occur once, either with
tags (indicating that it was merged from the po files), or without tags (meaning
that the untranslated value is being used).
Comment 44 Danilo Segan 2004-06-15 19:13:29 UTC
On the problem you experienced: you need to remove intltool-* from the directory
before doing ./autogen.sh.

I'm not sure I understand you correctly.  Are you thinking of "xml:lang" when
you're talking about "tags"?  That's an option as well.

If that's what you want, just add the following line to the already patched
intltool-merge:
                 if ($MULTIPLE_OUTPUT && $translation) {
+                    print $fh " xml:lang=\"", $language, "\"";
                     print $fh ">", $translation, "</$nodename>";

If I got you wrong, please elaborate what you're thinking of.
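
For readers who do not follow the Perl fragment, the intended per-element behaviour can be sketched as below. This is an illustration only: render_element is a hypothetical name, and the escaping step is an assumption added for the example rather than something shown in the patch.

```python
from xml.sax.saxutils import escape, quoteattr


def render_element(nodename, original, translation, language):
    """Emit one element: translated and tagged, or untranslated and untagged."""
    if translation:
        # A translation exists: output it alone, marked with its language.
        return (f"<{nodename} xml:lang={quoteattr(language)}>"
                f"{escape(translation)}</{nodename}>")
    # No translation: keep the original string without an xml:lang attribute.
    return f"<{nodename}>{escape(original)}</{nodename}>"
```

Each element therefore appears once per output file, and the presence of xml:lang tells the client which language the string is actually in.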
Comment 45 bill.haneman 2004-06-16 10:10:57 UTC
Hi Danilo: Yes, adding the xml:lang attribute to the translated element is what
we had in mind.  I think that the one-line change you suggest above would be
better (i.e. preserve lang info in the strings), but I have not been able to
confirm that it works as I expect yet.  Thanks!
Comment 46 bill.haneman 2004-06-16 11:09:01 UTC
Danilo: the patch to intltool-merge, with the oneline addition you propose in
comment #44, works well for GOK.  I'd recommend changing your patch to modify
intltool-merge.in.in (of course), but otherwise the above code changes seem like
an improvement.  Thanks!
Comment 47 Danilo Segan 2004-06-16 22:28:21 UTC
Ok, I'll commit this patch (since GOK seems to be the only current user of "-m"
feature of intltool, according to grep of my Gnome CVS checkout), and adjust
intltool testcases (bug 138512 -- why I searched for this bug in the first
place).  I think Kenneth won't mind my committing this right away.

So, I'm closing this bug again.
Comment 48 bill.haneman 2004-06-16 23:04:43 UTC
OK Danilo: just to confirm, you committed the version with the line:
+                    print $fh " xml:lang=\"", $language, "\"";

and also patched intltool-merge.in.in instead of intltool-merge, right?
Comment 49 Danilo Segan 2004-06-16 23:18:24 UTC
Confirmed Bill, I committed that version.  

There's also no "intltool-merge" in the repository so I obviously had to patch
intltool-merge.in.in -- no need to worry, I've been doing a lot of intltool
hacking lately ;)
Comment 50 bill.haneman 2004-06-17 12:02:28 UTC
Thanks Danilo; I was mostly making sure _I_ understood what was happening here :-)