GNOME Bugzilla – Bug 171701
Separate the help XML files in a few languages, do translation work, and merge them back
Last modified: 2006-04-18 07:05:48 UTC
I made this Python script to filter out of the gimp-help-2 XML files everything but the languages chosen by a translator. The script creates a file with the same name in another directory. I will attach it here now, and start work immediately on the complementary script, which merges the translation back into the main XML file.
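For readers who want to see the idea, here is a minimal sketch of the filtering step. It is not the attached script. It assumes that translatable tags carry a lang attribute whose value is either one language code or a semicolon-separated list (like lang="en;it;fr;cs", as mentioned later in this bug), and that tags without a lang attribute are always kept. The function name filter_languages and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET


def filter_languages(element, wanted):
    """Remove, in place, every descendant whose lang attribute
    names none of the wanted languages (hypothetical helper)."""
    for child in list(element):  # copy the list: we remove while iterating
        lang = child.get("lang")
        if lang is not None and not (set(lang.split(";")) & wanted):
            element.remove(child)
        else:
            # Kept elements may still contain tags in unwanted languages.
            filter_languages(child, wanted)


# Made-up sample, not taken from the gimp-help-2 sources.
sample = (
    '<sect1 lang="en;de;fr"><title>'
    '<phrase lang="en">Layers</phrase>'
    '<phrase lang="de">Ebenen</phrase>'
    '<phrase lang="fr">Calques</phrase>'
    '</title></sect1>'
)
root = ET.fromstring(sample)
filter_languages(root, {"en", "de"})
result = ET.tostring(root, encoding="unicode")
print(result)
```

Writing the result to a file with the same name in another directory, as the script does, would be one more call to ElementTree's write().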
Created attachment 39269 [details] Python script to separate languages from main xml file Requires Python 2.3 or later. I keep it in a "tools" directory in my local copy of the gimp-help-2 CVS tree. It seems an intuitive place to me.
Created attachment 39868 [details] The Merger script
OK, it is finally in place, and I think it is already usable. The styling issues I mentioned, which show up when it assembles the XML back, are still in effect, but it is time for everyone to try it and look for errors. I have been testing it in a variety of situations, and it is at a point where it plainly works for me, even when the languages in a given "cluster" of tags have different numbers of tags.
I apologize for the readability of the code, mostly the variable names. Everything inside XML is "node", "node", "node"... I just ran out of names.
Complex as the script may seem, the algorithm is simple:
It reads the "to be merged" file and parses it recursively until it finds a tag which has a LANG attribute equal to the language it should merge back.
At this point it traces the location of this tag, existing or being created, in the main XML file. To trace that, it just follows the XML position of the tag being merged in terms of XML siblings, ancestors, and ancestors' siblings.
It then adds the new tag there.
It checks whether the newly added tag is inside any tag with a multiple LANG attribute (like lang="en;it;fr;cs"), and inserts the language being merged into this attribute.
End.
--
There is a fancy thing I've called the "after languages": it sets after which language in the original XML the tags being merged will be placed. If omitted, the new language will come at the end of each "cluster" of translatable tags. If present, it may be a list, like -ait,fr,de, meaning the tags will be added after "it" if it exists, else after "fr", and so on...
Enjoy!
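Two of the steps above are easy to show in isolation: choosing the position inside a cluster from the "after languages" list, and adding the merged language to a multiple LANG attribute. The sketch below only illustrates the described behaviour and is not code from the attachment. The function names pick_insert_index and add_language are hypothetical, and placing the tag after the last tag of the chosen language is an assumption.

```python
def pick_insert_index(cluster_langs, after_langs):
    """Return the index at which a tag in the new language should be inserted.

    cluster_langs: languages of the tags already in the cluster, in
                   document order, e.g. ["en", "fr", "de"].
    after_langs:   preference list, e.g. ["it", "fr", "de"] for -ait,fr,de.
    """
    for lang in after_langs:
        if lang in cluster_langs:
            # Position right after the last tag of the first preferred
            # language that is present (assumption, see lead-in).
            last = len(cluster_langs) - 1 - cluster_langs[::-1].index(lang)
            return last + 1
    # No preference given, or none present: append to the cluster.
    return len(cluster_langs)


def add_language(lang_attr, new_lang):
    """Add new_lang to a semicolon-separated lang attribute value,
    leaving the value unchanged if the language is already listed."""
    langs = lang_attr.split(";")
    if new_lang not in langs:
        langs.append(new_lang)
    return ";".join(langs)


# "it" is missing from the cluster, so "fr" decides the position.
index = pick_insert_index(["en", "fr", "de"], ["it", "fr", "de"])
merged_attr = add_language("en;it;fr;cs", "pt")
print(index, merged_attr)
```

Whether the real script appends the new language at the end of the attribute or keeps some other order is not stated in this bug; the sketch appends.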
*** Bug 309901 has been marked as a duplicate of this bug. ***
Joao - what should we do with this script and this bug? I would like to add it to the gimp-help-2 module and the bug can be closed.
Ok - I will take this up again this weekend. I had thought of a different approach which could lead to a simpler script that would allow even more functionality.
Assigning this bug ...
Created attachment 57773 [details] One script to rule them all!
A totally rewritten script that will work as proposed and more. It has half-usable documentation built in.
This script can separate and re-merge the GIMP-help-2 XML files, filtering out unneeded languages for editing. It creates new tags if the file does not have them in the wanted language. Moreover, this script keeps a copy (in a .helper directory in the structure) of the XML files it last saved, so, when filtering a file again, it can mark which tags have changed in the main XML (thus allowing translators to update the files without having to carefully compare each paragraph of text for changes).
This won't fix the bug. The script IMHO is quite usable and helpful, but it needs at least a little more formatting of the final XML before it can be said to be ready. For now, it would be interesting to have people testing it. It can really help translators.
I almost forgot - it needs a 3rd party Python module (which will be built into Python's next version): Elementtree, which can be downloaded from: http://effbot.org/downloads/
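The comment does not say how the script decides that a tag has changed, so the following is only a sketch of one possible approach, not the attached code. It assumes the copy kept in .helper and the current main XML can both be parsed, identifies each tag carrying a lang attribute by its position in the tree, and compares a hash of its serialized content. The names fingerprints and changed_paths are hypothetical, and the two small documents are made up.

```python
import hashlib
import xml.etree.ElementTree as ET


def fingerprints(root):
    """Map the positional path of every element with a lang attribute
    to a hash of its serialized content (hypothetical helper)."""
    found = {}

    def walk(element, path):
        for index, child in enumerate(element):
            child_path = path + (index,)
            if child.get("lang") is not None:
                data = ET.tostring(child, encoding="utf-8")
                found[child_path] = hashlib.sha1(data).hexdigest()
            walk(child, child_path)

    walk(root, ())
    return found


def changed_paths(old_root, new_root):
    """Return the paths of tagged elements that are new or differ
    from the previously saved copy."""
    old = fingerprints(old_root)
    new = fingerprints(new_root)
    return sorted(path for path, digest in new.items() if old.get(path) != digest)


# Stand-ins for the .helper copy and the current main XML file.
saved = ET.fromstring(
    '<para><phrase lang="en">Old text</phrase>'
    '<phrase lang="de">Alter Text</phrase></para>'
)
current = ET.fromstring(
    '<para><phrase lang="en">New text</phrase>'
    '<phrase lang="de">Alter Text</phrase></para>'
)
changed = changed_paths(saved, current)
print(changed)
```

A positional path breaks down when tags are inserted or removed between runs, because every later sibling is then reported as changed; a real tool would need something more robust.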
Created attachment 57775 [details] same script, improved help. - call it xml_helper.py
Introduced a short version of the help, to avoid printing the lengthy one when an incorrect parameter is given. Thanks to Axel Wernick for noting this.
Hmm... guys, I've been further testing this script, and I think it could be added to CVS as it is. I am now working on another tool to take care of indentation and column breaking, as well as adding the proper XML headers (DTD reference and XML version), which will later be integrated with this script. For the time being, the known bug here is that closing tags of elements that contain a merged-back tag are off-indented by two spaces. But adding this to CVS now would get: 1) more testers, and even some users!!, 2) What a surprise: version control! :-)
I added the scripts to the repository with the last check-in:

2006-02-20  Roman Joost  <romanofski@gimp.org>

	* docs/makeStatistics.py: removed, because it isn't used anymore. Despite the fact it doesn't work anymore.

	* tools/README
	* tools/create_changelog.py
	* tools/validate_references.py
	* tools/xml_helper.py
	* tools/tests/test_validateReferences.py: added new tools for reference validation and xml editing.

Can this bug now be resolved as fixed?
Resolved as FIXED, since nobody has commented any further.