GNOME Bugzilla – Bug 350473
Skipping untranslatable messages
Last modified: 2020-03-03 18:35:01 UTC
Hi, Jens Seidel requested in Debian bug http://bugs.debian.org/381928 a new feature to exclude untranslatable messages when invoking xml2po: """ a few XML files such as passwd.1.xml in the shadow package contain untranslatable messages such as "<option>-e</option>, <option>--expire</option>". It's really useless to let translators deal with these messages. That's why I suggest a new --exclude option to xml2po which takes a PO file and does not output these messages. Maybe you can forward this bug to gettext and request a new msgdiff program which just removes messages from a PO file (the opposite of msgmerge). This would simplify the implementation in xml2po and could be reused by po4a and other tools. You could test this option with passwd.1.xml and I would like to suggest that the shadow people among others use this option. """ Bye,
Created attachment 70533: Ignore 'option' tag
They are not fully "untranslatable": there is an "important" comma in there. xml2po already supports "ignored" tags (look at xml2po/modes/docbook.py, method getIgnoredTags(), and add 'option'), but that won't work in this case because of the comma. You can add 'option' to getFinalTags as well, and then you'll end up with messages like "<placeholder-1/>, <placeholder-2/>" (to allow reordering and translating punctuation or, in reality, anything which is not whitespace).
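To make the placeholder idea concrete, here is a minimal standalone sketch in Python. It is not xml2po's actual code: the `FINAL_TAGS` set and the `extract_message` helper are hypothetical names, and the sketch only assumes that "final" child elements get swapped for numbered placeholders while the surrounding text (including the comma) stays in the message.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the tag list a mode would return; not xml2po's API.
FINAL_TAGS = {"option"}


def extract_message(para_xml):
    """Build a translatable message from one element, replacing each
    child whose tag is in FINAL_TAGS with a numbered placeholder."""
    elem = ET.fromstring(para_xml)
    parts = [elem.text or ""]
    count = 0
    for child in elem:
        if child.tag in FINAL_TAGS:
            count += 1
            # Keep the text that follows the child (e.g. the comma).
            parts.append("<placeholder-%d/>" % count + (child.tail or ""))
        else:
            # ET.tostring() already includes the child's tail text.
            parts.append(ET.tostring(child, encoding="unicode"))
    # Collapse runs of whitespace, as message extractors typically do.
    return " ".join("".join(parts).split())


print(extract_message("<para><option>-e</option>, <option>--expire</option></para>"))
```

The translator then only sees the punctuation and the placeholders, which can be reordered, but still has to handle the message.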
Also note that we'd need to be absolutely positive about <option> never needing translation. We could also go through all the DocBook tags and see if there are any others. (The awkwardness of the above placeholder message can be resolved using c-format markers such as "%s, %s"; there should be a bug about this already.)
Marking <option> nontranslatable would indeed help. Nevertheless, it doesn't solve all problems. First, the user still needs to translate "%s, %s". But what about "%s, %s, %s" if three options are specified, and so on? I also think this should not be restricted to options. Isn't it possible that a maintainer doesn't want translations of other messages, such as license texts or special program output (e.g. large data sets)? Specifying a PO file with messages to exclude is, in my opinion, the most powerful solution. But this requires a tool which removes one PO file from another (i.e. computes the difference of the two sets of msgid/msgstr entries).
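A rough sketch of what such a "PO difference" could look like, in Python. No msgdiff tool is assumed to exist; `parse_po` and `po_difference` are hypothetical helpers, the parser is deliberately simplistic (entries separated by blank lines, Python string-literal rules used as an approximation of PO escapes, msgctxt and plural forms ignored), and the sample PO data is made up for illustration.

```python
import ast
import re


def parse_po(text):
    """Return a list of (msgid, raw_entry) pairs from PO file text."""
    entries = []
    for block in re.split(r"\n\s*\n", text.strip()):
        if not block.strip():
            continue
        parts = []
        in_msgid = False
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("msgid "):
                in_msgid = True
                line = line[len("msgid "):]
            elif not line.startswith('"'):
                # Any other keyword or comment ends the msgid.
                in_msgid = False
            if in_msgid and line.startswith('"'):
                # Approximation: decode the quoted string as a Python literal.
                parts.append(ast.literal_eval(line))
        entries.append(("".join(parts), block))
    return entries


def po_difference(source, exclude):
    """Drop every entry of `source` whose msgid also occurs in `exclude`.
    The header entry (empty msgid) is always kept."""
    excluded = {msgid for msgid, _ in parse_po(exclude) if msgid}
    kept = [block for msgid, block in parse_po(source) if msgid not in excluded]
    return "\n\n".join(kept) + "\n"


# Made-up sample data for illustration.
SOURCE_PO = '''msgid ""
msgstr ""

msgid "<option>-e</option>, <option>--expire</option>"
msgstr ""

msgid "Change user password"
msgstr ""
'''

EXCLUDE_PO = '''msgid "<option>-e</option>, <option>--expire</option>"
msgstr ""
'''

print(po_difference(SOURCE_PO, EXCLUDE_PO))
```

Because the filtering works purely on msgid sets, the same step could in principle run after any extractor, which is the reuse argument made above for po4a and other tools.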
Would it be possible to use an XML namespace for xml2po which would add an ignore attribute, e.g. xml2po:ignore="true"?
DTDs aren't exactly intelligent, and adding a non-DocBook attribute to a DocBook file would make it fail to validate, even with a different XML namespace.
That sounded like a nice idea until you shot it down, Shaun. :) Hum, so we need to figure out a way to send special instructions to xml2po anyway (that will also be useful for things like translation disclaimers as required by GNU licenses). A simple approach would be including such namespaced attributes, and parsing a C/*.xml as well to produce one which doesn't contain those. Of course, I'd like to avoid this one, if possible.
Using an attribute to mark translatability would work for non-DTD-based formats, though, including DocBook 5. But rather than invent our own, we can just use its:translate. http://www.w3.org/TR/its/#trans-datacat
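As an illustration of how an extractor might honour its:translate, here is a small Python sketch. It is not how xml2po or itstool actually implement it; `translatable_texts` is a hypothetical helper, the sample document is made up, and the sketch assumes the ITS 1.0 namespace URI and that the translate value is inherited by child elements.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE_ATTR = "{%s}translate" % ITS_NS


def translatable_texts(elem, inherited=True):
    """Collect text nodes that are not inside an its:translate="no" subtree."""
    flag = elem.get(TRANSLATE_ATTR)
    # An explicit its:translate overrides the value inherited from the parent.
    translate = inherited if flag is None else (flag == "yes")
    texts = []
    if translate and elem.text and elem.text.strip():
        texts.append(elem.text.strip())
    for child in elem:
        texts.extend(translatable_texts(child, translate))
        # Tail text belongs to the parent element, so use the parent's flag.
        if translate and child.tail and child.tail.strip():
            texts.append(child.tail.strip())
    return texts


# Made-up sample document for illustration.
DOC = (
    '<article xmlns:its="http://www.w3.org/2005/11/its">'
    "<para>Change the password.</para>"
    '<para its:translate="no"><option>-e</option>, <option>--expire</option></para>'
    "</article>"
)

print(translatable_texts(ET.fromstring(DOC)))
```

With this approach the document author, rather than the tool, decides what is untranslatable, so it also covers cases like license texts or program output.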
gnome-doc-utils has been superseded by yelp-xsl, yelp-tools, and itstool. gnome-doc-utils will not see any further development, hence closing as WONTFIX. See https://gitlab.gnome.org/Infrastructure/Infrastructure/issues/255