GNOME Bugzilla – Bug 347110
intltool should remove surrounding markup from messages
Last modified: 2012-03-16 12:40:34 UTC
This is a request for a workaround for bug 97061, which it appears will never be fixed, since that problem seemingly affects the whole gtk+, libglade, and glade toolchain.

The problem is with surrounding markup, often Pango markup:

  msgid "<b>Example Text</b>"

The surrounding markup in strings like this should not be translated, and it serves no contextual purpose for localization (the translations of the actual content of "<b>Example Text</b>" and "<i>Example Text</i>" would never need to differ). Instead, it introduces another thing that can break: more than one experienced translator has accidentally translated a string like this into "<b>Example Text</i>", causing runtime problems that are difficult to detect. It also makes automated use of translation memories much more difficult and reduces their usefulness (translations cannot be reused), and it is in every way a pain for the localization process and for translators.

Danilo introduced the idea that intltool can work around this issue, since no fix is in sight for gtk+, libglade, or glade. If intltool-extract detects "surrounding markup" and removes it from messages at extraction time, and intltool-merge then does the reverse, keeping track of the original string with surrounding markup, the original string without surrounding markup, and the translated message, then it can add the surrounding markup to the translated message and place it in the files again.

Please note that there needs to be a distinction between the following cases of markup in strings:

Positional markup (rare):
  "This text is <b>bold</b>."

Surrounding markup (common):
  "<span size="smaller"><b>Settings:</b></span>"
  "<b>Example Text</b>"

Combination (extremely rare):
  "<span size="smaller">Save the settings by clicking on the <b>Save</b> button.</span>"

Strings with positional markup should not be touched. The markup info is needed to introduce the markup at the correct position when translating.
Strings with surrounding markup should be altered so as to remove the surrounding markup (but to keep any positional markup). It should be possible to distinguish between these cases with an XML parser, right?
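
To illustrate the distinction between the three cases listed in the report, here is a minimal Python sketch (intltool itself is written in Perl, so this is not intltool code). The function name split_surrounding is hypothetical. It assumes the message is well-formed XML once wrapped in a dummy root element, that tag names start with an ASCII letter, and that attribute values contain no "<" or ">". The parser only decides how many layers of markup wrap the whole string; the tags themselves are then peeled off the original text so that attribute quoting is preserved. Anything that does not parse, including a mismatched string such as "<b>Example Text</i>", is returned untouched.

```python
import re
import xml.etree.ElementTree as ET


def split_surrounding(msg):
    """Split msg into (prefix, core, suffix).

    prefix and suffix hold the surrounding tags; core keeps any positional
    markup. Strings that are not wrapped come back as ("", msg, "").
    """
    unchanged = ("", msg, "")
    try:
        # A dummy root lets plain text and mixed content parse as one document.
        node = ET.fromstring("<root>" + msg + "</root>")
    except ET.ParseError:
        return unchanged

    # Descend while the current element holds exactly one child element and
    # no text before or after that child: each such level is a wrapper.
    depth = 0
    while len(node) == 1 and not node.text and not node[0].tail:
        node = node[0]
        depth += 1
    if depth == 0:
        return unchanged

    # Peel `depth` opening tags from the start and `depth` closing tags from
    # the end of the original string.
    pattern = (r"((?:<[A-Za-z][^<>]*>){%d})(.*?)((?:</[A-Za-z][^<>]*>){%d})$"
               % (depth, depth))
    m = re.match(pattern, msg, re.S)
    if m is None or not m.group(2):
        # Self-closing or empty wrappers: leave the string alone.
        return unchanged
    return m.group(1), m.group(2), m.group(3)
```

With this sketch, a purely surrounding string yields its bare text as the core, a positional string comes back unchanged, and a combination string loses only its outer span while keeping the inner <b>.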
Not easily with an XML parser, no. What is the benefit really? It was once mentioned to me that being able to translate <b> and such is actually something that can be useful. Not all character sets or fonts work well with, or support, bold characters. Being able to translate that to larger type, or similar, seems useful to me. What exactly would the benefits of your proposal be? It basically means making translated markup much more complicated than it currently is. I don't see a particular need to do this now. It's not harder to translate with the markup there as-is. I think there are much more important things to deal with now, such as figuring out methods for doing translations-per-package and such, for vendors.
I totally agree with Christian Rose. I am a translator myself. There has never been a case for changing the markup as far as I know. Having the markup (at least the external one) removed from the string makes the tasks of the translators easier and less error prone.
Just to have a different opinion, I'm going to quote from http://live.gnome.org/GnomeI18nDeveloperTips :

  Following is a list of examples that need to be marked for translation, but were not in some cases: [...] "<b>%s</b>": That is an innocent way to mark something to make it boldface in the interface, to emphasize importance or make it a header. But not every language has a concept of modern boldface typefaces, or even if it has such fonts, they may not be the preferred font for such kind of emphasis.
I don't know about other languages, but I'm certain Vietnamese doesn't need to change those tags. If it can't be done for all languages, can I mark Vietnamese translations so that intltool ignores the tags?
In this theoretical case of a language that would want different markup, I'd hope that GTK+/pango would know what to do. If the language doesn't have bold typefaces then surely it just won't try to show what it can't show. I guess Owen Taylor or Behdad will know if this is a real need.
FWIW, I support the request :) Any font substitution belongs to other places.
Just to make it clear to everybody, and most of all to Rodney: I proposed to Christian that he report a bug against intltool. The thing is that the solution in other places is not happening, and while intltool is designed to help developers have their software translated, it's also there to help translators do the work. And as indicated by the sheer number of interested parties (most of whom are translators; you can also check the support for this on the gnome-i18n list), it will really help them.

So, the idea is this:
1. We have to provide the same PO/MO files to the program as we are providing now, otherwise the program will break (end up untranslated).
2. We have to provide a "stripped" version of the PO files to translators.

Problems with this approach:
1. intltool-merge creates a cache directly from the PO files; it needs to be changed to use unstripped content when creating the cache.
2. We'd have to modify the makefile rules to create MO files from unstripped PO files.

So, the solution I imagine is to write a simple program which will strip surrounding markup from PO files, and which will reinsert it (based on an unstripped POT file). This should also work with standard I/O, so that we can connect it seamlessly with current stuff. We can reuse the PO parsing stuff from intltool-merge (where it creates the cache).

I understand that this will complicate intltool logic a bit, but I really feel it's worth it. Granted, I plan to work on this myself, so I'd also like comments from anyone familiar with intltool internals (yes, Rodney, that means you ;).
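
The strip and reinsert round trip described in this comment could look roughly like the Python sketch below. It is only an illustration under stated assumptions, not intltool code: the names peel, strip_msgids and reinsert are hypothetical, catalogs are modelled as plain Python lists and dicts rather than parsed PO/POT files, and the wrapper detection is a deliberately conservative regular expression that refuses to peel a tag if the same closing tag also appears inside the string. The translation "Exempeltext" is a placeholder.

```python
import re

# One opening tag at the very start and its matching closing tag at the very
# end. The tag name must be followed by whitespace or ">" so that "<bold>"
# is never mistaken for "<b>".
WRAPPED = re.compile(r"^(<([A-Za-z][\w-]*)(?:\s[^<>]*)?>)(.*)(</\2>)$", re.S)


def peel(msgid):
    """Return (prefix, core, suffix) with surrounding tags split off."""
    prefix, suffix = "", ""
    while True:
        m = WRAPPED.match(msgid)
        if not m:
            break
        open_tag, name, inner, close_tag = m.groups()
        # Conservative: "<b>a</b> and <b>b</b>" is positional, not wrapped.
        if not inner or ("</%s>" % name) in inner:
            break
        prefix, suffix, msgid = prefix + open_tag, close_tag + suffix, inner
    return prefix, msgid, suffix


def strip_msgids(pot_msgids):
    """Stripped msgids to hand to translators, duplicates merged, in order."""
    return list(dict.fromkeys(peel(msgid)[1] for msgid in pot_msgids))


def reinsert(pot_msgids, translations):
    """Map each unstripped msgid to its translation with the markup put back.

    translations maps stripped msgid -> translated text; msgids without a
    translation are skipped.
    """
    result = {}
    for msgid in pot_msgids:
        prefix, core, suffix = peel(msgid)
        if core in translations:
            result[msgid] = prefix + translations[core] + suffix
    return result


# Hypothetical data: msgids from an unstripped POT file, plus one placeholder
# translation of the stripped message.
pot = ["<b>Example Text</b>", "<i>Example Text</i>", "This text is <b>bold</b>."]
for_translators = strip_msgids(pot)
for_program = reinsert(pot, {"Example Text": "Exempeltext"})
```

Here both "<b>Example Text</b>" and "<i>Example Text</i>" collapse into a single stripped message, which is the translation reuse the request is after, and the unstripped POT msgids remain the keys of the catalog that the program sees.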
intltool switched from the GNOME to the launchpad.net infrastructure nearly three years ago:
https://mail.gnome.org/archives/gnome-i18n/2009-April/msg00275.html

The intltool product in bugzilla.gnome.org has been deprecated and closed for new bug entry since April 2009. I am now closing all remaining open reports about intltool as NOTGNOME as part of GNOME Bugzilla Housekeeping.

Reporter: If the problem that you reported here is still valid in a recent version of intltool, we kindly ask you to report it again to https://bugs.launchpad.net/intltool/ so that the intltool developers get notified about it.