GNOME Bugzilla – Bug 73339
Unnecessary markup should be removed from messages marked for translation
Last modified: 2004-12-22 21:47:04 UTC
Unnecessary markup tags should be removed from messages marked for translation. Translating things like this is pointless for many reasons:

#: capplets/mouse/gnome-mouse-properties.glade.h:3
msgid "<i>Fast</i>"
msgstr ""

1) The markup shouldn't be translated in the first place. That's a very good reason why it shouldn't be marked for translation(!). It breaks the fundamental rule that what should be marked for translation are things that should be translated, and only things that should be translated.

2) Including markup in messages marked for translation makes accidental mistranslations of markup possible (where markup is translated to some other markup, to invalid markup, or to no markup at all). This has happened many times in the past and has "interesting" effects.

3) Including markup in messages marked for translation obfuscates the messages and makes them hard to read. This is a serious problem both for translators and for the people who proofread translations.

4) Including markup in messages severely reduces the usefulness of the translation databases ("translation memories") that translators use. Because obfuscating markup clutters the real language content of the messages, there will hardly be any clean matches, or any matches at all, with existing translations of non-obfuscated messages.

5) Any type of change to the markup will make the message "fuzzy" and force a translator to verify it again, even if the real language content of the message didn't change. There have been many "false change" alerts in the past from changes of the type "<b><i>Foo</i></b>" -> "<i><b>Foo</b></i>", or from any other change where only the markup is changed, added, removed, or reordered, or where a different parameter to the markup is used. This happens quite often as developers experiment with layout changes.
So, clearly, including things that should never be translated inside messages that are marked for translation is causing trouble. What can be done about this?

Clearly, this can't be solved for messages of this type:

msgid "Current setting is <b>fast</b>"

The only solution to this would be positional markup, and that is broken for many reasons i18n-wise (it doesn't take sentence re-ordering into account, etc.).

But this type of message can be solved:

msgid "<b><i>Fast</i></b>"

Since the markup wraps the entire content, these markup tags can be removed from the actual content without any i18n implications. The majority of cases where markup is problematic for the reasons I mentioned above are these short, balanced markup messages, so this should solve the majority of problems. I'm unsure whether this should be done in glade or in intltool, though.
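As a rough illustration of the proposal above, here is a minimal Python sketch of how a tool might split a message into its wrapping markup and its translatable content. The function name `split_wrapping_markup` and the regular expression are hypothetical and are not taken from glade or intltool. The sketch is deliberately conservative: it only strips tags when nothing but plain text remains inside them, so messages with inline markup are left untouched.

```python
import re

# Hypothetical helper, not part of glade or intltool.
# Matches an opening tag such as <b> or <span weight="bold"> at the start.
_OPEN_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9_]*)(\s[^<>]*)?>")


def split_wrapping_markup(message):
    """Return (prefix, content, suffix).

    prefix and suffix hold the balanced tags that wrap the whole message;
    content is the text left for translation. If the message is not fully
    wrapped by balanced tags, it is returned unchanged as the content.
    """
    prefix, suffix, content = "", "", message
    while True:
        match = _OPEN_TAG.match(content)
        if match is None:
            break
        closing = "</" + match.group(1) + ">"
        inner = content[match.end():]
        if not inner.endswith(closing):
            break
        # Peel off one layer of wrapping markup.
        prefix += match.group(0)
        suffix = closing + suffix
        content = inner[: -len(closing)]
    # Be conservative: if any markup is left inside, do not strip anything.
    if "<" in content or ">" in content:
        return "", message, ""
    return prefix, content, suffix


# Both orderings of the same markup yield the same translatable content,
# so reordering the tags would no longer change the message to translate.
print(split_wrapping_markup("<b><i>Fast</i></b>"))
print(split_wrapping_markup("<i><b>Fast</b></i>"))
# Inline markup cannot be separated, so the message is left as it is.
print(split_wrapping_markup("Current setting is <b>fast</b>"))
```

With this approach only the content would need to go into the po file, and the prefix and suffix would be put back around the translated string at run time. Where that re-wrapping step should live is exactly the open glade-or-intltool question.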
Hmm. This is going to be a general GTK+ problem with markup, isn't it? (Not just a Glade problem.) I agree we should keep it to a minimum. I don't like markup being used to make a few labels bold or italic. That should be a theme thing, otherwise we'll end up with a mish-mash of styles all over the place. PS. It looks like intltool has a bug - it should be converting '&lt;' to '<' etc. in the po file. Is the translation working at all?
Apparently there is an effort to fix this in intltool, so moving to intltool product.
Sorry, that was a misunderstanding. We are not trying to fix this in intltool, at least not yet. I was thinking of a separate issue, where Sven changed intltool to not escape the "<" and ">" characters because of a related but different issue in Gimp XML files to be localized. We need to discuss this more in the context of glade, I think.
So we need to escape them for some file types and not for others?
For Glade, we have to use entities in our XML file for '<', '>' etc. or it would get confused with our own markup, e.g. this would confuse the parser:

<property name="label"><b>Fast</b></property>

so we have to save it as this:

<property name="label">&lt;b&gt;Fast&lt;/b&gt;</property>

But the actual string used in the interface is '<b>Fast</b>', so that should be the string in the po file. Other XML formats probably have to do the same thing, even just for including '<' characters in the strings.
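The escaping behaviour described above can be reproduced with any generic XML parser. The following is a minimal sketch using Python's standard library rather than Glade's own parser, so it only illustrates the general XML rule and says nothing about how Glade or intltool are actually implemented.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

label = "<b>Fast</b>"  # the string the interface actually uses

# Unescaped: the XML parser sees <b> as a child element of <property>,
# so the label is no longer available as plain character data.
raw = ET.fromstring('<property name="label"><b>Fast</b></property>')
print(raw.text)                       # None
print([child.tag for child in raw])   # ['b']

# Escaped: the file stores entities, and the parser hands back the
# original string, which is what belongs in the po file.
saved = '<property name="label">' + escape(label) + "</property>"
print(saved)
parsed = ET.fromstring(saved)
print(parsed.text)                    # <b>Fast</b>
```

Under that assumption, a string extractor that reads the parsed character data gets '<b>Fast</b>', while one that copies the raw file contents gets the entity form.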
Will the &lt; to < conversion be a problem for the GIMP? Or should we do this just for .glade2? files?
I wouldn't object if a < that is not part of a markup tag ended up as &lt; in the translation and was converted back to <, but I fear it will be difficult to get that right. To me that looks like a bug in the glade XML parser. Why is it confused by <property name="label"><b>Fast</b></property>, which is perfectly fine XML? Perhaps you should teach the parser about the allowed markup tags instead. I don't see what's wrong with markup in translatable messages. The whole reason for introducing GMarkup was to allow just that. It solves the problem of positional markup very elegantly. If it poses a problem for translator tools, the tools should be taught about the meaning of markup in messages.
I split the &lt; <-> < conversion problem into a bug report of its own, bug 73586.
Since this bug is so cluttered with an unrelated problem I opened a new one for the problem with markup in messages. It's in bug 97061. *** This bug has been marked as a duplicate of 97061 ***