GNOME Bugzilla – Bug 749414
I18N ambiguous-width value for CJK locales.
Last modified: 2021-06-10 20:57:16 UTC
I'd like the ambiguous-width setting to default to 'wide' for CJK locales. In the Terminal menu, "Edit" -> "Profile Preferences" -> "Compatibility" -> "Ambiguous-width characters" defaults to 'Narrow'. I think most CJK fonts render these characters as wide, so it would be good to make the default value translatable (i18n).
Created attachment 303405: Patch for org.gnome.Terminal.gschema.xml
Attached the patch.
Comment on attachment 303405: Patch for org.gnome.Terminal.gschema.xml
I'm fine with making the default translatable, but this isn't how i18n works with gsettings. Instead of translating the gschema file with intltool, you only need to change <default>'narrow'</default> to something like <!-- some translator message clearly WARNING the translator NOT to translate this to anything except EXACTLY LITERALLY 'narrow' or 'wide' INCLUDING the SINGLE QUOTES --> <default l10n="messages" context="default-cjk-utf8-ambiguous-width">'narrow'</default> (The translator message is from experience with the many broken translations of the 'Unnamed' string in the same schema file.) Also, don't change the enum values to uppercase; that breaks existing settings.
Created attachment 303410: Patch for org.gnome.Terminal.gschema.xml
Oh, I didn't notice that glib-compile-schemas supports the attributes. Updated the patch.
Comment on attachment 303410: Patch for org.gnome.Terminal.gschema.xml
Thanks.
Could you commit the patch? I lost my account when GNOME switched from cvs to git. Thanks.
I'm against such a change. The problem is that it's not the font that really matters, but how applications believe the terminal works (so that they can keep track of the cursor position). That belief comes from the value wcwidth() returns for each codepoint. Unfortunately there's no locale that defines those characters as wide. Setting the terminal to behave differently than the locale says is likely to cause pretty much every nontrivial app to fall apart big time. It's okay to have this option available as a setting, but I wouldn't like this to be the default. Ideally there would be variants of these locales where those characters are wide (https://sourceware.org/bugzilla/show_bug.cgi?id=4335) and gnome-terminal would default to whatever's described in the locale.
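To illustrate the mismatch described above, here is a minimal sketch. It assumes Python's unicodedata module as a rough stand-in for wcwidth(); the helper names cell_width and text_width are hypothetical, and real libc tables differ in details.

```python
import unicodedata


def cell_width(ch: str, ambiguous_wide: bool) -> int:
    """Approximate how many terminal cells one character occupies.

    This is only a stand-in for wcwidth(): it ignores control characters
    and treats only characters with a nonzero combining class as zero-width.
    """
    if unicodedata.combining(ch):
        return 0
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):       # Wide and Fullwidth always take 2 cells
        return 2
    if eaw == "A":              # Ambiguous: depends on who you ask
        return 2 if ambiguous_wide else 1
    return 1


def text_width(text: str, ambiguous_wide: bool) -> int:
    """Sum the cell widths of all characters in text."""
    return sum(cell_width(ch, ambiguous_wide) for ch in text)


# Two U+2460 characters followed by " $ ", similar to the reported prompt.
prompt = "\u2460\u2460 $ "
app_view = text_width(prompt, ambiguous_wide=False)      # what a wcwidth()-based app assumes
terminal_view = text_width(prompt, ambiguous_wide=True)  # what a terminal set to 'wide' advances
print(f"application thinks the cursor is at column {app_view}, "
      f"terminal has it at column {terminal_view}")
```

Whenever the two numbers differ, the application and the terminal disagree about where the cursor is, which is the failure mode this comment is concerned with.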
Created attachment 303423: Screenshot of Midnight Commander in narrow/wide modes
Created attachment 303496: Screenshot of the problem.
(In reply to Egmont Koblinger from comment #7) > Created attachment 303423 > Screenshot of Midnight Commander in narrow/wide modes I don't understand your example. Does the screenshot show the ambiguous chars? I'm attaching a screenshot to describe the problem. The char U+2460 is shown as narrow by this option. My desktop is Fedora with VL-PGothic-Regular.ttf. I think most CJK fonts show ambiguous chars as wide.
I understand your problem. Squeezing a somewhat wider glyph into a single cell (causing it to overlap with the next one) is _locally_ ugly and hard to read. Setting ambiguous to wide causes most apps to _globally_ fall apart. On mc's screenshot you should notice that the whole layout is totally wrong, probably making the application quite unusable. You've entered a couple of U+2460 (①) chars at your shell prompt (I assume it's bash) - now go ahead and remove them by pressing backspace, even that will not work incorrectly. I don't want to ship a terminal emulator that, as opposed to all the other ones, has this problem with CJK, making tons of users switch away to other emulators, and us getting plenty of bug reports about it. I understand the bug. The solution is not to change the setting in gnome-terminal, as that breaks more than it fixes. The solution would be to add the wide locales as I've linked, and then, when that's done, adjust gnome-terminal to use that information. Even when double-wide locales become somewhat widespread (e.g. some popular distros begin to patch their libc to ship these) I'd be fine making it a three-state dropdown in gnome-terminal: Default (which would be the default for all locales) would mean to take the info from the locale, in addition to force Narrow or force Wide.
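The three-state setting proposed above could be resolved roughly like this. This is only a sketch of the idea, not gnome-terminal code: the AmbiguousWidth enum, the function name, and the locale_cells parameter (what the locale's wcwidth() would report for an ambiguous character) are all hypothetical.

```python
from enum import Enum


class AmbiguousWidth(Enum):
    """Hypothetical three-state profile setting."""
    DEFAULT = "default"  # follow whatever the locale says
    NARROW = "narrow"    # force 1 cell
    WIDE = "wide"        # force 2 cells


def effective_ambiguous_cells(setting: AmbiguousWidth, locale_cells: int) -> int:
    """Return the cell count to use for ambiguous-width characters.

    locale_cells is assumed to be the width (1 or 2) that the current
    locale reports for ambiguous characters.
    """
    if setting is AmbiguousWidth.NARROW:
        return 1
    if setting is AmbiguousWidth.WIDE:
        return 2
    return locale_cells


# With today's locales (ambiguous characters are narrow), Default stays narrow.
print(effective_ambiguous_cells(AmbiguousWidth.DEFAULT, locale_cells=1))
```

With Default as the default everywhere, the terminal and wcwidth()-based applications would agree unless the user explicitly forces a value.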
> [...] even that will not work incorrectly. Err.. I meant "even that will not work correctly" :)
(In reply to Takao Fujiwara from comment #8) > I don't understand your example. The screenshot shows the ambiguous chars? Just to clarify: certain box drawing characters (e.g. the vertical bar) are of ambiguous width.
(In reply to Egmont Koblinger from comment #11) > (In reply to Takao Fujiwara from comment #8) > > > I don't understand your example. The screenshot shows the ambiguous chars? > > Just to clarify: certain box drawing characters (e.g. the vertical bar) are > of ambiguous width. Do you know the code points of the ambiguous vertical and horizontal bars?
The story is not about the vertical and horizontal bars, and not about mc. They were just a random example, the first that occurred to me. Every ambiguous width character is going to cause layout problems in pretty much every interactive application. By the way they're at www.unicode.org/charts/PDF/U2500.pdf. Not all of them are ambiguous wide, e.g. the vertical bar is, but the horizontal isn't.
(In reply to Egmont Koblinger from comment #13) > The story is not about the vertical and horizontal bars, and not about mc. > They were just a random example, the first that occurred to me. Every > ambiguous width character is going to cause layout problems in pretty much > every interactive application. I wonder if the application uses X11 fonts. wcwidth() might keep the legacy settings, but almost all CJK TrueType fonts would use the wide width, and gnome-terminal uses TrueType fonts. I'm not sure we still need to care about legacy applications by default. > By the way they're at www.unicode.org/charts/PDF/U2500.pdf. Not all of them > are ambiguous wide, e.g. the vertical bar is, but the horizontal isn't. I know g_unichar_iswide() and g_unichar_iswide_cjk() are generated from DerivedEastAsianWidth.txt, but I'm interested in which code points your app uses. I found that wcwidth() returns 1 for 0x2192, 0x2015, 0x230c and 0x2758.
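One way to check which of the code points mentioned in this thread carry the East Asian Width property "A" (ambiguous) is to query the Unicode database directly. The sketch below assumes Python's unicodedata module; the answers depend on the Unicode version bundled with the interpreter, and they say nothing about what a given libc's wcwidth() returns. The helper name east_asian_class is hypothetical.

```python
import unicodedata


def east_asian_class(code_point: int) -> str:
    """Return the East Asian Width class (e.g. 'A', 'W', 'N', 'Na') of a code point."""
    return unicodedata.east_asian_width(chr(code_point))


# Code points taken from the comments in this thread.
code_points = [0x2460, 0x2192, 0x2015, 0x230C, 0x2758, 0x2500, 0x2502]

for cp in code_points:
    name = unicodedata.name(chr(cp), "<unnamed>")
    print(f"U+{cp:04X}  {east_asian_class(cp):>2}  {name}")
```

A class of "A" means the character is narrow in a wcwidth()-based application under current locales but would take two cells in a terminal set to 'wide'.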
Plenty of terminal applications rely on the ncurses library, which in turn respects the values returned by wcwidth(). Those that don't use ncurses also most likely rely on wcwidth() to compute how many character cells a certain letter will consume. If they get it wrong, the screen's contents will most likely totally fall apart. There are probably a few exceptions, e.g. vim I believe uses its own method and has a config option to assume narrow vs. wide CJK. Vim not respecting the value returned by wcwidth() already causes problems with combining characters, see e.g. bug 535896, and that's not gnome-terminal's (or any other terminal's) bug, but vim's. The solution to this problem is to alter the values returned by wcwidth(), which in turn should be done by defining the corresponding locales. As long as the terminal behaves differently than described by wcwidth(), the vast majority of applications will work incorrectly. It's not "legacy" applications as you say, it's all the normal correct applications that use the proper way of figuring out whether a character will occupy 1 or 2 cells. For terminals, it doesn't matter at all what the font says. Applications running inside the terminal are not aware of the font being used.
> By the way they're at www.unicode.org/charts/PDF/U2500.pdf. Not all of them > are ambiguous wide, e.g. the vertical bar is, but the horizontal isn't. Correcting myself: apparently all those that are present in the screenshot are ambiguous wide; but I do recall some other box drawing ones being narrow. This particular screenshot contains the "light" characters from the U+2500..253F range, in particular, 2500, 2502, 2510, 2518, 25*4, 25*C, but it really doesn't matter. If any of the filenames or UI strings were ambiguous wide, it'd also cause further problems. Please do install and launch mc (midnight commander) with gnome-terminal's ambiguous=wide setting and try to use it (e.g. invoke the top menu with F9 and browse it). It's a total disaster. But forget mc, this was just one example. I've already pointed out that command line editing is buggy too. You could also try editing a file containing plenty of such characters in various text editors (e.g. emacs, nano, joe), I'm sure they'll all be pretty much unusable.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gnome-terminal/-/issues/7563.