GNOME Bugzilla – Bug 772890
Use wcwidth instead of g_unichar_iswide
Last modified: 2021-06-10 15:17:07 UTC
Inspired by bug 762052 and bug 772812: Every once in a while wcwidth() and g_unichar_iswide() disagree whether a character is wide or not. This can be caused by changes in the Unicode standard (and lack of versioning in these methods), or by implementation bugs. In order for apps running inside the terminal emulator not to fall apart, it's important that the app has the same belief about width as the terminal emulator itself. Apps are way more likely to use the generic wcwidth() (most of them indirectly, e.g. via ncurses) than the glib-specific g_unichar_iswide(). So if we also used the generic wcwidth(), such breakages would be less frequent. I guess the same should go for zero-width/combining chars too.
From the wcwidth docs: "The behavior of wcwidth() depends on the LC_CTYPE category of the current locale." Since that locale would be the g-t-server's locale, not the one of the programme running inside the terminal, I don't think this is the correct solution.
I think it theoretically might depend on the locale but actually in glibc it does not (at least as long as UTF-8 locales are concerned). Even the proposal to have ambiguous-is-wide locale definitions (which would differ here) was refused (or is at least put aside for now). There is no "correct solution", e.g. there's no way to guarantee full consistency across an ssh session. But most of the time I think this approach would result in a better behavior than the current one. (Also let's not forget that almost all users use a single locale only, that is, g-t-s runs with the same one as the apps inside.)
Just for convenience: here are the widths for Unicode 8.0 and 9.0; there are many differences: ftp://ftp.unicode.org/Public/8.0.0/ucd/EastAsianWidth.txt ftp://ftp.unicode.org/Public/9.0.0/ucd/EastAsianWidth.txt We keep getting reports here as well as on stackoverflow+friends...
Even if we did switch to using wcwidth, that still wouldn't guarantee the right results, since unless the terminal application is very, very Unicode-savvy, it probably takes wcwidth per-character and doesn't consider that (e.g. for emoji) a later character may change the width of the preceding character.
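To make the per-character point concrete, here is a minimal sketch in C. It is not vte or glibc code: toy_wcwidth() and takes_emoji_presentation() are hypothetical stand-ins that cover only this example, and the rule that U+FE0F widens its base to two cells is an assumption about how a sequence-aware terminal might behave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for wcwidth(): hypothetical, covers only this example. */
static int toy_wcwidth(uint32_t cp)
{
    if (cp == 0xFE0E || cp == 0xFE0F)
        return 0;               /* variation selectors occupy no cell */
    if (cp >= 0x1F600 && cp <= 0x1F64F)
        return 2;               /* emoticons, wide since Unicode 9.0 */
    return 1;                   /* everything else: one cell (simplification) */
}

/* Hypothetical check: does cp form an emoji variation sequence with U+FE0F?
 * Only U+2764 HEAVY BLACK HEART is listed here. */
static int takes_emoji_presentation(uint32_t cp)
{
    return cp == 0x2764;
}

/* What a simple application does: add up per-code-point widths. */
int naive_width(const uint32_t *cps, size_t n)
{
    assert(cps != NULL || n == 0);
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += toy_wcwidth(cps[i]);
    return total;
}

/* Sequence-aware variant (assumed behavior): U+FE0F widens a narrow base
 * that supports emoji presentation from one cell to two. */
int sequence_aware_width(const uint32_t *cps, size_t n)
{
    assert(cps != NULL || n == 0);
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        int w = toy_wcwidth(cps[i]);
        if (cps[i] == 0xFE0F && i > 0
            && takes_emoji_presentation(cps[i - 1])
            && toy_wcwidth(cps[i - 1]) == 1)
            w = 1;              /* the extra cell claimed by the base */
        total += w;
    }
    return total;
}
```

For the sequence U+2764 U+FE0F the two functions disagree (one cell versus two), which is the kind of mismatch that no per-character width function can resolve on its own.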
Buggy programs being broken doesn't exactly seem as bad as properly written programs being very broken, as is currently the case. Having to reset my terminal relatively often because of this bug is getting rather old.
I really don't get what you're trying to say. What makes a program "buggy" vs "properly written" according to your definition?
@commenter 5: If your distribution's glibc is still using Unicode <= 8 data, maybe they should also keep glib at a version that uses the same Unicode data. --- Another problem with just using the host's wcwidth is that when you e.g. ssh to another host, that one's wcwidth may still be using a different Unicode version. We could stop using glib and glibc for this, include the wcwidth data for multiple Unicode versions (<= 8 and >= 9, at least), and have a sequence to switch between them (like iTerm2 has). The vte.sh integration could then probe the host's wcwidth and emit the sequence to switch to a matching wcwidth. Still, that seems overkill to me...
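A rough sketch in C of what bundling width data for more than one Unicode version could look like. Everything here is hypothetical: the names width_version and cell_width() are made up for illustration, the tables are tiny excerpts rather than generated from EastAsianWidth.txt, and combining or control characters are ignored. How the version would actually be switched (an escape sequence, as suggested above) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical selector; a terminal would set it when asked to switch. */
typedef enum { WIDTH_UNICODE_8, WIDTH_UNICODE_9 } width_version;

typedef struct { uint32_t first, last; } cp_range;

/* Excerpt of ranges that are wide in both Unicode 8.0 and 9.0. */
static const cp_range wide_both[] = {
    { 0x1100, 0x115F },   /* Hangul Jamo initial consonants */
    { 0x4E00, 0x9FFF },   /* CJK Unified Ideographs */
    { 0xAC00, 0xD7A3 },   /* Hangul Syllables */
};

/* Excerpt of ranges that became wide in Unicode 9.0. */
static const cp_range wide_since_9[] = {
    { 0x1F600, 0x1F64F }, /* Emoticons */
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Linear search is enough for a sketch; real tables would be sorted
 * and searched with bsearch(). */
static int in_ranges(const cp_range *r, size_t n, uint32_t cp)
{
    for (size_t i = 0; i < n; i++)
        if (cp >= r[i].first && cp <= r[i].last)
            return 1;
    return 0;
}

/* Returns 2 for wide code points under the selected version, else 1. */
int cell_width(width_version v, uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (in_ranges(wide_both, COUNT(wide_both), cp))
        return 2;
    if (v == WIDTH_UNICODE_9
        && in_ranges(wide_since_9, COUNT(wide_since_9), cp))
        return 2;
    return 1;
}
```

With these excerpts, U+1F600 is one cell under WIDTH_UNICODE_8 and two cells under WIDTH_UNICODE_9, while U+4E2D is two cells under both.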
> seems overkill to me...
To me too. Hopefully it won't be frequent that the widths change. By now, I think all major Linux distros have fully switched to Unicode 9.0 (both glibc and glib), or are going to switch real soon, making this bug kinda obsolete. (That being said, I'd still prefer vte relying on wcwidth rather than g_whatever, but it's quickly losing its practical importance.)
> The vte.sh integration could then probe the host's wcwidth
which is tricky 'cause VTE_VERSION is not forwarded by ssh.
glibc's wcwidth() is about to deviate from the Unicode standard... let's hope we can stop this craziness! See https://sourceware.org/bugzilla/show_bug.cgi?id=21750, comments 5-7, for the time being.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/vte/-/issues/2350.