GNOME Bugzilla – Bug 77134
Select-by-word should conform to LC_CTYPE
Last modified: 2007-12-05 02:10:02 UTC
Package: gnome-terminal
Severity: normal
Version: 1.9.2
Synopsis: Select-by-word should conform to LC_CTYPE
Bugzilla-Product: gnome-terminal
Bugzilla-Component: general

Description of Problem: Select-by-word currently defaults to -A-Za-z0-9,./?%&#, whereas my locale says I have ÆØÅæøå as word chars too...

------- Bug moved to this database by unknown@bugzilla.gnome.org 2002-03-31 21:53 -------

Reassigning to the default owner of the component, hp@redhat.com.
I see no way to do this while also making it configurable as it currently is.
When making a new profile, read LC_*; that's all...
Not a fix, but a work-around: gnome-terminal -> preferences -> select-by-word-characters -A-Za-z0-9,./?%&#_жи-іј-џ

Doesn't catch quite everything, but it's a start. Make sure it's in ~/.gnome/Terminal as 'wordclass' under Config: I once did this only for tclasses, not for the default gnome-terminal, and spent ages wondering why it only worked in transparent (or whatever terminal class I had created) terminals. I'm mostly adding this for people who query bugzilla and need a workaround in the interim :)
Created attachment 8036 [details] using fixed(misc) 12
Created attachment 8037 [details] using fixed(misc) 13
Created attachment 8038 [details] using andale mono 72
ARG :( Wrong bug :( Should have been in bug 77130
Conceivably the right answer is to translate the .schemas file...
*** Bug 90558 has been marked as a duplicate of this bug. ***
The locale system doesn't appear to provide useful "is this a word character or not" information, so the closest I think you can get is to approximate it using the various g_unichar_isXXXXX() functions.
I'd suggest something like g_unichar_isgraph (c) && !g_unichar_ispunct (c), and then changing the select-by-word option to "additional" select-by-word characters and checking for those too.
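A rough sketch of that suggestion, for illustration only. To keep it free of dependencies it uses the standard C iswgraph()/iswpunct() as stand-ins for g_unichar_isgraph()/g_unichar_ispunct(); note that the standard functions follow the current LC_CTYPE locale, while the GLib ones use Unicode data regardless of locale. The names is_default_word_char, is_word_char and the "extra" parameter are made up here and are not taken from gnome-terminal.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Stand-in for: g_unichar_isgraph (c) && !g_unichar_ispunct (c).
 * Printable, non-space, non-punctuation characters count as word chars. */
static bool
is_default_word_char (wint_t c)
{
  return iswgraph (c) && !iswpunct (c);
}

/* "extra" plays the role of the proposed "additional select-by-word
 * characters" preference (hypothetical name); it may be NULL. */
static bool
is_word_char (wint_t c, const wchar_t *extra)
{
  if (is_default_word_char (c))
    return true;

  /* wcschr() would match the terminating L'\0', so exclude it explicitly. */
  return c != L'\0' && extra != NULL && wcschr (extra, (wchar_t) c) != NULL;
}
```

With extra set to L"-_", is_word_char (L'_', extra) is true even though '_' is punctuation, while a space is never a word character.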
Of course, this is all totally broken; the idea of "word characters" doesn't produce the right word boundaries in many languages. You really have to run the huge algorithm in Pango. What we're discussing here is the equivalent of "let's just assume all text is Latin-1".
I wouldn't say "totally". Most languages use spacing characters to separate words, and this method is pretty satisfactory for those. For others, I don't think it's worth trying to solve the problem.
I totally agree with Noah. What should we do with the preferences thing then? Anybody willing to submit a patch?
We -could- remove the preference...
I think we can close this bug. We now use g_unichar_* functions to detect word chars, and only use the preference option, if set, for the ASCII range.
We have to change the default in the /schemas/apps/gnome-terminal/profiles/Default/word_chars schema to be "" so that we get the Unicode thing by default, and that's it, no?
As I said, the setting is only used for the ASCII range. So, something like "a-zA-Z_" is better than empty, because we really want '_' as a word char, at least in the terminal.
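For anyone reading along, here is a minimal sketch of the behaviour described in the last few comments: the word_chars setting, if set, decides the ASCII range, and everything else goes to a Unicode check. This is not the actual gnome-terminal code. The WordClass type, the function names and the spec syntax handled here (single characters plus "x-y" ranges, with a leading or trailing '-' taken literally) are assumptions for illustration, and demo_is_word is only a crude placeholder for g_unichar_isgraph (c) && !g_unichar_ispunct (c).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for a parsed word_chars setting. */
typedef struct {
  bool has_spec;    /* false when the setting is empty or unset */
  bool ascii[128];  /* ASCII word characters taken from the setting */
} WordClass;

typedef bool (*UnicodeWordFunc) (unsigned int c);

/* Parse a spec such as "a-zA-Z_" or "-A-Za-z0-9,./?%&#". */
static void
word_class_init (WordClass *wc, const char *spec)
{
  memset (wc, 0, sizeof *wc);
  if (spec == NULL || spec[0] == '\0')
    return;
  wc->has_spec = true;

  for (size_t i = 0; spec[i] != '\0'; i++)
    {
      unsigned char lo = (unsigned char) spec[i];
      unsigned char hi = lo;

      /* "x-y" is a range; a '-' at the start or end is a literal. */
      if (spec[i + 1] == '-' && spec[i + 2] != '\0')
        {
          hi = (unsigned char) spec[i + 2];
          i += 2;
        }
      for (unsigned int c = lo; c <= hi && c < 128; c++)
        wc->ascii[c] = true;
    }
}

/* The setting, if set, is consulted only for ASCII; otherwise the
 * Unicode predicate decides. */
static bool
word_class_contains (const WordClass *wc, unsigned int c,
                     UnicodeWordFunc unicode_is_word)
{
  if (c < 128 && wc->has_spec)
    return wc->ascii[c];
  return unicode_is_word != NULL && unicode_is_word (c);
}

/* Placeholder predicate: ASCII alphanumerics plus the Latin-1 letters
 * U+00C0..U+00FF, excluding the multiplication and division signs. */
static bool
demo_is_word (unsigned int c)
{
  if (c < 128)
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z')
           || (c >= 'a' && c <= 'z');
  return c >= 0xC0 && c <= 0xFF && c != 0xD7 && c != 0xF7;
}
```

With the spec "a-zA-Z_", '_' is a word character and '7' is not, while U+00E6 (æ) is accepted through the Unicode path. With an empty spec, ASCII characters go through the Unicode path too, so '_' is rejected, which is the reason given above for preferring a non-empty default.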