GNOME Bugzilla – Bug 167354
Setting the language direction using a keyboard shortcut
Last modified: 2010-11-17 15:29:45 UTC
Hi, When using Hebrew in GNOME, the direction of the text is "autodetected" from the first letter of the text. This works the way you want most of the time, but sometimes it doesn't, and then it is very annoying: it makes the text almost unreadable. The best solution, IMO (which is already implemented in Qt and M$), is to let the user override this behavior by setting a hotkey to change the direction. (In Qt and M$ the hotkeys are: LShift+LCtrl = set LTR, RShift+RCtrl = set RTL; maybe GTK will need another hotkey to set it back to "autodetect".) For a simple multiline text box, this will usually set the direction of the whole text. For a rich text edit box, the hotkey will set the direction of the current line. If some text is selected, it changes the direction of all the selected lines. For simple text widgets, it's also essential to have a command in the widget's API that returns the direction of the text. I guess something like this should be sufficient: http://developer.gnome.org/doc/API/2.0/gtk/GtkEntry.html#gtk-entry-get-alignment In Israel's Linux community, KDE is considered to have better Hebrew support only because GNOME lacks this feature. As a result, people are also less motivated to translate GNOME apps. That's why I think it's very important to add this feature. Shlomil.
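For readers following along, the "autodetected from the first letter" behavior described in this report can be sketched roughly as follows. This is only an illustration in Python using the standard `unicodedata` module, not GNOME's or Pango's actual code. The function name `first_strong_direction` and its `fallback` parameter are assumptions made up for this example.

```python
import unicodedata


def first_strong_direction(paragraph, fallback="LTR"):
    """Guess a paragraph's base direction from its first strong character.

    Characters with bidi class L count as left-to-right; classes R and AL
    (Hebrew, Arabic, Persian letters) count as right-to-left. Digits,
    spaces and punctuation are not strong and are skipped.
    """
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    # No strong character at all: the caller decides (hypothetical parameter).
    return fallback
```

Under this sketch a Hebrew sentence that opens with a Latin product name, such as "GNOME" followed by Hebrew words, comes out as LTR, which is the failure case the report complains about. A leading RLM (U+200F) flips the result to RTL because RLM is itself a strong right-to-left character.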
Hi Shlomil, We believe that the autodetection approach implemented in GNOME is superior to the manual mode in other implementations. For overriding the autodetected direction, we recommend putting Unicode mark characters LRM and RLM. This has the benefit of being robust and portable, so you get the same effect when viewing your saved text file in other complying implementations. I recommend you guys come up with a place for these two marks on your keyboard, perhaps in combination with the AltGr modifier. About getting the direction of the text: in the current implementation (and any ideal implementation), different paragraphs of text may (and actually do) have different directions, so getting the direction of the text is not as easy as you suggest. I suggest you bring this issue to the attention of the ivrix list, which is the expert Israeli body in the field. Thanks, behdad
After reading your reply I still think we should use hotkeys because: 1. People will never get used to the "superior" method. I don't see myself explaining to my grandmother how to add Unicode chars. 2. It doesn't seem too difficult to fix the code to do that, since you already have the "autodetection" feature. 3. If this method is "superior" and considered the "right way to do it"(TM), then implementing autodetection is actually a bug: it is not consistent with this method of using only Unicode chars to set direction. 4. Using hotkeys to change direction is not a replacement for the Unicode chars; it's an addition to the great autodetection feature (which works well 90% of the time, as I said before). Shlomil
1. Why not implement the keys: LShift+LCtrl = set LTR RShift+RCtrl = set RTL as keys which insert the appropriate BiDi control characters? 2. Is there any reason not to require every text having Hebrew and/or Arabic characters to start with a mandatory major paragraph direction control character? This would have the effect of requiring the autodetection algorithm only for legacy text strings, which were created before this convention was instituted.
Autodetection is fine, but it gives me a lot of headaches. Many times, when writing mixed text, I want the text aligned differently from what the "superior" autodetection chooses, and I can't get that; I need to cheat it using Unicode control chars. Adding a manual override to the autodetection is a necessity for me.
Yaacov: Read Behdad's answer again. Using Unicode control characters is not "cheating", but rather "playing by the rules". Omer: 2. Assuming every paragraph begins with a control character, autodetection does exactly the right thing. What do you suggest to use instead? 1. I believe you are correct, but the semantics need some working out. "set LTR" should be implemented as "make sure the first strong char in the para is LTR, inserting a LRM or removing a RLM as necessary", and vice versa. This leaves two problems: One, the logical location of the cursor when it is visually at the beginning of the para (before or after the RLM, if one is there). I think the solution should be some "magical" way of ignoring it in that position, that is, proceed as if it weren't there, with editing preserving it except for the change-para-direction operations; but that may be too un-gnomish. Two, that the two direction-settings are not really symmetrical (with the "natural" setting, the para may change direction via normal editing, but with an "unnatural" setting, it can't). I'm not sure this is a real problem. Behdad: Read Yaacov's answer again. Doing "the right thing" feels like cheating, or hacking, and that can't be good. Shlomil: When you say "line" (in rich text) you mean "paragraph", right?
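The "set LTR" / "set RTL" semantics proposed in the comment above ("make sure the first strong char in the para is LTR, inserting a LRM or removing a RLM as necessary") can be sketched as below. This is a rough Python illustration under stated assumptions, not code from GTK+ or Pango: the names `first_strong` and `set_paragraph_direction` are hypothetical, and the sketch assumes it is acceptable to drop every LRM/RLM found at the very start of the paragraph, which may not hold if a user typed one there on purpose. It does not address the cursor-position problem raised in the same comment.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def first_strong(text):
    """Return 'LTR' or 'RTL' for the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return None


def set_paragraph_direction(paragraph, direction):
    """Return the paragraph text adjusted so it autodetects as `direction`.

    Leading marks are removed first, so repeated calls never stack up
    invisible characters. A mark is added only if the remaining text
    would not autodetect to the requested direction on its own.
    """
    if direction not in ("LTR", "RTL"):
        raise ValueError("direction must be 'LTR' or 'RTL'")
    body = paragraph.lstrip(LRM + RLM)
    if first_strong(body) == direction:
        return body  # the "natural" setting: no mark needed
    return (LRM if direction == "LTR" else RLM) + body
```

Because the leading marks are normalized on every call, pressing the same hypothetical shortcut twice leaves the text unchanged, and switching back to the natural direction removes the mark instead of adding a second one.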
Ok, lots of fun here :). Shai, thanks for moderating this bug. My opinions below: The autodetection code worked out in GNOME follows the Unicode standard as specified here: http://www.unicode.org/reports/tr9/#The_Paragraph_Level with some rules of our own to override P3. These rules were derived from discussion on the ivrix-discuss list, and Dov implemented them. A presentation can be found here: http://behdad.org/download/Presentations/bidi-layouts/ and in bug 70451. Moreover, Unicode is an encoding for plain text, so we need to make sure that the text file we save to disk will be rendered (semantically) the same by other complying implementations, so we can't just let the user change paragraph direction without changing the underlying Unicode stream. Omer, you are missing the whole point of autodetection. Prepending marks to all paragraphs is like having no autodetection, and that's just bad. I regularly type Persian text in gedit, and of course I type English. I really, really love it that I start typing Persian and it jumps to the right! Please, let's face it: people are used to bad habits because Microsoft has told them "that's it". Your grandma hardly needs to go over the rare cases where autodetection doesn't work, but as soon as she starts typing Hebrew, she needs to know how to change direction to RTL on legacy systems, but on GNOME, it just works. In other words, let's not forget that we are talking about the rare cases! Next, paragraph direction is not the only weird case a user has to overcome. The very same problem happens around parentheses (or some other neutral characters) between mixed LTR and RTL text. Again, the simplest solution is to use your keyboard and enter LRM or RLM. So learning/teaching the philosophy behind LRM and RLM is, IMHO, the easiest solution to this problem. But if you don't agree, read on. KDE has tried to be smart about parentheses and it failed miserably.
I won't go over that for now, but would like to discuss the paragraph direction override case. Shai summarized it quite well. What I do see as a resolution to this thread is to define actions that make sure the paragraph is rendered RTL/LTR, so the user can bind a shortcut to the action. But for a discussion of the impossibility of these actions, see comment 2 on bug 136529: http://bugzilla.gnome.org/show_bug.cgi?id=136529#c2 That should be enough for now. I couldn't organize my points, sorry for that.
> We believe that the autodetection approach implemented in GNOME is superior to
> the manual mode in other implementations.
In most cases, yes. However, in many cases it fails (when starting with an English word, which is rather common when talking about product names, computer terms, etc.).
> For overriding the autodetected
> direction, we recommend putting Unicode mark characters LRM and RLM.
The lyx keyboard layout has them on shift+א and shift+ט. However, it is not the default Hebrew layout for X. Also, it makes for a steeper learning curve for users migrating from other OSes (Mac, Windows). IMHO, what we should do is the following:
* When a user adds a Hebrew keyboard layout via the GNOME panel applet, we should default to the lyx variant (which, in addition to LRM/RLM, also includes Hebrew diacritics, while the default X layout for Hebrew does not).
* For ease of migrating users, map CTRL+RightShift to RTL and CTRL+LeftShift to LRM.
However, there is one more issue with Unicode control characters: at the moment, there is no option to display something in their place when needed, which makes editing and/or removing them (when editing the text) impossible. We probably should have an option of "display hidden characters" which will display something for LRM/RLM as well as CR/LF etc.
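The "display hidden characters" option suggested above could, in its simplest form, substitute a visible placeholder for each invisible character before the text is drawn. The Python sketch below only illustrates that idea; the bracketed placeholder strings and the name `reveal_hidden` are assumptions for this example and do not correspond to any existing gedit or GTK+ feature.

```python
# Map each invisible character to a visible stand-in (placeholder text is
# an arbitrary choice for this sketch).
PLACEHOLDERS = {
    "\u200e": "[LRM]",
    "\u200f": "[RLM]",
    "\r": "[CR]",
    "\n": "[LF]\n",  # keep the real line break so the layout is preserved
}
_TABLE = str.maketrans(PLACEHOLDERS)


def reveal_hidden(text):
    """Return a display-only copy of `text` with hidden characters shown."""
    return text.translate(_TABLE)
```

The result is meant for display only. The stored text is left untouched, so a saved file still contains the real marks and line endings.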
> We probably should have an option of "display hidden characters" which will display something for LRM/RLM as well as CR/LF etc.
This will be a big help for me.
> people are used to bad habits because Microsoft has told them "that's it"
I do not use MS Windows, and I like autodetection most of the time, BUT:
1. When writing HTML or LaTeX, adding Unicode chars messes things up.
2. Latin commands often start a Hebrew paragraph in HTML and LaTeX.
3. When, in the middle of some Hebrew text, I start a paragraph with a Latin letter, the paragraph jumps to the other side, and in some apps (Bluefish) it disappears.
Adding an override may not be the right way of doing it, but it will help me.
Shai: I think we have time to solve all the paragraph/line technicalities once we've reached broad consensus about this issue, which we haven't yet.
Behdad:
> we can't just let the user change paragraph direction without
> changing the underlying Unicode stream.
But that's exactly what autodetection does: it changes the direction without changing the underlying Unicode stream. I don't want to remove the autodetection feature (I like it as much as you do), I just want to be able to fix it when autodetection is wrong. Autodetection is also bad for Unicode chars: you might forget to insert them if you have autodetection. So I still don't understand why autodetecting is OK but a manual override is not.
> Your grandma hardly needs to go over the rare cases where autodetection
> doesn't work, but as soon as she starts typing Hebrew, she needs to know
> how to change direction to RTL on legacy systems, but on GNOME, it just works.
> In other words, let's not forget that we are talking about the rare cases!
These cases are not so rare. The fact is that it's the only thing that's bothering me while using GNOME. I find it very (very!) annoying. And I know I'm not the only one. Most people (including myself) would rather teach their grandmas to use KDE, where you have autodetection AND the ability to override it with hotkeys. Behdad, using Unicode chars might seem the right way; however, when I edit text, most of the time I'd like it to be text-only, simple-text data with no layout mark of any kind. I can give you several examples:
* Editing HTML: direction is set by CSS style. I don't want/need any RLM/LRM char in there.
* Searching for some words in a database: I don't want to search for the Unicode chars as well, I just want to edit some text while using the right direction.
* Filling in some forms on the web: if I enter my name in some field, I usually wouldn't like my name to be prepended with a Unicode char.
I can find more examples.
In all of these cases, all I want to do is simply set the direction manually. That's all. Using Unicode chars is good, but not always.
> Please, let's face it: people are used to bad habits because Microsoft
> has told them "that's it".
That has nothing to do with this issue. Insisting on not implementing it just because it's the way M$ and Qt work is, I'm sorry, just silly. Shlomil
The function:
    void gtk_widget_set_direction (GtkWidget *widget, GtkTextDirection dir);
    typedef enum {
        GTK_TEXT_DIR_NONE,
        GTK_TEXT_DIR_LTR,
        GTK_TEXT_DIR_RTL
    } GtkTextDirection;
exists, but most apps do not use it. Is there a way of implementing it at a lower level, so users can change the direction using some menu or key, even if the application programmer did not think about it?
Sorry, I've just checked: this function (gtk_widget_set_direction) has no effect on textview and entry widgets? Why? Am I missing something?
I've read the comment in the referenced bug, and I still don't understand the problem. Why can't you implement a keyboard shortcut that will insert LRM or RLM at the beginning of the paragraph? It will be my problem: if I want to write English aligned RTL, I'll use the shortcut to insert RLM. However, if I do want that, I *know* what I'm doing and I have good reason to override the default behavior. I really fail to see how not implementing this function helps me.
Uri: Please read my last comment again. Sometimes you just want to change the direction without inserting any chars. I gave several examples of that and I can give some more... Suppose you use Bluefish (an HTML editor) and you want to write a Hebrew title for your HTML page. The line starts with "<title>", and autodetection will detect it as LTR text. So what do you do? You insert a Unicode char, edit the line, then delete the Unicode char? (That is, if you can guess how to do that; these chars are invisible, and that's a usability issue! But let's keep that for some other bug report.) I don't understand what the problem is with implementing this feature. No one has given me a good answer as to how a hotkey override is any worse than autodetection. Again: autodetection is as bad as hotkeys regarding the underlying Unicode stream. There are many people who are irritated by this. Please, try to think about them too. Shlomil
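The "<title>" case above is easy to reproduce outside GTK+. The Python sketch below first shows why plain first-strong detection lands on LTR for such a line, then tries a deliberately naive heuristic that ignores anything that looks like a tag. The tag-skipping step is an assumption made for illustration only; nobody in this thread proposed that exact rule, a regular expression is not a real HTML parser, and the names `first_strong` and `direction_ignoring_tags` are hypothetical.

```python
import re
import unicodedata

# Naive pattern for something that looks like a markup tag.
TAG = re.compile(r"<[^>]*>")


def first_strong(text):
    """Return 'LTR' or 'RTL' for the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return None


def direction_ignoring_tags(line):
    """Detect direction from the text content only, skipping tag-like spans."""
    return first_strong(TAG.sub("", line))


# A Hebrew word wrapped in a title element; the letters are written as
# escapes so the source itself stays left-to-right.
line = "<title>\u05e9\u05dc\u05d5\u05dd</title>"
plain_result = first_strong(line)              # decided by the "t" of "title"
markup_aware_result = direction_ignoring_tags(line)
```

With plain detection the Latin letters of the tag name win, so the line is treated as LTR. Once the tags are skipped, the Hebrew content decides and the line comes out RTL, without any mark being inserted into the file.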
Ok, I can also confirm this as a "bug" from the Hebrew typing point of view. As Shoshannah noted, in Hebrew it is common to use English product names and terms within Hebrew text. So I think it would be best to retain the autodetection but also cater for people who want it off just for a specific paragraph: have a keyboard shortcut that *temporarily* tells the autodetection mechanism to suppress the next autodetected direction change. Just to demonstrate the irritation that this might cause, I have supplied some screenshots, because I sense that maybe the point of our trouble was missed :) Would such a solution be considered by the main Pango maintainers? Thanks! Sivan
Created attachment 37630: alignment screenshot
I think that there's a consensus among the Hebrew writers on this bug that user control over the direction is required. There is no consensus, however, on the required semantics. Some -- myself included -- thought an implementation via RLM/LRM was preferable, while others prefer visual-only changes. However, I think these are different use cases, different applications, or different parts of applications. I think the basic distinction is between texts to be saved and texts not to be saved (e.g. Shlomi's database search example), but it's a little more complicated. I must say I really don't understand the HTML/LaTeX editing issues (perhaps I just haven't edited enough of them in Hebrew -- I have never edited any RTL LaTeX). For HTML, I think the solution should be an HTML-aware editor, i.e. one that understands the partition into elements and the relevant element attributes and CSS. I think that, as a general strategy, we should not mix markup systems (and LRM/RLM is markup). I don't know how this would affect LaTeX, as I'm not aware of its own BiDi mechanisms. Shlomi, some of your examples are contrived, though: when would you want to enter your name in a field and have it aligned against the first strong character?
This has gotten a bit boring. I would appreciate it if people read my references before repeating their statements again.
----
Shoshannah Forbes wrote: (comment #7)
> The lyx keyboard layout has them (RLM & LRM) on shift+א and
> shift+ט. However, it is not the default Hebrew layout for X.
Then fix the Hebrew layout.
> Also, it makes a larger learning curve for users migrating
> from other OS (Mac, Windows).
We have chosen to minimize the learning curve for new users (hence autodetection) rather than for migrating users. No enhancement should sacrifice normal users for the sake of minority users, such as migrating or advanced users. You need to climb the learning hill once; after that you will indefinitely enjoy the "cool feature" GNOME has over other desktops, which is autodetection.
> For ease of migrating users, map CTRL+RightShift to RTL[sic]
> and CTRL+LeftShift to LRM.
It doesn't work, since you need to be at the beginning of the paragraph for them to change the paragraph direction. Moreover, repeated use of them will stack up lots of invisible characters.
> We probably should have an option of "display hidden
> characters" which will display something for LRM/RLM
> as well as CR/LF etc.
That's definitely worth a dedicated bugzilla number, if not already assigned.
----
Yaacov Zamir wrote: (comment #8)
> 1. When writing HTML or LaTeX, adding Unicode chars messes things up.
> 2. Latin commands often start a Hebrew paragraph in HTML and LaTeX.
I must confess I suffer a lot from these cases too, but do not forget that these are markup, not plain text. Unicode is about plain text, and gedit is about plain text. What you really want is a smarter markup handling pipeline that is aware of bidirectional scripts. You should have seen that in my presentation slides. I'll open a bug on gtksourceview requesting the feature.
> 3. When, in the middle of some Hebrew text, I start a paragraph
> with a Latin letter, the paragraph jumps to the other side, and
> in some apps (Bluefish) it disappears.
This looks like a bug. Please file it separately. Apparently, a manual override is not the best way to work around a bug.
----
Shlomi Loubaton wrote: (comment #9)
> but that's exactly what autodetection does: it changes the
> direction without changing the underlying Unicode stream.
> I don't want to remove the autodetection feature (I like it as
> much as you do), I just want to be able to fix it when
> autodetection is wrong.
I cannot stress it more: autodetection is a higher level of compliance with the Unicode standard. I cited the exact point of the standard in the opening paragraph of my previous comment. In other words: all systems are supposed to do autodetection, just like we do.
> These cases are not so rare. The fact is that it's the only thing
> that's bothering me while using GNOME. I find it very (very!)
> annoying. And I know I'm not the only one. Most people
> (including myself) would rather teach their grandmas to use
> KDE, where you have autodetection AND the ability to override
> it with hotkeys.
I still don't see why your grandma needs that. I guess your problem is with markup languages too. See above.
> Behdad, using Unicode chars might seem the right way; however,
> when I edit text, most of the time I'd like it to be text-only,
> simple-text data with no layout mark of any kind. I can give
> you several examples:
No, in a text-only environment, using LRM and RLM is your only way to choose a direction. If you don't want to use them, most probably you are not dealing with plain text, but with a higher protocol that handles direction differently. So you need an editor for this higher-level protocol, not a general text-editing widget.
> * Editing HTML: direction is set by CSS style. I don't want/need
> any RLM/LRM char in there.
Then get an HTML editor that does what you want.
> * Searching for some words in a database: I don't want to search
> for the Unicode chars as well, I just want to edit some text while
> using the right direction.
Ideally, format characters like LRM and RLM should be ignored when searching (according to the Unicode standard).
> * Filling in some forms on the web: if I enter my name in some
> field, I usually wouldn't like my name to be prepended with a
> Unicode char.
So your name mixes Latin and Hebrew characters!?
> I can find more examples. In all of these cases, all I want
> to do is simply set the direction manually. That's all.
I have yet to see one _valid_ example.
> Using Unicode chars is good, but not always.
But it's your only choice. Comments are welcome on the Unicode website, about paragraph direction or anything else. If you mean that LRM and RLM should be added/removed transparently, then that's another issue, currently postponed because of the technical complexity it introduces. Again, I referenced that in my previous comment.
> That has nothing to do with this issue. Insisting on not
> implementing it just because it's the way M$ and Qt work
> is, I'm sorry, just silly.
Please consider spending more time reading the comments and analyzing them, and also respond in a civil manner. I never said we will not implement it because other systems do that. I said that the fact that other systems do this is not enough reason to implement it.
----
Yaacov Zamir wrote: (comment #10 and comment #11)
> [gtk_widget_set_direction] exists, but most apps do not use it. Is
> there a way of implementing it at a lower level, so users can
> change the direction using some menu or key, even if the application
> programmer did not think about it?
> Sorry, I've just checked: this function (gtk_widget_set_direction)
> has no effect on textview and entry widgets? Why? Am I
> missing something?
Yes, that autodetection is used instead! BTW, this direction _is_ used by textview and entry widgets as a fallback. See my slides for more information. So, one thing we can implement is a submenu (and API calls, of course) to choose between autodetection/LTR/RTL directions.
But note that choosing LTR/RTL will force the paragraph direction for the whole widget (view/buffer), not a single paragraph.
----
Uri David Akavia wrote: (comment #12)
> Why can't you implement a keyboard shortcut that will insert
> LRM or RLM at the beginning of the paragraph? It will be my
> problem: if I want to write English aligned RTL, I'll use the
> shortcut to insert RLM. However, if I do want that, I *know*
> what I'm doing and I have good reason to override the default
> behavior.
I'm against providing a shortcut for inserting LRM/RLM at the beginning of the paragraph, because after that we need to maintain these marks too. They should not just stack up in piles at the beginning of the paragraph. See above.
> I really fail to see how not implementing this function helps me.
We have not reached any solution which is technically possible right now.
----
Shlomi Loubaton in comment #13 repeated what he has been saying in other comments. See above for answers.
----
Some other comments skipped because of frustration
----
Shai Berger wrote: (comment #16)
> I think that there's a consensus among the Hebrew writers on this
> bug that user control over the direction is required. There is no
> consensus, however, on the required semantics.
But it doesn't mean anything. Again, I suggest bringing the issue to the ivrix-discuss mailing list. At the least, Dov Grobgeld should comment on this issue before any bit of code is changed.
> I think the basic distinction is between texts to be saved and
> texts not to be saved (e.g. Shlomi's database search example),
> but it's a little more complicated.
It's not as black-and-white a distinction as you draw. Text can and will be copied around, from display-only buffers to savable buffers and vice versa.
Ok, done. I'm not responding to any more reiteration of the same words and concepts.
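On the search point in the comment above (format characters such as LRM and RLM should ideally be ignored when searching), a minimal Python sketch of that idea is to remove the bidi format characters from both strings before comparing them. The explicit character list and the names `strip_bidi_marks` and `contains_ignoring_marks` are choices made for this example. A real search implementation would more likely rely on Unicode collation rules than on a hand-written list.

```python
# Directional format characters only. Other format characters such as
# ZWNJ are left alone, since they can carry meaning in Persian text.
BIDI_FORMAT_CHARS = (
    "\u200e\u200f"                      # LRM, RLM
    "\u061c"                            # ALM
    "\u202a\u202b\u202c\u202d\u202e"    # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"          # LRI, RLI, FSI, PDI
)
_STRIP_TABLE = {ord(ch): None for ch in BIDI_FORMAT_CHARS}


def strip_bidi_marks(text):
    """Return `text` with directional format characters removed."""
    return text.translate(_STRIP_TABLE)


def contains_ignoring_marks(haystack, needle):
    """Substring test that treats directional marks as if they were absent."""
    return strip_bidi_marks(needle) in strip_bidi_marks(haystack)
```

With a helper like this, a database value stored with a leading RLM would still match a query typed without one, which is the behavior the database-search example in this thread asks for.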
Just for the record, Dov Grobgeld has in fact commented on this issue in ivrix-discuss, see http://article.gmane.org/gmane.linux.region.israel.ivrix.discuss/1008
Oh, OK, then let's have some discussion on the list; we can summarize here later.
This has nothing to do with Pango. A) Decide on the user behavior you want B) Figure out how to implement it in GTK+ C) If B) requires a Pango change, then file a new bug for that My opinion about the Pango interaction is that if someone edits a bunch of text in GtkTextView, the result should be a unicode string that when loaded into a PangoLayout should appear identically. That is, there should be no special non-textual attributes in the GtkTextView used to maintain paragraph direction.
Thanks Owen. That's exactly my point too.
This was mentioned now in https://bugs.launchpad.net/ubuntu/+source/empathy/+bug/571822 (Empathy allign RTL text as LTR text on messenger) Where we wanted to decide which is the best possible way to handle RTL (RL=1 flag or the BiDi marks). Blessings, Shahar