Bug 167354 - Setting the language direction using a keyboard shortcut
Status: RESOLVED NOTABUG
Product: pango
Classification: Platform
Component: general
Version: unspecified
Hardware: Other All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: pango-maint
pango-maint
Depends on:
Blocks:
 
 
Reported: 2005-02-14 14:41 UTC by Shlomi Loubaton
Modified: 2010-11-17 15:29 UTC
See Also:
GNOME target: ---
GNOME version: Unversioned Enhancement


Attachments
alignment screenshot (50.04 KB, image/png)
2005-02-18 00:38 UTC, Sivan Greenberg

Description Shlomi Loubaton 2005-02-14 14:41:56 UTC
Hi,
When using Hebrew in GNOME, the direction of the text is "autodetected" from the
first letter of the text. This works the way you want most of the time, but
sometimes it doesn't, and it's very annoying when it doesn't: it makes the text
almost unreadable.
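
A minimal sketch of the first-letter detection described above, in Python. It is a simplification of rules P2/P3 of the Unicode Bidirectional Algorithm: it ignores isolates and any GNOME-specific rules, and the function name `first_strong_direction` is hypothetical, not a Pango or GTK call. Hebrew text is written with escapes so the source reads left to right.

```python
import unicodedata


def first_strong_direction(text):
    """Return "LTR" or "RTL" from the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right (Latin, LRM, ...)
            return "LTR"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic, RLM)
            return "RTL"
    return None                          # only neutral or weak characters


hebrew_word = "\u05e9\u05dc\u05d5\u05dd"  # U+05E9 U+05DC U+05D5 U+05DD

print(first_strong_direction(hebrew_word + " world"))   # RTL
print(first_strong_direction("GNOME " + hebrew_word))   # LTR
print(first_strong_direction("123 ..."))                # None
```

The second call is the failure case this report is about: a Hebrew sentence that opens with a Latin product name is detected as LTR.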

The best solution, IMO (which is already implemented in QT and M$), is to let the
user override this behavior by setting a hotkey to change the direction.

(In QT and M$ the hotkeys are:
LShift+LCtrl = set LTR
RShift+RCtrl = set RTL
Maybe GTK will need another hotkey to set it back to "autodetect".)

For a simple multiline text box, this will usually set the direction of the
whole text. For a rich text edit box, the hotkey will set the direction of the
current line. If some text is selected, it changes the direction of all the
selected lines.

For simple text widgets, it's also essential to have a call in the widget's API
that returns the direction of the text. I guess something like this
should be sufficient:
http://developer.gnome.org/doc/API/2.0/gtk/GtkEntry.html#gtk-entry-get-alignment

In Israel's Linux community, KDE is considered to have better Hebrew support
only because GNOME lacks this feature. As a result, people are also less
motivated to translate GNOME apps. That's why I think it's very important to add
this feature.

Shlomil.
Comment 1 Behdad Esfahbod 2005-02-14 21:44:51 UTC
Hi Shlomil,

We believe that the autodetection approach implemented in GNOME is superior to
the manual mode in other implementations.  For overriding the autodetected
direction, we recommend inserting the Unicode mark characters LRM (U+200E) and
RLM (U+200F).  This has the benefit of being robust and portable, so you get the
same effect when viewing your saved text file in other compliant
implementations.  I recommend you guys come up with a place for these two marks
on your keyboard, perhaps in combination with the AltGr modifier.
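
As a rough illustration of the mark-based override recommended above, the Python sketch below prefixes a paragraph with RLM and shows that a simplified first-strong detector then reports RTL. The detector `first_strong_direction` is a hypothetical helper approximating rules P2/P3 of the Unicode Bidirectional Algorithm; it is an assumption for illustration, not Pango's actual code.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, bidi class L
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class R


def first_strong_direction(text):
    """Return "LTR" or "RTL" from the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return None


# "GTK" followed by a Hebrew word (U+05E9 U+05DC U+05D5 U+05DD).
paragraph = "GTK \u05e9\u05dc\u05d5\u05dd"

print(first_strong_direction(paragraph))        # LTR
print(first_strong_direction(RLM + paragraph))  # RTL

# The mark is part of the text itself, so it is saved with the file.
saved = (RLM + paragraph).encode("utf-8")
print(saved.startswith(b"\xe2\x80\x8f"))        # True
```

Because the override lives in the character stream, any compliant renderer that reads the saved bytes should resolve the same paragraph direction.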

About getting the direction of the text, in current implementation (and any
ideal implementation), different paragraphs of text may (and actually do) have
different directions, so getting the direction of the text is not as easy as you
suggest.
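
To illustrate why a single "direction of the text" is not well defined, here is a small Python sketch that resolves one direction per paragraph. It assumes paragraphs are separated by newlines and uses a simplified first-strong rule; both function names are hypothetical.

```python
import unicodedata


def first_strong_direction(text):
    """Return "LTR" or "RTL" from the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return None


def paragraph_directions(text):
    """Resolve one direction per newline-separated paragraph."""
    return [first_strong_direction(p) for p in text.split("\n")]


# An English line, a Hebrew line, and a line with digits only.
buffer_text = "Hello\n\u05e9\u05dc\u05d5\u05dd\n1234"
print(paragraph_directions(buffer_text))  # ['LTR', 'RTL', None]
```

A getter that returns a single value for the whole widget would have to pick one of these answers or report something else entirely.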

I suggest you bring this issue to the attention of the ivrix list, which is the
expert Israeli body in the field.

Thanks,
behdad
Comment 2 Shlomi Loubaton 2005-02-14 22:38:53 UTC
After reading your reply I still think we should use hotkeys because:

1. People will never get used to the "superior" method. I don't see myself
explaining to my grandmother how to add Unicode chars.

2. It doesn't seem too difficult to change the code to do that, given that you
already have the "autodetection" feature.

3. If this method is "superior" and considered the "right way to do it"(TM),
then implementing autodetection is actually a bug: it is not consistent with
this method of using only Unicode chars to set direction.

4. Using hotkeys to change direction is not a replacement for the Unicode chars,
it's an addition to the great autodetection feature (which works well 90% of
the time, as I said before).

Shlomil
Comment 3 Omer Zak 2005-02-14 23:26:34 UTC
1. Why not implement the keys:
LShift+LCtrl = set LTR
RShift+RCtrl = set RTL
as keys that insert the appropriate BiDi control characters?

2. Is there any reason not to require every text having Hebrew and/or Arabic
characters to start with a mandatory major paragraph direction control character?
This will have the effect of requiring the usage of the autodetection algorithm
only on legacy text strings, which were created before this convention was
instituted.
Comment 4 Yaacov Zamir 2005-02-14 23:38:36 UTC
Autodetection is fine, but it gives me a lot of headaches. A lot of the time,
when writing mixed text, I want text that is not aligned the "superior"
autodetection way, and I can't; I need to cheat using Unicode control chars.

Adding a manual override to the autodetection is a necessity for me.
Comment 5 Shai Berger 2005-02-15 00:40:29 UTC
Yaacov: 

Read Behdad's answer again. Using Unicode control characters is not "cheating",
but rather "playing by the rules".

Omer:

2. Assuming every paragraph begins with a control character, autodetection does
exactly the right thing. What do you suggest to use instead?

1. I believe you are correct, but the semantics need some working out. "set LTR"
should be implemented as "make sure the first strong char in the para is LTR,
inserting a LRM or removing a RLM as necessary", and vice versa. This leaves two
problems: One, the logical location of the cursor when it is visually at the
beginning of the para (before or after the RLM, if one is there). I think the
solution should be some "magical" way of ignoring it in that position, that is,
proceed as if it weren't there, with editing preserving it except for the
change-para-direction operations; but that may be too un-gnomish. Two, that the
two direction-settings are not really symmetrical (with the "natural" setting,
the para may change direction via normal editing, but with an "unnatural"
setting, it can't). I'm not sure this is a real problem.
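
The "set LTR" / "set RTL" semantics proposed in point 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: `set_paragraph_direction` and `first_strong_direction` are hypothetical names, the detector is a simplified first-strong rule, and all leading marks are treated as belonging to the operation.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def first_strong_direction(text):
    """Return "LTR" or "RTL" from the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return None


def set_paragraph_direction(paragraph, wanted):
    """Return the paragraph with at most one leading mark so it resolves to `wanted`."""
    if wanted not in ("LTR", "RTL"):
        raise ValueError("wanted must be 'LTR' or 'RTL'")
    # Drop any run of leading marks first, so repeated use never stacks them.
    body = paragraph.lstrip(LRM + RLM)
    if first_strong_direction(body) == wanted:
        return body  # natural direction already matches; no mark needed
    return (LRM if wanted == "LTR" else RLM) + body


paragraph = "GTK \u05e9\u05dc\u05d5\u05dd"  # Latin first, then Hebrew
forced_rtl = set_paragraph_direction(paragraph, "RTL")

print(forced_rtl == RLM + paragraph)                             # True
print(set_paragraph_direction(forced_rtl, "RTL") == forced_rtl)  # True
print(set_paragraph_direction(forced_rtl, "LTR") == paragraph)   # True
```

Stripping every leading mark is a deliberate simplification: it also discards marks the user typed by hand, and it leaves open the cursor-position question raised above.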

Behdad:

Read Yaacov's answer again. Doing "the right thing" feels like cheating, or
hacking, and that can't be good.

Shlomil:

When you say "line" (in rich text) you mean "paragraph", right?
Comment 6 Behdad Esfahbod 2005-02-15 10:23:42 UTC
Ok, lots of fun here :).
Shai, thanks for moderating this bug.  My opinions below:

The autodetection code in GNOME follows the Unicode standard, as
specified here:

  http://www.unicode.org/reports/tr9/#The_Paragraph_Level

with some rules of our own that override P3.  These rules were derived from
discussion on the ivrix-discuss list, and Dov implemented them.  A presentation
can be found here:

  http://behdad.org/download/Presentations/bidi-layouts/

and in bug 70451.

Moreover, Unicode is an encoding for plain text, so we need to make sure that
the text file that we save to the disk, will be rendered (semantically) the same
using other complying implementations, so, we can't just let user change
paragraph direction without changing the underlying Unicode stream.

Omer, you are missing the whole point of autodetection.  Prepending all
paragraphs with marks is like having no autodetection, and that's just bad.  I
regularly type Persian text in gedit, and of course I type English.  I really
really love it that I start typing Persian and it jumps to the right!  Please
let's face it: people are used to bad habits because Microsoft has told them
"that's it".  Your grandma hardly needs to go over the rare cases where
autodetection doesn't work, but as soon as she starts typing Hebrew, she needs
to know how to change direction to RTL on legacy systems, but on GNOME, it just
works.  In other words, let's not forget that we are talking about the rare cases!

Next, paragraph direction is not the only weird case a user has to overcome.
The very same problem happens around parentheses (or some other neutral
characters) between mixed LTR and RTL text.  Again, the simplest solution is to
use your keyboard and enter LRM or RLM.  So learning/teaching the philosophy
behind LRM and RLM is IMHO the easiest solution to this problem.  But if you
don't agree, read on.

KDE has tried to be smart about parentheses and it failed miserably.  I won't go
over that for now, but would like to discuss the paragraph direction override
case.  Shai summarized it quite well.  What I do see as a resolution to this
thread is to define actions that make sure the paragraph is rendered RTL/LTR, so
the user can bind a shortcut to the action, but for a discussion on the
impossibility of these actions, see comment 2 on bug 136529:

  http://bugzilla.gnome.org/show_bug.cgi?id=136529#c2

That should be enough for now.  I couldn't organize my points, sorry for that.
Comment 7 Shoshannah Forbes 2005-02-15 12:20:01 UTC
>We believe that the autodetection approach implemented in GNOME is superior to
>the manual mode in other implementations.  

In most cases, yes. However, in many cases it fails (when starting with an
English word, which is rather common when talking about product names, computer
terms, etc.).

>For overriding the autodetected
>direction, we recommend putting Unicode mark characters LRM and RLM. 

The lyx keyboard layout has them in shift+א and shift+ט. However, it is not the
default Hebrew layout for X. Also, it makes a larger learning curve for users
migrating from other OSes (Mac, Windows).

IMHO, what we should do is the following:
* When a user adds a Hebrew keyboard layout via the GNOME panel applet, we
should default to using the lyx variant (which, in addition to LRM/RLM, also
includes Hebrew diacritics, while the default X layout for Hebrew does not).
* For ease of migrating users, map CTRL+RightShift to RTL and CTRL+LeftShift  to
LRM.

However, there is one more issue with Unicode control characters: at the
moment, there is no option to display something for them when needed, which makes
editing and/or removing them (when editing the text) impossible. We
probably should have an option of "display hidden characters" which will
display something for LRM/RLM as well as CR/LF etc.
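
A "display hidden characters" option could look roughly like the Python sketch below. It only shows the substitution idea; the placeholder strings and the function name `reveal_hidden` are arbitrary choices for illustration, not an existing GTK or gedit feature.

```python
# Visible stand-ins for characters that normally render as nothing.
# The placeholder strings are arbitrary choices for this sketch.
VISIBLE = {
    "\u200e": "[LRM]",
    "\u200f": "[RLM]",
    "\r": "[CR]",
    "\n": "[LF]\n",  # keep the real line break after the placeholder
}


def reveal_hidden(text):
    """Replace invisible marks and line endings with visible placeholders."""
    return "".join(VISIBLE.get(ch, ch) for ch in text)


sample = "\u200fGTK\r\n"
print(repr(reveal_hidden(sample)))  # '[RLM]GTK[CR][LF]\n'
```

A real widget would draw a glyph in place rather than change the buffer, so that the stored text stays untouched.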
Comment 8 Yaacov Zamir 2005-02-15 12:38:22 UTC
>We probably should have an option of "display hidden characters" which will
display something for LRM/RLM as well as CR/LF etc.

This will be a big help for me.

>people are used to bad habits because Microsoft has told them "that's it"

I do not use MS Windows, and I like autodetection most of the time,
BUT:
1. When writing HTML or LaTeX, adding Unicode chars messes things up.
2. Latin commands often start a Hebrew paragraph, in HTML and LaTeX.
3. When in the middle of some Hebrew text I start a paragraph with a Latin
letter, the paragraph jumps to the other side, and in some apps (bluefish) it
disappears.

Adding an override may not be the right way of doing it, but it will help me.
 
Comment 9 Shlomi Loubaton 2005-02-15 12:45:59 UTC
Shai:
I think we have time to solve all the paragraph/line technicalities once we've
reached broad consensus about this issue, which we haven't yet.

Behdad:

> we can't just let user change paragraph direction without 
> changing the underlying Unicode stream.

But that's exactly what auto detection does: it changes the direction without
changing the underlying Unicode stream. I don't want to remove the auto
detection feature (I like it as much as you do), I just want to be able to fix
it when auto detection is wrong.

Auto detection is also bad for Unicode chars: you might forget to insert them if
you have auto detection. So I still don't understand why auto detection is OK
but manual override is not.

> Your grandma hardly needs to go over the rare cases that auto detection
> doesn't work, but as soon as she starts typing Hebrew, she needs to know about
> how to change direction to RTL on legacy systems, but on GNOME, it just works.
> In other words, lets not forget that we are talking about the rare cases!

These cases are not so rare. The fact is that it's the only thing that's
bothering me while using GNOME. I find it very (very!) annoying. And I know I'm
not the only one. Most people (including myself) would rather teach their
grandmas to use KDE, where you have auto detection AND the ability to override by
hot keys.

Behdad, using Unicode chars might seem the right way, however when I edit text,
most of the time I'd like it to be text-only, simple-text data with no layout
mark of any kind. I can give you several examples:
* Editing HTML: direction is set by CSS style. I don't want/need any RLM/LRM
char in there.
* Searching for some words in a database: I don't want to search for the
Unicode chars as well, I just want to edit some text while using the right
direction.
* Filling in some forms on the web: if I enter my name in some field, I usually
wouldn't like my name to be prepended with a Unicode char.
...
I can find more examples. In all of these cases, all I want to do is
simply set the direction manually. That's all.
Using Unicode chars is good, but not always.

> "Please lets face it: people are used to bad habits because Microsoft 
> has told them "that's it".

That has nothing to do with this issue. Insisting on not implementing it just
because it's the way M$ and QT work, I'm sorry, but that's just silly.

   Shlomil
Comment 10 Yaacov Zamir 2005-02-15 13:00:49 UTC
The function:
gtk_widget_set_direction (GtkWidget *widget, GtkTextDirection dir); 

typedef enum
{
  GTK_TEXT_DIR_NONE,
  GTK_TEXT_DIR_LTR,
  GTK_TEXT_DIR_RTL
} GtkTextDirection;

exists, but most apps do not use it. Is there a way of implementing it at a
lower level, so users can change direction using some menu or key, even if the
application programmer did not think about it?
Comment 11 Yaacov Zamir 2005-02-15 13:28:30 UTC
Sorry, I've just checked: this function (gtk_widget_set_direction) has no effect
on textview and entry widgets. Why? Am I missing something?
Comment 12 Uri David Akavia 2005-02-17 18:11:27 UTC
I've read the comment in the referred bug, and I still don't understand the problem.

Why can't you implement a keyboard shortcut that will insert LRM or RLM at the
beginning of the paragraph? It will be my problem: if I want to write English
aligned RTL, I'll use the shortcut to insert RLM. However, if I do want that, I
*know* what I'm doing and I have good reason to override the default behavior.

I really fail to see how not implementing this function helps me.
Comment 13 Shlomi Loubaton 2005-02-17 22:54:21 UTC
Uri :
Please read my last comment again. Sometimes you just want to change the
direction without inserting any chars. I gave several examples for that and I
can give some more...

Suppose you use Bluefish (an HTML editor) and you want to write a Hebrew title for
your HTML page. The line starts with "<title>" and autodetect will detect it as
LTR text. So what do you do? You insert a Unicode char, edit the line, then
delete the Unicode char? (That is, if you can guess how to do that; these chars
are invisible and that's a usability issue! But let's keep that for some other
bug report.)

I don't understand what the problem is with implementing this feature.
No one gave me a good answer: how is hotkey override any worse than
autodetection?
Again: autodetection is as bad as hotkeys regarding the underlying Unicode stream.

There are many people who are irritated by this. 
please, try to think about them too.


Shlomil
Comment 14 Sivan Greenberg 2005-02-18 00:03:26 UTC
OK, I can also confirm this as a "bug" from the Hebrew typing point of view. As
Shoshannah noted, in Hebrew it is common to use English product and term
names within Hebrew text. So what I think would be best is to retain
autodetection, but also cater for the people who want it off just for
a specific paragraph: have a key shortcut that *temporarily* tells the
autodetection mechanism to suppress the next autodetected direction change.

Just to demonstrate the irritation that this might cause I have supplied some
screenshots, because I sense maybe the point of our trouble was missed :)

Would such a solution be considered by the main pango maintainers?

Thanks!

Sivan
Comment 15 Sivan Greenberg 2005-02-18 00:38:06 UTC
Created attachment 37630 [details]
alignment screenshot
Comment 16 Shai Berger 2005-02-18 01:26:22 UTC
I think that there's a consensus among the Hebrew writers on this bug that user
control over the direction is required. There is no consensus, however, on the
required semantics. Some -- myself included -- thought an implementation via
RLM/LRM was preferable, while others prefer visual-only changes. However, I
think these are different use-cases, different applications or different parts
in applications. I think the basic distinction is between texts to be saved and
texts not to be saved (e.g. Shlomi's database search example), but it's a little
more complicated.

I must say I really don't understand the HTML/LaTeX editing issues (perhaps I
just haven't edited enough of them in Hebrew -- I have never edited any RTL
LaTeX). For HTML, I think the solution should be an HTML-aware editor -- i.e.
one that understands the partition to elements and the relevant element
attributes and CSS. I think as a general strategy, we should not mix markup
systems (and LRM/RLM is markup). I don't know how this would affect LaTeX, as
I'm not aware of its own BiDi mechanisms.

Shlomi, some of your examples are contrived, though: When would you want to
enter your name in a field and have it aligned against the first strong character?
Comment 17 Behdad Esfahbod 2005-02-22 00:37:23 UTC
This got a bit boring.  I would appreciate it if people read my references before
repeating their statements again.

---- Shoshannah Forbes wrote: (comment #7)

>The lyx keyboard layout has them (RLM&LRM) in shift+א and
> shift+ט. However, it is not the default Hebrew layout for X.

Then fix the Hebrew layout.

> Also, it makes a larger learning curve for users migrating
> from other OS (Mac, Windows).

We have chosen to minimize the learning curve for new users (so autodetecting)
rather than for migrating users.  No enhancement should sacrifice the normal
users over minority users, like migrating or advanced users.  You need to climb
the learning hill once, after that you would indefinitely enjoy the "cool
feature" GNOME has over other desktops, which is autodetection.

>  For ease of migrating users, map CTRL+RightShift to RTL[sic] 
> and CTRL+LeftShift  to LRM.

It doesn't work, since you need to be at the beginning of the paragraph for them
to change the paragraph direction.  Moreover, repeated use of them will stack up
lots of invisible characters.

> We probably should have an option of "display hidden
> characters" which will display something for LRM/RLM
> as well as CR/LF etc.

That's definitely worth a dedicated bugzilla number, if not already assigned.


---- Yaacov Zamir wrote: (comment #8)

> 1. When writing html or latex, adding Unicode chars messes things up.
> 2. Latin commands often start a Hebrew paragraph, in html and latex.

I must confess I suffer a lot from these cases too, but do not forget that these
are markup, not plain text.  Unicode is about plain text, and gedit is about
plain text.  What you really want is a smarter markup handling pipeline that is
aware of bidirectional scripts.  You should have seen that in my presentation
slides.  I'll open a bug on gtksourceview requesting the feature.

> 3. When in the middle of some hebrew text I start a paragraph 
> with a Latin letter the paragraph jumps to the other side, and
> on some apps (bluefish) it disappears.

This looks like a bug.  Please file separately.  Apparently, a manual override is
not the best solution for working around a bug.


---- Shlomi Loubaton wrote: (comment #9)

> but that's exactly what auto detection does - it changes the 
> direction without changing the underlying Unicode stream.
> I don't want to remove the auto detection feature (i like it as
> much as you do), I just want to be able to fix it when auto
> detection is wrong.

I cannot stress it more: autodetection is a higher level of complying to the
Unicode standard.  I cited the exact point of the standard in the beginning
paragraph of my previous comment.  In other words: all systems are supposed to
do autodetection, just like we do.

> These cases are not so rare. The fact is that it's the only thing
> that's bothering me while using GNOME. I find it very (very!)
> annoying. And i know i'm not the only one. Most people
> (including myself) would rather teaching their grandmas to use
> KDE - where you have auto detection AND ability to override
> by hot keys. 

I still don't see why your grandma needs that.  I guess your problem is with markup
languages too.  See above.

> Behdad, using Unicode chars might seem the right way, however
> when i edit text, most of the time i'd like it to be text-only,
> simple-text data with no layout mark of any kind. i can give
> you several example for it: 

No, in a text-only environment, using LRM and RLM is your only way to choose
a direction.  If you don't want to use them, most probably you are not dealing
with plain text, but with a higher-level protocol that handles direction
differently.  So you need an editor for this higher-level protocol, not a generic
text-editing widget.

> * editing HTML: direction is set by CSS style. i don't want/need
>  any RLM/LRM char in there.

Then get an HTML editor that does what you want.

> * searching for some words in a database : i don't want to search
> for the Unicode chars as well , i just want to edit some text while
> using the right direction.

Ideally format characters like LRM and RLM should be ignored when searching
(according to the Unicode standard.)
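
A search that ignores direction marks, as suggested above, might be sketched in Python like this. The set of characters to skip is an assumption of this sketch (the bidi marks, embeddings, overrides, and isolates), and the function names are hypothetical.

```python
# Bidi format controls: ALM, LRM, RLM, embeddings/overrides, and isolates.
BIDI_CONTROLS = set(
    "\u061c\u200e\u200f"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)


def strip_bidi_controls(text):
    """Remove bidi format characters, leaving all other text intact."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


def contains_ignoring_marks(haystack, needle):
    """Substring test that treats bidi format characters as absent."""
    return strip_bidi_controls(needle) in strip_bidi_controls(haystack)


# An LRM after "GTK" keeps the parenthesis on the expected side when rendered.
stored = "GTK\u200e (\u05e9\u05dc\u05d5\u05dd)"
query = "GTK ("

print(query in stored)                         # False
print(contains_ignoring_marks(stored, query))  # True
```

The naive substring test fails because the invisible mark sits between "GTK" and the space; stripping the controls on both sides makes the match independent of how the text was marked up for display.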

> * filling some forms on the web: if i enter my name in some
> field , i usually would'nt like my name to prepend with a
> Unicode char.

So your name mixes Latin and Hebrew characters!?

> i can find more examples for it. In all of these cases , all i want 
> to do is simply set the direction manually. that's all.

I'm still to see one _valid_ example.

> Using Unicode chars is good , but not always.

But it's your only choice.  Comments are welcome on the Unicode website, about
paragraph direction, anything.  If you mean that LRM and RLM should be
added/removed transparently, then that's another issue, currently postponed
because of the technical complexity it introduces.  Again, I referenced that in my
previous comment.

> that has nothing to do with this issue. insisting on not
> implementing it just because it's the way M$ and QT work ,
> i'm sorry but that's just silly.

Please consider spending more time reading the comments and analyzing them, and
also respond in a civil manner.  I never said we will not implement it because
other systems do that.  I said that just because other systems are doing this is
not enough reason to implement it.


---- Yaacov Zamir wrote: (comment #10 and comment #11)

> [gtk_widget_set_direction] exists, but most apps do not use it. Is
> there a way of implementing it at a lower level, so users can
> change dir using some menu or key, even if the application
> programmer did not think about it ?
> Sorry, I've just checked, this function (gtk_widget_set_direction)
> has no effect on textview and entry widgets ? why ? am I 
> missing something ?

Yes, that autodetection is used instead!  BTW, this direction _is_ used by
textview and entry widgets as a fallback.  See my slides for more information. 
So, one thing we can implement is a submenu (and API calls of course) to choose
between autodetection/LTR/RTL directions.  But note that choosing LTR/RTL will
force the paragraph direction for the whole widget (view/buffer), not a single
paragraph.
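
The three-way choice described above (autodetection with the widget direction as fallback, or a forced LTR/RTL for the whole widget) can be sketched in Python. This is not GTK's API: `resolve_direction`, its `mode` parameter, and the simplified detector are all assumptions made for illustration.

```python
import unicodedata


def first_strong_direction(text):
    """Return "LTR" or "RTL" from the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return None


def resolve_direction(paragraph, mode="auto", widget_direction="LTR"):
    """Pick a paragraph direction for a widget-wide mode of auto, LTR, or RTL."""
    if mode in ("LTR", "RTL"):
        return mode  # forced for every paragraph in the widget
    if mode != "auto":
        raise ValueError("mode must be 'auto', 'LTR' or 'RTL'")
    # Autodetect; fall back to the widget direction when no strong char exists.
    return first_strong_direction(paragraph) or widget_direction


paragraphs = ["Hello", "\u05e9\u05dc\u05d5\u05dd", "1234"]

print([resolve_direction(p, widget_direction="RTL") for p in paragraphs])
# ['LTR', 'RTL', 'RTL']
print([resolve_direction(p, mode="LTR") for p in paragraphs])
# ['LTR', 'LTR', 'LTR']
```

In "auto" mode only the digits-only paragraph picks up the widget direction, while a forced mode overrides every paragraph at once, which is the whole-widget limitation noted above.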


---- Uri David Akavia wrote: (comment #12)

> Why can't you implement a keyboard shortcut that will insert
> LRM or RLM at the beginning of the paragraph? It will be my 
> problem - if I want to write English aligned RTL, I'll use the 
> shortcut to insert RLM. However, if I do want that, I *know*
> what I'm doing and I have good reason to override the default
> behavior.

I'm against providing a shortcut for inserting LRM/RLM at the beginning of the
paragraph, because after that, we need to maintain these marks too.  They should
not just stack up piles of marks at the beginning of the paragraph.  See above.

> I really fail to see how not implementing this function helps me.

We have not reached any solution which is technically possible right now.


---- Shlomi Loubaton in comment #13 repeated what he has been saying in other
comments.  See above for answers.

---- Some other comments skipped because of frustration

---- Shai Berger wrote: (comment #16)

> I think that there's a consensus among the Hebrew writers on this
> bug that user control over the direction is required. There is no
> consensus, however, on the required semantics.

But it doesn't mean anything.  Again, I suggest bringing the issue to the
ivrix-discuss mailing list.  At the least, Dov Grobgeld should comment on this
issue before any bit of code is changed.

> I think the basic distinction is between texts to be saved and
> texts not to be saved (e.g. Shlomi's database search example),
> but it's a little more complicated.

It's not as black-and-white a distinction as you draw.  Text can and will be copied
around, from display-only buffers to savable buffers and vice versa.


Ok, done.  I'm not responding to any more reiteration of the same words and
concepts.
Comment 18 Shai Berger 2005-02-22 00:57:06 UTC
Just for the record, Dov Grobgeld has in fact commented on this issue in
ivrix-discuss, see
http://article.gmane.org/gmane.linux.region.israel.ivrix.discuss/1008
Comment 19 Behdad Esfahbod 2005-02-22 01:14:20 UTC
Oh, OK, then let's have some discussion on the list; we can summarize here later.
Comment 20 Owen Taylor 2005-03-01 22:49:53 UTC
This has nothing to do with Pango.
 
 A) Decide on the user behavior you want
 B) Figure out how to implement it in GTK+
 C) If B) requires a Pango change, then file a new bug for that

My opinion about the Pango interaction is that if someone edits a bunch of 
text in GtkTextView, the result should be a unicode string that when loaded 
into a PangoLayout should appear identically. That is, there should be no
special non-textual attributes in the GtkTextView used to maintain paragraph 
direction.
Comment 21 Behdad Esfahbod 2005-03-01 23:41:51 UTC
Thanks Owen.  That's exactly my point too.
Comment 22 Shahar Or 2010-11-17 15:29:45 UTC
This was mentioned now in
https://bugs.launchpad.net/ubuntu/+source/empathy/+bug/571822
(Empathy allign RTL text as LTR text on messenger)
Where we wanted to decide which is the best possible way to handle RTL (RL=1 flag or the BiDi marks).

Blessings,
Shahar