After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 730343 - Verify cleanup_tab_fragments usage in vteseq
Verify cleanup_tab_fragments usage in vteseq
Status: RESOLVED FIXED
Product: vte
Classification: Core
Component: general
0.37.x
Other Linux
: Normal normal
: ---
Assigned To: VTE Maintainers
VTE Maintainers
Depends on:
Blocks:
 
 
Reported: 2014-05-18 19:55 UTC by Egmont Koblinger
Modified: 2014-05-22 23:52 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Fix v1 (12.86 KB, patch)
2014-05-21 20:46 UTC, Egmont Koblinger
committed Details | Review
Fix drawing the cursor over a tab (683 bytes, patch)
2014-05-22 22:50 UTC, Egmont Koblinger
committed Details | Review
Fix when start == end (1.67 KB, patch)
2014-05-22 23:33 UTC, Egmont Koblinger
committed Details | Review

Description Egmont Koblinger 2014-05-18 19:55:53 UTC
In vteseq.c randomly a couple of cursor operations call _vte_terminal_cleanup_tab_fragments_at_cursor(), but others (including e.g. restore_cursor, scroll_up [the cursor doesn't move, but the underlying content does] etc.) don't.

Verify if it's needed. Probably it's either needed everywhere, or can be dropped.
Comment 1 Egmont Koblinger 2014-05-21 13:41:45 UTC
$ echo -e '\t\txy\e[1G' (\e[1G moves cursor to the beginning of the line)
-> result is 8 spaces + 1 tab (\e[1G breaks the first tab into spaces)

$ echo -e '\t\txy\r'
-> result is two tabs (the first one is not broken into spaces after a CR)

$ echo -e '\t\txy\r\e[P' (CR, and then delete a character)
-> visually 15 spaces, but highlighing with mouse behaves weird, actually copies 1 tab only, and after a window resize it's also just 1 tab)

Probably there are even more serious corruption possibilities, I don't know yet.

xterm doesn't seem to provide support of tabs this way, the screen never contains tabs, only spaces. I guess it's fully considered a control character. It's a cool convenience feature that g-t remembers where tabs were printed (for copy-pasting) but causes quite a bit of headache.

I don't think calling _vte_terminal_cleanup_tab_fragments_at_cursor() after a cursor movement (so that the cursor never stands over any part of a tab character) is a good idea. Just because someone uses \e[1G or similar to move around, we shouldn't lose tabs. Also, it has the question whether relative cursor movement (like "\e[10A") should split tabs into spaces all along its path (but why?) or only at the final position (making it different from "\e[A" repeated 10 times)? There are just way too many cursor moving operations and we'd sure forget about some of them.

I think tabs should be split to spaces whenever the content is modified there, e.g. a letter gets printed, a character gets deleted (\e[P) or inserted (\e[@) – anything else?

Also we should make sure that printing/deleting/inserting a character in the middle of a CJK doesn't cause corruption. (Related: bug 691972)
Comment 2 Egmont Koblinger 2014-05-21 13:55:12 UTC
Another beautiful case of a minor corruption is when the 2nd cell of a CJK runs into a TAB:
echo -e '\t\txy\r1234567\u4200'

Visually there are 7 spaces between the CJK and "xy", but actually there's nothing if copy-pasted, and disappears on rewrap-on-resize.

I'm sure printing a TAB can also run into a CJK or into another TAB with different tab stops :P
Comment 3 Behdad Esfahbod 2014-05-21 18:41:17 UTC
Agreed...
Comment 4 Behdad Esfahbod 2014-05-21 18:42:36 UTC
Based on my experience with HarfBuzz and other libraries: we can immensely clean up ring / buffer abstraction and implementation if we decide to use C++ internally.  Perhaps vala can provide the same, donno.  But if we could use C++ we could hide a lot of the complexities of fetching the VteRow, fetching the cell, cleaning up around it, etc...
Comment 5 Christian Persch 2014-05-21 18:53:41 UTC
vala is fine for simple tools where efficiency doesn't matter, but not here, IMHO. (Don't — really, DON'T! — look at the vala-generated C code!)

Even just using __attribute__((__cleanup__(func))) makes C immensely more enjoyable IME, but yes, C++ is even nicer.
Comment 6 Egmont Koblinger 2014-05-21 20:44:47 UTC
(In reply to comment #5)
> vala is fine for simple tools where efficiency doesn't matter, but not here,
> IMHO. (Don't — really, DON'T! — look at the vala-generated C code!)

I did the other day, figuring out how transparency could be migrated from app.vala to g-t :)

/me really don't like C++. (But I don't mind too much if you migrate.)

This particular bug is about an issue where C++ wouldn't help at all. Most of the terminal emulation is such an area - interpreting control sequences in vteseq, rewrapping on resize, figuring out how to handle a CJK cut in half etc... wouldn't be easier in C++ than in plain C.

The ring code, the to-be-removed matcher->trie/table indirection etc., sure, yes.
Comment 7 Egmont Koblinger 2014-05-21 20:46:22 UTC
Created attachment 276952 [details] [review]
Fix v1

Here's what I had in mind to fix the Tab/CJK issues.

I'm yet to test with tons of escape sequences, but trying a few of them looked promising.
Comment 8 Egmont Koblinger 2014-05-21 20:51:59 UTC
Note to my future self, lest I forget:

A Tab is *not* N ∈ {1..8} spaces! A Tab is N ∈ {1..8} "cursor right" operations.

If there's any content underneath, it's reserved, and the special "smart tabs" (where it's preserved for copy-pasting) does not kick in.

(And, as a lucky side effect, cleanup of existing Tabs/CJKs is not required when a Tab is printed.)
Comment 9 Behdad Esfahbod 2014-05-21 21:13:51 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > vala is fine for simple tools where efficiency doesn't matter, but not here,
> > IMHO. (Don't — really, DON'T! — look at the vala-generated C code!)
> 
> I did the other day, figuring out how transparency could be migrated from
> app.vala to g-t :)
> 
> /me really don't like C++. (But I don't mind too much if you migrate.)
> 
> This particular bug is about an issue where C++ wouldn't help at all.

It actually does.  If the screen buffer (rows, colls, etc) are properly encapsulated C++ classes, we can cleanup_fragments every time a cell is modified, without having to spread the calls all over the code base...

(In reply to comment #7)
> Created an attachment (id=276952) [details] [review]
> Fix v1
> 
> Here's what I had in mind to fix the Tab/CJK issues.
> 
> I'm yet to test with tons of escape sequences, but trying a few of them looked
> promising.

How about when cursor is moved using escape sequences but then normal text is received?  I think that's why originally I implemented by cleaning up whenever cursor is moved.
Comment 10 Egmont Koblinger 2014-05-21 21:29:47 UTC
(In reply to comment #9)
> It actually does.  If the screen buffer (rows, colls, etc) are properly
> encapsulated C++ classes, we can cleanup_fragments every time a cell is
> modified, without having to spread the calls all over the code base...

I could even do now in vterowdata.c in all the _vte_row_data_blah() methods. I was even thinking about this for a while.

The problem is updating the display. When a CJK is cut in half, you need to update 1 extra charcell. I really wouldn't want vterowdata to know anything about invalidating cells. It can propagate back this data as a retval but then the caller would have to take care of this. Or one should always update 1 extra cell to the left and 1 to the right, just in case.

I'm afraid that introducing "proper" OO approach to solve these kinds of problems (whatever that approach would be) might easily significantly slow down VTE. But I don't have real experiences in this area.

> How about when cursor is moved using escape sequences but then normal text is
> received?  I think that's why originally I implemented by cleaning up whenever
> cursor is moved.

See comment 1. With my patch, the Tab/CJK cleanup occurs right before processing the normal text. There are a plenty of places in the source where the cursor moves (most of them already forgot to do the cleanup), and only a few where the text changes (in which case you need to do the cleanup anyways). Also, I see no reason why a cleanup should occur just because the cursor is moved somewhere (and who knows, maybe another cursor movement will follow; why break a tab if the cursor innocently walks over it?)
Comment 11 Behdad Esfahbod 2014-05-21 21:31:39 UTC
Got it.  Thanks.
Comment 12 Egmont Koblinger 2014-05-22 22:50:37 UTC
Created attachment 277016 [details] [review]
Fix drawing the cursor over a tab

Now that a tab under the cursor is no longer split into spaces, we need to watch out when painting that cursor. Trivial patch, goes on top of the previous.
Comment 13 Egmont Koblinger 2014-05-22 23:33:21 UTC
Created attachment 277017 [details] [review]
Fix when start == end

Splitting a character in two for insertion (that is, both halves of the same character need to be properly cleaned up) is a bit trickier than I first thought.
Comment 14 Egmont Koblinger 2014-05-22 23:52:39 UTC
All tested and looks good.
Pushed as commit 4d6edfb7ecd9bc2743fe507794092ff096d50aac.