After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 787701 - Add REP support
Add REP support
Status: RESOLVED FIXED
Product: vte
Classification: Core
Component: general
git master
Other Linux
: Normal major
: vte-0-52
Assigned To: VTE Maintainers
VTE Maintainers
: 788121 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2017-09-14 22:26 UTC by Egmont Koblinger
Modified: 2017-10-05 12:03 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Fix v1 (3.31 KB, patch)
2017-09-17 20:42 UTC, Egmont Koblinger
none Details | Review
Fix v2 (4.01 KB, patch)
2017-09-18 12:44 UTC, Egmont Koblinger
none Details | Review
New xterm description (3.50 KB, application/octet-stream)
2017-09-18 12:58 UTC, Egmont Koblinger
  Details

Description Egmont Koblinger 2017-09-14 22:26:34 UTC
Looks like a recent ncurses change pisses off quite a few terminal emulators. See https://bugs.kde.org/show_bug.cgi?id=384620. I haven't tried but I'm pretty sure we're also affected.

TERM=xterm-256color (our default) now includes xterm-new which supports this property, and ncurses uses it for line drawing.

We'll need to implement "CSI Ps b  Repeat the preceding graphic character Ps times (REP)."
Comment 1 Egmont Koblinger 2017-09-14 22:29:46 UTC
Seems to me that xterm(324) can only repeat characters of Latin-1 as well as box drawing ones. Even in UTF-8 mode, this escape sequence is a nop if emitted after characters outside of Latin-1.

If a combining accent is present, it's ignored and only the base character is repeated.

Why, oh why???
Comment 2 Egmont Koblinger 2017-09-14 22:40:00 UTC
Haha, and it cannot repeat the raw UTF-8 version of box drawing chars.

Now I'm truly wondering if ncurses apps with NCURSES_NO_UTF8_ACS=1 break in xterm, that would be truly fun :D I'll have to upgrade my terminfo to test this.
Comment 3 Egmont Koblinger 2017-09-15 11:28:49 UTC
Oops, I thought NCURSES_NO_UTF8_ACS=1 did something different, I though it emitted raw UTF-8 line drawing characters (which, of course, would totally contradict its name). It's not the case, it falls back to simple ASCII +, -, | and similar chars. I'm not sure if ncurses is able to emit raw UTF-8 U+2500 and friends for box drawing. So forget my previous comment.
Comment 4 Egmont Koblinger 2017-09-15 11:38:10 UTC
Oops oops... _ncursesw_ does emit U+2500 and friends, _ncurses_ doesn't.

Forget my previous comment that said "forget my previous comment".
Comment 5 Egmont Koblinger 2017-09-15 11:53:57 UTC
So, boxtest (as seen in the konsole bugreport; except that linked -lncursesw) combined with NCURSES_NO_UTF8_ACS=1 and the brand new xterm-256color description that uses REP:

The corners and vertical lines (the ones that are not repeated in this demo) are emitted as raw UTF-8.

The horizontal lines (the ones that are repeated in this demo) are emitted as the character 'q' after switching to line-drawing mode, and then REPeated.

ncurses/terminfo is insane. It has just further complicated something that's already way too complicated, instead of attempting to simplify the situation.

PuTTY (the one that typically needs NCURSES_NO_UTF8_ACS=1 because it doesn't support line drawing mode in UTF-8, see also bug 787228), if used with TERM=xterm-256color, breaks now because it doesn't support REP either, but even if it implemented REP, it would break in turn on these repeated characters that'd still show up as 'q' (whereas the non-repeated ones would look okay).

Anyway, all of this should be irrelevant to us now. The only thing that matters is we need to implement REP (preferably without the nonsense Latin1 limitation).
Comment 6 Christian Persch 2017-09-15 11:55:20 UTC
(In reply to Egmont Koblinger from comment #1)
> Seems to me that xterm(324) can only repeat characters of Latin-1 as well as
> box drawing ones. Even in UTF-8 mode, this escape sequence is a nop if
> emitted after characters outside of Latin-1.
> 
> If a combining accent is present, it's ignored and only the base character
> is repeated.

Hmm... xterm's doing it wrong, then.

ECMA 48 says:

8.3.103
REP - REPEAT
Notation: (Pn)
Representation: CSI Pn 06/02
Parameter default value: Pn = 1
REP is used to indicate that the preceding character in the data stream, if it is a graphic character
(represented by one or more bit combinations) including SPACE, is to be repeated n times, where n
equals the value of Pn. If the character preceding REP is a control function or part of a control function,
the effect of REP is not defined by this Standard.

and 'graphic character' is defined as

4.2.43 Graphic character
A character, other than a control function, that has a visual representation normally hand-written, printed
or displayed, and that has a coded representation consisting of one or more bit combinations.
Comment 7 Egmont Koblinger 2017-09-15 12:06:50 UTC
Mind you, I'm not testing with the latest ncurses and xterm, but whatever's in Zesty.

_Maybe_ it was fixed recently in xterm to be able to repeat non-Latin1, as well as in ncursesw to keep emitting raw UTF-8 if NCURSES_NO_UTF8_ACS=1.

I doubt it though because the changelog entry

+ add "rep" to xterm-new, available since 1997/01/26 -TD

explicitly refers to REP being available for a looong time, hence safe to rely on (in xterm at least). With such a recent assumed fix in xterm, this ncurses and terminfo change would require a brand new xterm as well, making it referring to a 20-year old feature absolutely pointless.
Comment 8 Egmont Koblinger 2017-09-17 19:41:36 UTC
(In reply to Christian Persch from comment #6)

> [...] the preceding character in the data stream, if
> it is a graphic character
> (represented by one or more bit combinations) including SPACE, is to be
> repeated n times

> [...] If the character preceding REP is a control function
> or part of a control function,
> the effect of REP is not defined by this Standard.

In xterm:

- newline and tab cannot be repeated;
- after a repetition, you cannot repeat again (the second one is a nop).

Of course, these two fall under the second category (not graphic character => behavior is undefined) so xterm is fine here.

PS. In spirit of bug 652124 and perhaps others, we'd better have some safety cap. xterm apparently truncates at 65535 (not just this one, but generally all numeric arguments to CSI or DEC). Or do we maybe already do so?
Comment 9 Egmont Koblinger 2017-09-17 20:42:25 UTC
Created attachment 359948 [details] [review]
Fix v1

First (draft) version.

Properly repeats box drawing characters (in the special box drawing mode, as well as in raw UTF-8), so for all practial uses, this patch is already good enough.

Unlike xterm, it doesn't unset the last character on special nonprintable ones. I don't like it that much, so I might improve on it later, but according to the standard it doesn't matter, it's unspecified.

Does silly things after combining accents.
Comment 10 Egmont Koblinger 2017-09-18 12:44:44 UTC
Created attachment 359969 [details] [review]
Fix v2

Special characters now stop the previous graphic one from repeating (REP is a nop if follows a tab, newline, escape sequences etc.) to match xterm's behavior.

Combining accents are now ignored, the base character is repeated to match xterm. Repeating the entire glyph would require significantly heavier refactorings than the current patch, especially since for locating the base character it's the onscreen contents that matter now, however, for REP it should be the input stream that matters. (On a side note, see also bullet point 3 in bug 673981 which would probably be influenced by such a change). As long as xterm does the same, I'm fine with this behavior.

Nit: The patch removes an empty line from the top of vteseq.cc, accidentally introduced by the big hyperlink patch.

Unless objected, I'll submit in a few days to master & 0-50.
Comment 11 Egmont Koblinger 2017-09-18 12:58:56 UTC
Created attachment 359976 [details]
New xterm description

For convenient testing, here's the newest version (20170916) of the xterm-256color terminfo file.

Try this with an unpatched VTE, and run e.g. alsamixer. The top and bottom rows are immediately faulty. Further, bigger corruptions occur upon horizontal scrolling, or invoking the help (h or F1).
Comment 12 Egmont Koblinger 2017-09-18 21:02:40 UTC
We also have a global cap of 65535 on numeric arguments. It's in table.cc _vte_table_extract_numbers() and goes like CLAMP (..., G_MAXUSHORT).

That being said, there's no current method that repeats something an unlimited number of times, so the patch above is still the simplest, although a bit ugly to clamp twice. I don't feel like introducing a non-limited vte_sequence_handler_multiple() method. :-)
Comment 13 Egmont Koblinger 2017-09-25 11:45:01 UTC
*** Bug 788121 has been marked as a duplicate of this bug. ***
Comment 14 Egmont Koblinger 2017-09-28 14:10:45 UTC
Submitted (with slight cosmetic modifications). Backported to 0-50 and 0-48 too.

Christian, along with the forthcoming 0.50.1, could you please tag a final 0.48.4 as well? I'd like this change to reach distros that are still shipping the 0.48 series (plus there's another unreleased change in that branch).
Comment 15 Egmont Koblinger 2017-10-05 12:03:46 UTC
Just for reference: REP support is now available in VTE versions 0.50.1, 0.48.4 and 0.46.3.