GNOME Bugzilla – Bug 791456
Ignore additional truecolor parameters
Last modified: 2017-12-14 22:17:29 UTC
For RGB escape sequences, ITU Recommendation T.416 talks about further parameters: [...] the parameter elements 3, 4, and 5, are three integers for red, green, and blue [...] [...] Parameter 6 has no meaning. [...] parameter element 7 may be used to specify a tolerance value [...] [...] parameter element 8 may be used to specify a colour space associated with the tolerance [...] When semicolon is used as separator, even xterm stops parsing after the 5th parameter, e.g. "38;5;1;2;3;48;5;4;5;6" becomes foreground of RGB (1,2,3) and background becomes RGB (4,5,6). For compatibility with who-knows-what (including our perf/img.sh script), I wouldn't touch this. When colon is used as separator, though, currently VTE rejects the entire sequence if more than 5 parameters are present. It should instead silently ignore them, as per this specs, and as per xterm's behavior too. And even though this doc doesn't say anything about 256-colors, I guess we should ignore additional parameters there too (and so does xterm). --- By the way, the aforementioned doc only mentions colons as separator here, it doesn't allow semicolon, so stopping parsing after the 5th argument in case of semicolon would be absolutely fine I guess. (Also, interestingly, the doc doesn't state or implicitly refer to parameter 6 being optional. And not a single word about value range for R, G, B and these tolerance stuff.)
Created attachment 365333 [details] [review] Fix (I also sneaked in a fix to something that looks like a potential overrun.)
Typo: RGB escapes go like "38;_2_;1;2;3;48;_2_;4;5;6".
What a coincidence; I was also recently looking at T.416 for the colour sequences :-) I may be mis-reading the spec, but it would seem to me that vte doesn't implement these 38:2:r:g:b correctly (same for 48, of course). The spec says: """ A parameter substring for values 38 or 48 may be divided by one or more separators (03/10) into parameter elements, denoted as Pe. The format of such a parameter sub-string is indicated as: Pe : P ... [...] The first parameter element indicates a choice between: [...] 2 direct colour in RGB space; [...] If the first parameter element has the value 2, 3, or 4, the second parameter element specifies a colour space identifier referring to a colour space definition in the document profile. If the first parameter element has the value 2, the parameter elements 3, 4, and 5, are three integers for red, green, and blue colour components. Parameter 6 has no meaning. """ So the "38" or "48" isn't the "first" parameter; the "2" for "RGB" is the first parameter. And then there's a 2nd parameter; the RGB values are 3, 4, 5. So in keeping with the spec, these sequences really should be 38:2::r:g:b with an empty (or, if nonempty, ignored) param at position 2. (In reply to Egmont Koblinger from comment #0) > By the way, the aforementioned doc only mentions colons as separator here, > it doesn't allow semicolon, so stopping parsing after the 5th argument in > case of semicolon would be absolutely fine I guess. (Yes. We allow the use of the 38;2;r;g;b and 38;2:r:g:b form for these for legacy reasons only.) > (Also, interestingly, the doc doesn't state or implicitly refer to parameter > 6 being optional. And not a single word about value range for R, G, B and > these tolerance stuff.) Well, there's this note on page 41 concerning that 2nd parameter I cited above: """ NOTE 3 – The “colour space id” component will refer to the applicable colour space description in the document profile which may contain colour scaling data that describe the scale and offset to be applied to the specified colour components in the character content. Appropriate use of scaling and offsets may be required to map all colour values required into the integer encoding space provided. """ and T.414 7.3.10.4 seems to contain the relevant information on that 'colour space'. --- Anyway, I do agree that we should just ignore extra parameters after the RGB values, instead of ignoring the sequence. Or we could use the length (exactly 3 subparams after the 38:2) to distinguish the 38:2:r:g:b form from the apparently correct, full 38:2:id:r:g:b[:...] form.
> Created attachment 365333 [details] [review] [review] > (I also sneaked in a fix to something that looks like a potential overrun.) Hmm these 2 chunks don't seem right to me? The 'unchecked' variants are to be used when you've previously ensured that there are enough parameters ('unchecked' is referring to the index). Quite possibly the naming could be improved?
(In reply to Christian Persch from comment #3) > So the "38" or "48" isn't the "first" parameter; the "2" for "RGB" is the > first parameter. And then there's a 2nd parameter; the RGB values are 3, 4, > 5. > > So in keeping with the spec, these sequences really should be 38:2::r:g:b > with an empty (or, if nonempty, ignored) param at position 2. Oh, crap!!! Now that every single terminal emulator that supports true colors implements it differently (shifted by one), even including our reference xterm whose author apparently truly dislikes this feature but at least parses and rounds to the nearest 256-color... not sure if we should adhere to the specs or to existing implementations. > Or we could use the length (exactly 3 subparams after the 38:2) to > distinguish the 38:2:r:g:b form from the apparently correct, full > 38:2:id:r:g:b[:...] form. Yup we could do that (which would again be strictly speaking different from xterm's, but wouldn't cause problems yet).
(In reply to Christian Persch from comment #4) > Hmm these 2 chunks don't seem right to me? The 'unchecked' variants are to > be used when you've previously ensured that there are enough parameters > ('unchecked' is referring to the index). Quite possibly the naming could be > improved? Thx; I even looked up their definitions but indeed swapped the two in my mind. Apparently this method is fine then, but I'd still rewrite so that the '2' and '5' cases look consistent. IMO nicely looking source code is more important than such micro-optimizations :-)
(In reply to Christian Persch from comment #3) > [...] the second > parameter element specifies a colour space identifier This was apparently mentioned at https://gist.github.com/XVilka/8346728#gistcomment-1648133 but no one seems to have paid attention.
(In reply to Egmont Koblinger from comment #6) > Apparently this method is fine then, but I'd still rewrite so that the '2' > and '5' cases look consistent. IMO nicely looking source code is more > important than such micro-optimizations :-) Well, SGR is likely the most-used sequence, so anything that avoids branches in there shave a few ns off it :-)
The same guy also comments that: "ITU T.416-199303" (the only 'standard' that uses this for it's colour sequences) was never publicly published ISO/IEC 8613-6 is apparently the very same document (at least the publicly available first few chapters seem to be identical at first glimpse), although with this name it's only available for CHF 178, and I truly don't feel like paying that much for it :D Who knows, maybe it differs from T.416 when it comes to true colors?? And honestly, I have no clue who is who in this story of writing specs, plus it's also unclear to me whether it's really a "spec" or just a "recommendation", or a draft, or what.
(In reply to Christian Persch from comment #8) > Well, SGR is likely the most-used sequence, so anything that avoids branches > in there shave a few ns off it :-) I'd argue that this one here is still a pointless micro-optimization. But I'll follow the "if it ain't broke, don't fix it" principle. :)
(In reply to Egmont Koblinger from comment #9) > ISO/IEC 8613-6 is apparently the very same document (at least the publicly > available first few chapters seem to be identical at first glimpse), > although with this name it's only available for CHF 178, and I truly don't > feel like paying that much for it :D Who knows, maybe it differs from T.416 > when it comes to true colors?? ISO 8613-6 is the same as T.416; it says so in the Introduction (page Ⅴ) of the T.416:1993 PDF.
Created attachment 365335 [details] [review] Fix, v2 I think this should work. Will need to add comments.
With this change, we're about to have 5 different ways of achieving true color: 1. semicolon: ^[[38;2;rrr;ggg;bbbm 2. mixed: ^[[38;2:rrr:ggg:bbbm 3. mixed with params: ^[[38;2:[color_space_id]:rrr:ggg:bbb[:more_params]m 4. colon: ^[[38:2:rrr:ggg:bbbm 5. colon with params: ^[[38:2:[color_space_id]:rrr:ggg:bbb[:more_params]m This is just insane. I haven't seen anyone using the mixed notation (2 and 3) ever, I wouldn't for a second mind dropping it. Colon is also extremely rare since plenty of terminal emulators don't support them, which makes applications decide to go the easy way and use semicolons. See e.g. https://github.com/neovim/neovim/issues/7473 https://github.com/kovidgoyal/kitty/issues/160 https://github.com/mobile-shell/mosh/pull/939 A recent VTE memory leak discovered also suggests that not many people use it, and it implicitly turns out from the thread that nvim reverted to semicolons somewhere between 0.2.1 and 0.2.2: https://github.com/neovim/neovim/issues/7573 https://bugzilla.gnome.org/show_bug.cgi?id=790539 So I would also risk dropping (4). This might break a few things here and there (probably people's custom color setups). Worst case we revert and bring it back. This would leave us with (1) which is used in practice, and (5) which is according to the specs. It's not really about a win in code size. It's a little bit about making it easier to test all the possibilities, but mostly about pushing the world towards a much cleaner state. Before moving forward, however, I believe we should investigate a little bit where the 38;2;rrr;ggg;bbb (without color_space_id) format first appeared (maybe konsole is a good starting point), to see whether they indeed misread the specs, or maybe they worked from some other, contradicting documentation.
(In reply to Egmont Koblinger from comment #13) > Before moving forward, however, I believe we should investigate a little bit > where the 38;2;rrr;ggg;bbb (without color_space_id) format first appeared > (maybe konsole is a good starting point), to see whether they indeed misread > the specs, or maybe they worked from some other, contradicting documentation. https://bugs.kde.org/show_bug.cgi?id=107487 added this to konsole, already in the 38;2;r;g;b format (konsole doesn't seem to support the sequences using colon at all?); and they mentioned that these sequences came from xterm already. It seems this was added in xterm 282 (accepted these sequences, but only matches to closest palette entry, not providing true truecolour).
> (In reply to Egmont Koblinger from comment #13) > (konsole doesn't seem to support the sequences using colon at all?) Yup, apparently they don't support it.
Filed iTerm2 bug: https://gitlab.com/gnachman/iterm2/issues/6377.
Kitty bug: https://github.com/kovidgoyal/kitty/issues/227. Its author decided to keep the misinterpreted colon-based sequence (4) [as per the bullet points of comment 13] as well. Not sure if it supports the mixed (2, 3) ones.
Chrsitian, let's come to a decision, and then I'll quickly implement it. (1) is needed for compatibility with existing truecolor libs/apps. (5) is needed for standard compliance. For each of (2), (3), (4), there's three choices to choose from: - keep it - drop it (we can still bring back later if really needed) - enable only for stable VTE releases I cast a weak vote for the latter. This would hopefully give us the most feedback on things we break, in a way that we don't piss off users. We can drop them fully in a year or two if there are no complaints. If you have a stronger opinion, I'm happy to follow that.
I think only 1, 4 and 5 are required. I've checked debian codesearch, and the incorrect 38;2;r;g;b form is most widespread, with only a few hits for the 38:2:r:g:b form, and nothing at all for mixed 38;2:r:g:b form. IMHO we can just drop support for the mixed forms, in both stable and unstable.
(In reply to Christian Persch from comment #19) > IMHO we can just drop support for the mixed forms, in both stable and > unstable. Okay, I'm happy to. What about (4)? Shall we live with it, or in some way push towards deprecation (e.g. support only in stable releases, or emit a warning to stderr (once per vte))?
At least on debian codesearch, the only users of the 38:2:r:g:b form are vim and nvim, so deprecating this form should be ok. So that means we could continue to accept it for some time (stable and unstable). Not sure the msg to stderr would be useful, since g-t stderr goes to journal to be ignored...
nvim reverted to semicolon in version 0.2.2. No clue about vim. Okay, let's say that (4) is still fully supported, and perhaps get back to deprecating it in a couple of years. I'll do these changes, and add a cmdline flag to img/perf.sh to choose from the three supported modes.
(In reply to Christian Persch from comment #14) > https://bugs.kde.org/show_bug.cgi?id=107487 added this to konsole, already > in the 38;2;r;g;b format (konsole doesn't seem to support the sequences > using colon at all?); and they mentioned that these sequences came from > xterm already. It seems this was added in xterm 282 (accepted these > sequences, but only matches to closest palette entry, not providing true > truecolour). xterm's ctlseqs.txt says: "ISO-8613-3 can be interpreted in more than one way; xterm allows the semicolons in this control to be replaced by colons (but after the first colon, colons must be used)." "These ISO-8613-3 controls are: Pm = 3 8 ; 2 ; Pr; Pg; Pb -> Set foreground color to the closest match in xterm's palette for the given RGB Pr/Pg/Pb." ISO-8613-3 is the same as ITU-T Recommendation T.413 (available for free under this name) and is an utterly irrelevant document. So I assume he meant ISO-8613-6. Given that both sentences explicitly mention ISO-8613-whatever, I am now also pretty confident that xterm misinterpreted the standard, rather than worked from another contradicting source of information.
Created attachment 365547 [details] [review] Fix, v3
Created attachment 365559 [details] [review] Fix, v4 Fix, v4 Added cmdline options to 256test.sh too. Note that img.sh and 256test.sh now default to the "brand new" (official) format, that is, produc-es broken result in many terminal emulators (including img.sh in VTE prior to this patch). This is deliberate, I'd like to push the world towards accepting the official sequences.
Comment on attachment 365559 [details] [review] Fix, v4 Ok :-)
Submitted, closing.