GNOME Bugzilla – Bug 400072
Handling of ; in control sequences
Last modified: 2007-02-23 18:51:31 UTC
Chris committed <http://svn.gnome.org/viewcvs/vte/trunk/src/table.c?rev=1512&r1=1453&r2=1512>:

2007-01-24 Chris Wilson <chris@chris-wilson.co.uk>

<mariano> hm, vte is not matching «ESC [ ; 7 m» :/

* src/table.c: (_vte_table_addi), (_vte_table_matchi), (_vte_table_match): s/GList/GSList/ g_slist_append -> g_slist_reverse(g_slist_prepend) And finally add the subtable to handle the leading ';' in the variable length parameters.

That patch makes vte ignore a leading ; for a %m parameter (in vte parlance, a sequence of numbers). Actually, the parameter bytes of a control sequence are to be split into sub-sequences at the ‘;’ characters, and each sub-sequence that turns out to be empty after doing this (including the initial sub-sequence and the final one) represents a default value, which depends on the control sequence. This is ECMA-48 5.4.2. For example, «ESC [ ; 7 m» should be treated as «ESC [ 0 ; 7 m» because 0 is the default parameter for the «ESC [ Ps m» control sequence. Likewise «ESC [ m» stands for «ESC [ 0 m». These control sequence variations do arise in practice: they are (part of) what is wrong with bug 398401.

As for the GList->GSList optimization, I think that in practice control sequences with many parameters are extremely rare (one could get actual usage data here...), so a simpler optimization would be to allocate a GPtrArray with, say, g_ptr_array_sized_new (10) and put the arginfo thingies there.
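To make the ECMA-48 5.4.2 rule above concrete, here is a minimal Python sketch of the splitting step. It is not vte code: the name split_params and the single default argument are assumptions made for illustration, and the sketch assumes the parameter bytes contain only digits and ';'.

```python
def split_params(param_bytes, default=0):
    """Split CSI parameter bytes on ';' (hypothetical helper, not vte's API).

    Every empty sub-string, including a leading or trailing one,
    is replaced by the default value.
    """
    return [int(p) if p else default for p in param_bytes.split(";")]


if __name__ == "__main__":
    print(split_params(";7"))  # parameters of «ESC [ ; 7 m»
    print(split_params(""))    # parameters of «ESC [ m»
```

With the default of 0 used by «ESC [ Ps m», ";7" yields the same list as "0;7", and the empty parameter string yields the same list as "0".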
By the way, to see the effect of the patch, run: echo -e 'AAA\e[4mBBB\e[;1mCCC\e[mDDD' both in xterm and in src/vte. In xterm, it prints three A's, starts underlining and prints three B's, then resets to normal mode, turns on bold and prints three C's, and finally resets to normal mode again and prints three D's. In vte, the implicit 0 in the '\e[;1m' part is not taken into account.
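The expected behaviour of that test string can be sketched as a tiny interpreter. This is a hedged illustration only: the function name sgr_runs is hypothetical, it handles just the SGR codes 0 (reset), 1 (bold) and 4 (underline), and it treats an empty parameter as 0.

```python
import re

# Matches «ESC [ params m» where params holds only digits and ';'.
SGR = re.compile(r"\x1b\[([0-9;]*)m")


def sgr_runs(text):
    """Return a list of (text_run, active_attributes) pairs (illustrative only)."""
    attrs = set()
    runs = []
    pos = 0

    def flush(segment):
        if segment:
            runs.append((segment, tuple(sorted(attrs))))

    for m in SGR.finditer(text):
        flush(text[pos:m.start()])
        for p in m.group(1).split(";"):
            code = int(p) if p else 0  # empty parameter means the default, 0
            if code == 0:
                attrs.clear()
            elif code == 1:
                attrs.add("bold")
            elif code == 4:
                attrs.add("underline")
        pos = m.end()
    flush(text[pos:])
    return runs


if __name__ == "__main__":
    for run in sgr_runs("AAA\x1b[4mBBB\x1b[;1mCCC\x1b[mDDD"):
        print(run)
```

If the leading empty parameter in '\e[;1m' were dropped instead of being read as 0, the underline attribute would still be active for the C's.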
Created attachment 81052: Add ';' to the numeric class
Sorry Mariano, I wanted to send you a patch last night, but I needed sleep more. This patch takes a simpler approach and extends '%d' to match '([0-9;]+)*', which is what _vte_table_extract_numbers() was designed to parse.
Hmm, perhaps '([0-9]+;)*' is the closer regexp. PBC.
([0-9]*;?)* I think
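For comparison, the three candidate expressions from the last few comments can be checked against a handful of parameter strings. This is only a sketch using Python's re module with whole-string matching; vte's table matcher is not a regexp engine, so it shows which parameter strings each expression describes and nothing about how the patch interacts with the rest of the table.

```python
import re

# The three expressions proposed in the comments above.
CANDIDATES = ["([0-9;]+)*", "([0-9]+;)*", "([0-9]*;?)*"]

# Parameter strings as they would appear between «ESC [» and the final byte.
SAMPLES = ["", "7", ";7", "1;1", "1;"]


def accepted(pattern, samples=SAMPLES):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s) is not None]


if __name__ == "__main__":
    for pattern in CANDIDATES:
        print(pattern, accepted(pattern))
```

Matched against the whole parameter string, '([0-9]+;)*' rejects both ";7" and "1;1", while the other two accept any mix of digits and ';'.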
Hah, this breaks '\e[1;1H' and I guess others. * bangs head against brick wall.
Created attachment 81078: Create a numeric_list matcher
So introduce a numeric_list class (for '%m') and search for a match on that subtable before trying '%d'. If that fails to find a match, continue on to an exact number match.
Reverted r1512, r1524:
2007-01-24 Chris Wilson <chris@chris-wilson.co.uk>
cf Bug 400072 – Handling of ; in control sequences
Moral of the story: wait until the morning. Revert r1512, the mistaken attempt at parsing '\e[;30m'.
* src/table.c: (_vte_table_addi), (_vte_table_matchi), (_vte_table_match):
Created attachment 81133: Rebase the numeric_list matcher to HEAD
*** Bug 305507 has been marked as a duplicate of this bug. ***
Just found a previous thread, bug 305507, with an almost identical broken patch.
*** Bug 306320 has been marked as a duplicate of this bug. ***
*** Bug 334942 has been marked as a duplicate of this bug. ***
*** Bug 332630 has been marked as a duplicate of this bug. ***
I accidentally committed this patch a while ago and nobody has complained so far. It even fixed a couple of other bugs in the process.
FWIW, the patch has the issue that it does not take into account the fact that the values of default arguments depend on the control sequence (it's always 0 or 1, I guess). Some of these issues are catered for by hand in the source (that's why we have both «CSI A» and «CSI %d A», for example).
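One way to picture per-sequence defaults is a lookup keyed on the final byte. The sketch below is an assumption for illustration, not how vte is structured: the DEFAULTS table and params_for are hypothetical names, and only three sequences are listed (in ECMA-48, cursor up «CSI Pn A» and cursor position «CSI Pn ; Pn H» default to 1, while «CSI Ps m» defaults to 0).

```python
# Hypothetical table: default parameter value by final byte (subset only).
DEFAULTS = {
    "A": 1,  # cursor up
    "H": 1,  # cursor position (both parameters default to 1)
    "m": 0,  # select graphic rendition
}


def params_for(final_byte, param_bytes):
    """Fill empty parameters with the default for this control sequence."""
    default = DEFAULTS[final_byte]
    return [int(p) if p else default for p in param_bytes.split(";")]


if __name__ == "__main__":
    print(params_for("A", ""))    # «CSI A»
    print(params_for("m", ";7"))  # «CSI ; 7 m»
    print(params_for("H", ";5"))  # «CSI ; 5 H»
```

With a table like this, a separate hand-written «CSI A» entry alongside «CSI %d A» would in principle not be needed, since the empty parameter resolves to 1 for that sequence.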