GNOME Bugzilla – Bug 754936
Regex matching ignores offscreen contents
Last modified: 2021-06-10 15:05:12 UTC
Regex matching correctly recognizes URLs (or whatever it wants to recognize) if they are split across multiple lines -- but only if both lines are currently onscreen. It should also recognize the parts that are scrolled out at the moment. Example: Press <Enter> many times (so that the terminal starts scrolling), then type <bash prompt> $ http://example.com/some/very/long/url/that/will/very/soon/wra p/to/a/new/line Scroll back one line (Shift+Ctrl+Up) so that only the beginning of the URL is seen. Ctrl+Click on this URL. Expected behavior: Opens the complete long URL. Actual behavior: Opens the truncated (hence broken) short URL. The same happens at the top of the current viewport too. You can't test it with URLs as they need a fixed prefix; but you can test with email addresses, or numbers (with the patch from bug 741728).
Created attachment 311231 [details] [review] Demo patch vte_terminal_match_contents_refresh() calls vte_terminal_get_text() which basically takes a "screenshot", stores it in match_contents and match_attributes, and then the rest of the regex matching operates on this data. A quick-n-dirty hack could be to instead call vte_terminal_get_text_range() with a couple of more lines included in both directions (and accept that extremely long URLs that span over more than these many lines are still treated incorrectly). I'm not super happy with this approach, but it could have the best value for price ratio.
Created attachment 315090 [details] [review] Demo patch, v2 Update to go on top of smooth scrolling.
*** Bug 406134 has been marked as a duplicate of this bug. ***
Review of attachment 315090 [details] [review]: ::: src/vte.cc @@ +1219,1 @@ + start_row = _vte_terminal_first_displayed_row (terminal) - context_lines; Gotta double check what happens if this becomes negative, or safeguard with a MAX(0, ...).
Geez, is this still not fixed?? Christian, is it okay for you if I submit the quick'n'dirty fix with an arbitrarily chosen (let's say 3) context lines for 0.50?
Wouldn't just 1 line be enough in the usual case?
Coming up with arbitrary constants is always hard :D I was actually thinking of increasing to maybe 5 so that longer URLs work even more often. I really don't know :)
In the mean time I'm wondering if we should adjust the ring size calculation, so that (assuming you haven't scrolled back the viewport) with this lookbehind we still don't need to read from frozen rows (access the disk etc.). Which, if we do so, should probably wait till the 0-51 cycle. I'd rather not touch such core components so last minute, even if it's just an off-by-n.
Created attachment 358467 [details] [review] Demo patch, v3 Updated to the current codebase, plus increased the ring size. The latter one still needs thorough verification and testing, e.g. I have to double check again the exact relation of ring->max and ring->mask.
Oops, forget this patch, it causes segfaults.
Created attachment 358470 [details] [review] Demo patch, v4 Stupid me, ring->max is something entirely different. Hope this version is okay.
Created attachment 358471 [details] [review] Demo patch, v5 Time for me to slow down and take a break I guess.
Comment on attachment 358471 [details] [review] Demo patch, v5 Instead of extracting a few (fixed, arbitrary) lines of context, regex matching should be ported to use our new RingView infrastructure.
Definitely. Also can provide the infrastructor to fix https://gitlab.gnome.org/GNOME/vte/issues/117 and https://bugzilla.gnome.org/show_bug.cgi?id=788673 at the same time.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/vte/-/issues/2221.