GNOME Bugzilla – Bug 720977
Incomplete UTF-8 preceding newline gets dropped
Last modified: 2014-04-06 18:26:21 UTC
Actual (incomplete UTF-8 immediately preceding a newline gets dropped):

$ echo -e '\0303foobar'
�foobar
$ echo -e '\0303'

$

Expected (incomplete UTF-8 immediately preceding a newline should be replaced by the replacement symbol):

$ echo -e '\0303foobar'
�foobar
$ echo -e '\0303'
�
$
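vteconv.c L91:

    /* Determine why the end of the string is not valid.
     * We are pur b@stards for running g_utf8_next_char() on an
     * invalid sequence. */
    skip = g_utf8_next_char(*inbuf) - *inbuf;

Indeed you are b@stards :) skip becomes 2 instead of 1, because g_utf8_next_char() trusts the lead byte 0xC3 (\0303) and also steps over the byte that follows it, even though that byte is not a continuation byte.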
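To illustrate the skip computation quoted above, here is a minimal sketch in plain C. It is not the VTE code and not the attached patch: lead_byte_len() is a hypothetical, simplified stand-in for what g_utf8_next_char() does (a length lookup on the lead byte only), and invalid_skip() is one possible way to skip only the bytes that really belong to the broken sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for g_utf8_next_char(): the length is taken from
 * the lead byte alone, without looking at the bytes that follow.
 * Simplified: lead bytes 0xF8 and above are treated as length 1 here. */
static size_t lead_byte_len(unsigned char c)
{
    if (c < 0xC0) return 1;   /* ASCII or a stray continuation byte */
    if (c < 0xE0) return 2;
    if (c < 0xF0) return 3;
    if (c < 0xF8) return 4;
    return 1;
}

/* Hypothetical helper: number of bytes to skip for an invalid sequence
 * starting at p. Skips the lead byte plus only those bytes that really
 * are continuation bytes (10xxxxxx), never more than the lead byte asks for. */
static size_t invalid_skip(const unsigned char *p, size_t len)
{
    assert(p != NULL && len > 0);

    size_t want = lead_byte_len(p[0]);
    size_t n = 1;
    while (n < want && n < len && (p[n] & 0xC0) == 0x80)
        n++;
    return n;
}
```

For the input from this report, the bytes 0xC3 0x0A, the lead-byte lookup yields 2 and would step over the newline, whereas invalid_skip() yields 1 and leaves the newline in place.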
Created attachment 264880
UTF-8 decoding cleanup

This one cleans up the previously found issue. It does not fix the actual bug, though. The actual bug resides in iso2022.c around "nextctl": vte splits the processing of data at control characters (\n, \r and a couple more) and forgets about incomplete sequences left behind.
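The splitting problem described above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code in iso2022.c and not the eventual fix: decode_segment() and decode_chunked() are hypothetical names, only \n and \r are treated as control characters, overlong and surrogate checks are omitted, and the end of the buffer is treated as final (a real terminal would keep a trailing incomplete sequence around for the next read). The point is that an incomplete sequence at the end of a segment is turned into U+FFFD rather than being forgotten.

```c
#include <assert.h>
#include <stddef.h>

#define REPLACEMENT 0xFFFDu

/* Expected sequence length from the lead byte; 0 means "not a valid lead". */
static size_t seq_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if (c >= 0xC2 && c < 0xE0) return 2;
    if (c >= 0xE0 && c < 0xF0) return 3;
    if (c >= 0xF0 && c < 0xF5) return 4;
    return 0;
}

/* Decode one segment that contains no control characters.
 * A sequence that is cut short by the end of the segment is emitted
 * as U+FFFD instead of being silently dropped. */
static size_t decode_segment(const unsigned char *p, size_t len, unsigned *out)
{
    size_t i = 0, n = 0;
    while (i < len) {
        size_t want = seq_len(p[i]);
        size_t have = 1;
        unsigned cp = p[i];
        if (want > 1)
            cp &= 0xFFu >> (want + 1);      /* keep payload bits of the lead */
        while (have < want && i + have < len && (p[i + have] & 0xC0) == 0x80) {
            cp = (cp << 6) | (p[i + have] & 0x3Fu);
            have++;
        }
        out[n++] = (want != 0 && have == want) ? cp : REPLACEMENT;
        i += have;
    }
    return n;
}

/* Split the input at control characters and decode each segment.
 * `out` must have room for at least `len` code points. */
static size_t decode_chunked(const unsigned char *in, size_t len, unsigned *out)
{
    assert(in != NULL && out != NULL);

    size_t start = 0, n = 0;
    for (size_t i = 0; i <= len; i++) {
        if (i == len || in[i] == '\n' || in[i] == '\r') {
            n += decode_segment(in + start, i - start, out + n);
            if (i < len)
                out[n++] = in[i];           /* the control character itself */
            start = i + 1;
        }
    }
    return n;
}
```

With the bytes 0xC3 0x0A from this report, the sketch produces U+FFFD followed by the newline, which matches the expected output given in the description.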
Created attachment 264881
UTF-8 decoding cleanup v2
Ah, nice! ChPe, can you commit please?
Created attachment 265388
Fix

I don't fully understand the code, but I hope this patch is a proper fix.
Fixed in 0-36, keeping open for vte-next.