Bug 602933 - Unicode U+2029 (PARAGRAPH SEPARATOR) causes meld to get out of sync
Status: RESOLVED FIXED
Product: meld
Classification: Other
Component: filediff
Version: git master
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Stephen Kennedy
Stephen Kennedy
Depends on:
Blocks:
 
 
Reported: 2009-11-25 13:04 UTC by Antti Kaihola
Modified: 2010-11-23 10:57 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
A file with a U+2029 symbol (38 bytes, text/plain)
2009-11-25 13:04 UTC, Antti Kaihola
A file almost identical to 1.txt but without the U+2029 symbol (53 bytes, text/plain)
2009-11-25 13:05 UTC, Antti Kaihola
screenshot of current incorrect behavior: on the left, line 2 has a trailing U+2029 (27.75 KB, image/png)
2009-11-27 12:49 UTC, Antti Kaihola
mock-up of U+2029 interpreted as a regular newline (27.76 KB, image/png)
2009-11-27 12:50 UTC, Antti Kaihola
mock-up of U+2029 drawn as a symbol without a newline (31.31 KB, image/png)
2009-11-27 12:51 UTC, Antti Kaihola
mock-up of U+2029 drawn as a symbol followed by a regular newline (31.98 KB, image/png)
2009-11-27 12:51 UTC, Antti Kaihola

Description Antti Kaihola 2009-11-25 13:04:39 UTC
Created attachment 148440 [details]
A file with a U+2029 symbol

If a file contains the Unicode character U+2029 (PARAGRAPH SEPARATOR), Meld marks the wrong lines as mismatches. I'll attach files 1.txt and 2.txt, which illustrate the problem when compared with Meld.
Comment 1 Antti Kaihola 2009-11-25 13:05:41 UTC
Created attachment 148441 [details]
A file almost identical to 1.txt but without the U+2029 symbol

Compare this to 1.txt with meld
Comment 2 Antti Kaihola 2009-11-25 13:07:54 UTC
The attached file 1.txt contains a U+2029 at the end of line 2 just before the line feed. Meld shows an empty line between lines 2 and 3, but colors the differences as if the empty line didn't exist.
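The symptom described here (an extra displayed line that the highlighting ignores) is what one would expect if two parts of a program disagree on what counts as a line break. The following is a minimal Python sketch of that disagreement, not Meld's actual code. The sample text is an assumption modelled on the description of 1.txt, not the attachment's real contents.

```python
# Hypothetical contents modelled on the description of 1.txt:
# U+2029 sits at the end of line 2, just before the line feed.
text = "line 1\nline 2\u2029\nline 3\nline 4\n"

# Splitting on LF only: U+2029 stays inside line 2 as an ordinary
# character. The trailing empty piece after the final LF is dropped.
lf_lines = text.split("\n")[:-1]

# str.splitlines() also treats U+2029 as a line boundary, so an extra
# empty line appears between "line 2" and "line 3".
unicode_lines = text.splitlines()

print(len(lf_lines), lf_lines)
print(len(unicode_lines), unicode_lines)
```

If the differences are computed against one of these line lists and drawn against the other, every line after the U+2029 is off by one.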
Comment 3 Kai Willadsen 2009-11-25 22:37:53 UTC
What's the expected behaviour here? We could normalise things so that Meld interprets U+2029 as a simple line break (so Meld would just show an extra line inserted between lines 2 and 3). The other option is to treat it as an unknown arbitrary character, which is basically what we're doing now (though maybe we could display it better).
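The two options can be compared with a small Python sketch built on the standard library's difflib. This is an illustration under stated assumptions, not the patch that was eventually applied: the helper names `split_lf_only`, `split_normalised` and `changed_ops` are hypothetical, and the sample texts only mimic the attached files.

```python
import difflib

def split_lf_only(text):
    """Option 2: treat U+2029 as an ordinary character; break on LF only."""
    return text.split("\n")

def split_normalised(text):
    """Option 1: treat U+2029 as a line break by rewriting it to LF first."""
    return text.replace("\u2029", "\n").split("\n")

# Hypothetical stand-ins for the two attached files.
left = "line 1\nline 2\u2029\nline 3\n"
right = "line 1\nline 2\nline 3\n"

def changed_ops(splitter):
    """Return the non-equal diff opcodes for the given line splitter."""
    matcher = difflib.SequenceMatcher(None, splitter(left), splitter(right))
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]

# With LF-only splitting, line 2 itself differs (it carries the U+2029).
print(changed_ops(split_lf_only))
# With normalisation, line 2 matches and the left side gains an empty line.
print(changed_ops(split_normalised))
```

Either choice is consistent on its own. The display and the diff only stay in sync if both use the same splitter.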
Comment 4 Antti Kaihola 2009-11-27 12:48:44 UTC
(In reply to comment #3)
> What's the expected behaviour here?

I'd prefer some kind of an indicator for a special character to be displayed. It depends on the use case whether it makes sense to additionally insert a line break.

In any case, the highlighting of differences should not get out of sync with the actual content displayed.

I'll attach a screenshot of what meld shows when comparing the two attachments I sent earlier. Notice that:
- on the right, the text "line 2" shouldn't be highlighted
- the empty line on the left should be marked as an extra line
- "line 4" on the left should be marked as an extra line

I'll also attach three mock-ups of different suggested behaviour.
Comment 5 Antti Kaihola 2009-11-27 12:49:50 UTC
Created attachment 148590 [details]
screenshot of current incorrect behavior: on the left, line 2 has a trailing U+2029
Comment 6 Antti Kaihola 2009-11-27 12:50:27 UTC
Created attachment 148591 [details]
mock-up of U+2029 interpreted as a regular newline
Comment 7 Antti Kaihola 2009-11-27 12:51:04 UTC
Created attachment 148592 [details]
mock-up of U+2029 drawn as a symbol without a newline
Comment 8 Antti Kaihola 2009-11-27 12:51:56 UTC
Created attachment 148593 [details]
mock-up of U+2029 drawn as a symbol followed by a regular newline
Comment 9 Kai Willadsen 2010-10-26 21:12:02 UTC
Could you please test the patch I've just attached to bug 627940? It should fix the problem, though it doesn't actually do any nice display of the different linebreak.
Comment 10 Kai Willadsen 2010-11-23 09:31:14 UTC
I've pushed that patch to head, so closing this bug. I've opened bug 635593 about taking into account newlines and showing newline differences.

Thanks for your bug report.
Comment 11 Antti Kaihola 2010-11-23 10:57:29 UTC
I just checked out the git repository which already contains Kai's patch, and it works beautifully.