GNOME Bugzilla – Bug 632540
meld refuses comparing files with NUL symbols
Last modified: 2017-10-20 22:49:17 UTC
Hello. Meld refuses to compare files containing NUL characters, claiming that they are binary:

http://git.gnome.org/browse/meld/tree/meld/filediff.py#n813

    if nextbit.find("\x00") != -1:
        t.buf.delete(*t.buf.get_bounds())
        add_dismissable_msg(t.pane, gtk.STOCK_DIALOG_ERROR,
                            _("Could not read file"),
                            _("%s appears to be a binary file.") % t.filename)

Commenting out these lines doesn't help, however, and the comparison result becomes a mess. I often have to compare text files that have a few lines with embedded NULs, and it's very inconvenient to filter them out before Meld'ing. Thanks.
I would expect that commenting out the lines *should* fix it, but only if you have the files' text encoding listed in Preferences -> Encoding. I'm guessing that these files are UTF-16; that's the most common problematic text format that has embedded null bytes.

The way I'd think about fixing this would be to:
* Keep the message (since it's usually correct)
* Add a button to the message saying 'Open anyway'

If the user selects 'open anyway', then we proceed to load the file, and try out all of our encoding options as we do normally.
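A rough sketch of the flow proposed here, for illustration only. The names `load_text`, `BinaryFileError` and `open_anyway` are hypothetical and are not part of Meld; the encoding list stands in for whatever is configured in Preferences -> Encoding.

```python
class BinaryFileError(Exception):
    """Raised when data looks binary and the check was not overridden."""


def load_text(data, encodings=("utf8", "latin1"), open_anyway=False):
    """Decode raw bytes, trying each candidate encoding in order.

    Hypothetical helper, not Meld's actual API. Data containing NUL
    bytes is refused unless the caller passes open_anyway=True, which
    models the user clicking an 'Open anyway' button.
    """
    if b"\x00" in data and not open_anyway:
        raise BinaryFileError("appears to be a binary file")
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue  # fall through to the next configured encoding
    raise ValueError("none of the configured encodings could decode the data")
```

With this shape, the default behaviour is unchanged (the message is still shown), and a second call with `open_anyway=True` runs the usual encoding loop on the same bytes.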
Those files aren't UTF-16; they are Latin-1 with just one or two embedded NULs. So neither variant is really good. First, you *may* show the message, but it shouldn't be modal, and the user should be able to disable it completely. Second, if a user asks to compare files, he or she really wants that; otherwise the user wouldn't have tried to open them in the first place. Extra confirmation is redundant and annoying. So the proper fix would be to handle any characters in the file correctly. Thanks.
*** Bug 677237 has been marked as a duplicate of this bug. ***
Since this is a bug report mentioning UTF-16 encoding support being problematic: is there a way to use meld with such files? I tried adding utf16 utf-16 to the encoding in preferences, but that did not help.
Not currently, no. The issue is that we have a binary file check that is independent of the encoding check. Even if the encoder claims that it can handle a file, if it looks like it's binary (i.e., contains embedded nulls) then Meld will refuse to open it. There are good reasons for this (e.g., latin1). I think the easiest workaround here would be to look at adding an 'open it anyway' button to the message notification that appears, but I haven't looked at doing so.
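To illustrate why the two checks are independent, here is a small Python sketch. The helper names `looks_binary` and `can_decode` are made up for this example and are not Meld functions. It shows that UTF-16 text containing only ASCII characters still has NUL bytes, and that a latin1 decoder accepts arbitrary bytes, so a successful decode alone says nothing about whether a file is text.

```python
def looks_binary(data):
    """Heuristic used in this sketch: any NUL byte means 'binary'."""
    return b"\x00" in data


def can_decode(data, encoding):
    """Return True if the codec accepts the bytes without error."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Legitimate UTF-16 text: each ASCII character carries a zero byte.
utf16_text = "hello\n".encode("utf-16-le")

# Arbitrary binary data: every possible byte value once.
binary_blob = bytes(range(256))

# latin1 maps every byte to a code point, so it never rejects input.
blob_decodes_as_latin1 = can_decode(binary_blob, "latin1")
```

Here `looks_binary(utf16_text)` is true even though the data is valid text, and `blob_decodes_as_latin1` is true even though the data is not text at all, which is why neither check can replace the other.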
Turns out this changed when we moved to using GtkSourceView loading. We just don't do this check any more, so this is "fixed" in the sense that it will do a bad thing, but hey that's what's being asked for. As for UTF-16, that should "just work" with the current build, but if not then please feel free to open a new bug.