Bug 784306 - Very slow performance comparing CSV files - 25x slower than 2009 Windows tool
Status: RESOLVED OBSOLETE
Product: meld
Classification: Other
Component: filediff
Version: 3.17.x
Hardware: Other Linux
Importance: Normal major
Target Milestone: ---
Assigned To: meld-maint
QA Contact: meld-maint
Depends on:
Blocks:
 
 
Reported: 2017-06-28 19:30 UTC by Dan Dascalescu
Modified: 2017-12-13 19:27 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Meld doesn't manage to show inline differences within several minutes (238.55 KB, application/zip)
2017-06-28 19:30 UTC, Dan Dascalescu

Description Dan Dascalescu 2017-06-28 19:30:30 UTC
Created attachment 354644
Meld doesn't manage to show inline differences within several minutes

I've migrated from Windows to Linux and one of the last tools that I need a good replacement for is CompareIt (http://www.grigsoft.com/wincmp3.htm). It's a really good old tool that's wicked fast at visually comparing files.

Attached is a stripped down test case of two CSV files that Meld takes ~3 seconds to determine are different before prompting to Keep highlighting. Highlighting inline differences takes more than 5 minutes (the tool was unusable beyond this point, so I quit it), while CPU utilization spikes to 100% of one core and Meld's input thread is slowed down to a crawl.

I wrote "25x" in the subject line based on the full test case, which takes Meld ~25 seconds to prompt to Keep highlighting, while CompareIt displays inline differences in one second.
Comment 1 Kai Willadsen 2017-07-15 02:52:01 UTC
So this basically comes down to: Meld is written in Python, and it's not fast at this sort of thing. You're also hitting worst-case behaviour, because we're just comparing one enormous sequence (i.e., there are no identical lines in that file to break it up).
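
To illustrate the worst case described above, here is a minimal sketch using Python's standard `difflib.SequenceMatcher` as a stand-in (an assumption: Meld has its own matcher, and this is not its code). The helper names `make_lines` and `changed_chunk_sizes` are hypothetical. It shows that when two files share no identical lines, the whole file becomes one changed chunk, while a few shared lines split it into small chunks.

```python
import difflib


def make_lines(suffix, with_shared_lines):
    """Build six CSV-like rows; optionally interleave lines common to both files."""
    lines = []
    for i in range(6):
        lines.append(f"row{i},{suffix}")
        # Hypothetical shared lines after rows 1 and 3 act as anchors.
        if with_shared_lines and i in (1, 3):
            lines.append(f"shared{i}")
    return lines


def changed_chunk_sizes(a, b):
    """Return (len_in_a, len_in_b) for every non-equal region between a and b."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Every line differs: the matcher finds no anchors at all.
no_anchors = changed_chunk_sizes(
    make_lines("old", False), make_lines("new", False)
)

# Same changed rows, but with two identical lines in both files.
with_anchors = changed_chunk_sizes(
    make_lines("old", True), make_lines("new", True)
)

print(no_anchors)
print(with_anchors)
```

Any inline (within-chunk) comparison then has to work on one chunk spanning the entire file in the first case, versus several short chunks in the second.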

I spent some time moving our existing comparison algorithm to C/Cython. If you'd like to check out the branch, it's at
    https://github.com/kaiw/meld/tree/cython
You'll need to install Cython for Python 3 and probably some other devel packages to compile the C extension, but otherwise you should be able to run from the cloned directory after a `python3 setup.py build`. If you get it running (or crashing) please let me know.

I don't have firm figures on the speedup, since I also ran out of patience waiting for the original comparison to finish, but with the Cython branch it now takes about 12 seconds. Realistically, I'm not going to spend significantly more time on making it faster, since I'm pretty happy with this. However, this is a huge change to the way we've done things, adds new build requirements, and I don't know how it's going to go on Windows and OSX. In short, this won't be making it into the 3.18 release series. I hope it'll end up in 3.20.



There's a second bug here, in that the comparison is happening on another thread, and shouldn't be causing our UI to stall. I'm not 100% certain what's going on there, but I suspect Python GIL interactions with... something. This is a regression as far as I'm concerned, but I think I've got a workaround figured out.
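
As a general illustration of why work on another thread can still slow the main thread in Python, here is a minimal sketch. It is not a diagnosis of Meld's stall or the workaround mentioned above; `busy_compare`, `count_ticks` and `ticks_while_worker_runs` are hypothetical names, and the busy loop is only a stand-in for a CPU-bound comparison. On a standard (GIL-enabled) CPython build, a pure-Python CPU-bound thread competes with the main thread for the interpreter lock, so the main thread typically completes fewer iterations in the same time.

```python
import threading
import time


def busy_compare(n):
    """Stand-in for CPU-bound, pure-Python comparison work."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def count_ticks(duration):
    """Count loop iterations the calling thread completes in `duration` seconds."""
    ticks = 0
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        ticks += 1
    return ticks


def ticks_while_worker_runs(duration):
    """Same measurement, but with a CPU-bound worker thread running alongside."""
    stop = threading.Event()

    def worker():
        while not stop.is_set():
            busy_compare(10_000)

    thread = threading.Thread(target=worker)
    thread.start()
    try:
        return count_ticks(duration)
    finally:
        stop.set()
        thread.join()


idle_ticks = count_ticks(0.2)
contended_ticks = ticks_while_worker_runs(0.2)
print(f"main thread alone: {idle_ticks} ticks")
print(f"with busy worker:  {contended_ticks} ticks")
```

The exact numbers vary by machine and Python version, so no expected output is given.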
Comment 2 Kai Willadsen 2017-07-16 00:53:21 UTC
The UI stalling should now be fixed in current master. This hasn't improved actual comparison performance, just the UI interactivity issue.
Comment 3 GNOME Infrastructure Team 2017-12-13 19:27:19 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed to further activity.

You can subscribe to and participate further in the new issue on our GitLab instance: https://gitlab.gnome.org/GNOME/meld/issues/138.