Bug 578613 – Conflict solver mode (auto merge with ancestor)

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 578613 - Conflict solver mode (auto merge with ancestor)


Summary:	Conflict solver mode (auto merge with ancestor)


Status:	RESOLVED FIXED

Product:	meld
Classification:	Other
Component:	filediff
Version:	git master
Hardware:	Other All

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	Stephen Kennedy
QA Contact:	Stephen Kennedy

URL:
Whiteboard:

Duplicates:	169233 (view as bug list)
Depends on:
Blocks:

Reported:	2009-04-10 15:36 UTC by Piotr Piastucki
Modified:	2012-11-09 09:33 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Conflict solver mode patch (35.91 KB, patch) 2009-04-10 15:39 UTC, Piotr Piastucki	none	Details \| Review
2nd version auto merge with bug fixes and improvements (39.29 KB, patch) 2009-04-14 13:46 UTC, Piotr Piastucki	none	Details \| Review
auto merge v.3 (39.57 KB, patch) 2009-04-16 10:50 UTC, Piotr Piastucki	none	Details \| Review
auto merge v.3 patch re-based to ref f995505fcfcea8815dd106acbe0630f6e1aed50f after repository migration to git. (38.17 KB, patch) 2009-04-19 06:33 UTC, Piotr Piastucki	none	Details \| Review
auto merge v.4 (38.48 KB, patch) 2009-04-21 15:49 UTC, Piotr Piastucki	none	Details \| Review
auto merge v.5 (37.83 KB, patch) 2009-05-06 07:47 UTC, Piotr Piastucki	none	Details \| Review
auto merge v6 (31.30 KB, patch) 2009-07-01 07:23 UTC, Piotr Piastucki	none	Details \| Review
auto merge v7 (39.90 KB, patch) 2009-07-05 15:51 UTC, Piotr Piastucki	none	Details \| Review
auto merge v8 (40.18 KB, patch) 2009-07-06 19:35 UTC, Piotr Piastucki	none	Details \| Review
auto merge v8 split into smaller patches (10.41 KB, application/x-gzip) 2009-07-06 19:36 UTC, Piotr Piastucki		Details
Fixed MyersMatcher optimizations (23.87 KB, patch) 2009-07-07 16:10 UTC, Piotr Piastucki	none	Details \| Review
Rebased and slightly modified version of the v8 patchset (10.13 KB, application/x-gzip) 2009-07-08 02:12 UTC, Kai Willadsen		Details
Slightly modified v8.1 (bugfixes + cleanup) (10.71 KB, application/x-gzip) 2009-07-17 18:14 UTC, Piotr Piastucki		Details
Auto merge v9 (8.83 KB, application/x-gzip) 2009-07-18 13:36 UTC, Piotr Piastucki		Details
Rebased patchset with improved matcher (9.61 KB, application/x-gzip) 2009-11-03 15:52 UTC, Piotr Piastucki		Details
Rebased patchset with matcher improvements (11.14 KB, application/x-gzip) 2009-11-04 14:15 UTC, Piotr Piastucki		Details
Changes in integration with FileDiff (8.57 KB, application/x-gzip) 2009-11-06 07:26 UTC, Piotr Piastucki		Details
Fixed patch (9.08 KB, application/x-gzip) 2009-11-09 11:33 UTC, Piotr Piastucki		Details
Updated patch set (8.36 KB, application/x-bzip) 2009-11-21 23:35 UTC, Kai Willadsen		Details
Updated patch with more elaborated commit message (7.38 KB, application/x-bzip) 2009-11-25 21:46 UTC, Piotr Piastucki		Details
Add call to Differ_auto_merge from Merger (2.32 KB, patch) 2009-12-14 08:48 UTC, Piotr Piastucki	none	Details \| Review
Refactor auto merge code (7.30 KB, patch) 2010-03-12 12:27 UTC, Piotr Piastucki	none	Details \| Review
Corrected refactoring patch (7.30 KB, patch) 2010-03-13 22:32 UTC, Piotr Piastucki	none	Details \| Review
Two minor patches addressing issues 3 and 4 above (1.21 KB, application/x-compressed-tar) 2010-03-22 07:35 UTC, Kai Willadsen		Details

Description Piotr Piastucki 2009-04-10 15:36:41 UTC

This patch adds conflict solver mode to meld.
The patch includes 2 new modules:
matchers.py - alternative matchers: 
1) MyersMatcher (seems to be the most suitable matcher for merging/conflict solving)
2) LCSMatcher (well, pretty useless :))
merge.py - enhanced Differ class that handles some conflicts automatically

The patch also addresses some issues I have spotted in the code (see tickets #578303, #577092).

There are as few UI-related changes as possible: only the middle pane is editable when running in conflict solver mode and icons and mouse actions are handled accordingly.

In order to start meld in conflict solver mode it needs to be run with 4 file parameters as follows:
meld myfile basefile theirfile mergedfile

This is the initial version, but if such a functionality is welcome I am eager to participate in further development.

Comment 1 Piotr Piastucki 2009-04-10 15:39:07 UTC

Created attachment 132475 [details] [review]
Conflict solver mode patch

Comment 2 Stephen Kennedy 2009-04-11 14:11:30 UTC

Hi Piotr, that's a feature several people have been asking for and new contributors are always welcome. BTW we're gearing up for a release soon so if you want early feedback, join the mailing list and post a link to here.

Comment 3 Piotr Piastucki 2009-04-14 13:46:55 UTC

Created attachment 132641 [details] [review]
2nd version auto merge with bug fixes and improvements

This is the 2nd version of the auto merge patch. 
What's new:
- bug fixes, performance improvements and code clean-up
- proper support for text filters
- indicator of remaining unresolved conflicts (statusbar)
- option to use additional text filters when for conflicting chunks (e.g. ignore whitespace changes)

Comment 4 Piotr Piastucki 2009-04-16 10:50:03 UTC

Created attachment 132749 [details] [review]
auto merge v.3

A couple of additional bug fixes and improvements:
- fixed mouse actions
- fix in Merger class
- improved conflict size calculations

The patch is now fully functional IMHO.

Comment 5 Piotr Piastucki 2009-04-19 06:33:36 UTC

Created attachment 132894 [details] [review]
auto merge v.3 patch re-based to ref f995505fcfcea8815dd106acbe0630f6e1aed50f after repository migration to git.

Comment 6 Nathan Baker 2009-04-19 20:27:42 UTC

How soon will this code be available in a release? I'd like to test it and I'm willing to grab the source from git if necessary, but since I use meld for work then I prefer to run stable bits when possible :)

Comment 7 Piotr Piastucki 2009-04-21 15:49:06 UTC

Created attachment 133051 [details] [review]
auto merge v.4

2 changes:
1) mouse actions are fixed
2) the way conflicting lines are shown (marker + original text)

Comment 8 Piotr Piastucki 2009-05-06 07:47:34 UTC

Created attachment 134085 [details] [review]
auto merge v.5

Changes:
- refactoring and clean-up
- smarter conflict solver
- Myers matcher performance improvements

Comment 9 Stephen Kennedy 2009-05-09 22:38:01 UTC

Hi Piotr, thanks for the continuing patches. I have a few questions.
Is the myers matcher much better/faster than the python difflib one? Or
is there some other reason it's included?

For the "merge" feature, don't we want to be able to auto merge at any
time there is a 3 way merge open? i.e. just hook the operation up to a
button. I'd aim to have a non modal interface where possible. I notice
for instance merge-mode makes two panes non editable.

How should the "4 file" (yours,base,mine,output) merge work? Should the
user have to explicitly confirm that the merge was a) complete and
b) successful when the tab is closed? I think currently tools currently
regard any write to the base file as "success" which is not great.

Comment 10 Piotr Piastucki 2009-05-10 17:16:53 UTC

Hi Stephen,

Quite a lot of questions, but I'll try to address all of them :)

1) Myers diff. This is probably the best known diff algorithm and it is commonly used in almost all VCS. It is much slower than difflib.SequenceMatcher algorithm, but it is guaranteed to return the shortest edit script (or the longest common sequence, which means exactly the same) while SequenceMatcher does not guarantee anything but the fact that the diff will look nice in some cases. See http://piastucki.bdl.pl/matchers.html for comparison. Original paper by Myers: http://www.xmailserver.org/diff2.pdf. (BTW: nice reading about diff optimizations http://neil.fraser.name/writing/diff/)

2) 4 files. Actually, auto merge requires 3 source files and the fourth file name (name only!) is used as the name of the merged file.

My implementation is based on the assumption that this mode is used to resolve conflicts reported by SVN, CVS or similar VCS. In this case there is no point in changing the source files, because they are just kind of "links":
base - ancestor (some revision in the past in repository, cannot be changed)
theirs - current version in the repository (cannot be changed either)
mine - modified local copy before the conflict occurred (changing this file does not make any sense either as we can modify the merged one instead)

In case of SVN the files are named as follows (NNNN - some revision number):
mine - filename.ext.working
base - filename.ext.merge-left.rNNNNN
theirs - filename.ext.merge-right.rNNNNN
merged - filename.ext

That's why both panes are disabled.

This is exactly my use case and I was desperately searching for a decent tool I could use to resolve SVN conflicts without the need to run TortoiseMerge under Wine. So far I managed to complete 3 simple (up to 10 conflicting files, up to 100 conflicting lines in a file) merges with meld :)

3) Auto merge in 3-way diff mode. If auto merge is invoked to resolve conflicts from 3-way diff then all panes should probably remain editable as they represent physical files in different folders for instance. Adding a "magic wand" button that will perform auto merge on demand in 3-way diff mode sounds reasonable too.

4) The way auto merge works.
Technically speaking: we compute 3-way diff between base, mine and theirs. Then we apply the changes to the base file (all non-conflicting and all conflicting that we can resolve automatically). Finally, we compute 3-way diff between merged (base + changes), mine and theirs.

There is a simple counter of unresolved conflicts in the statusbar. When the counter reaches 0 the user should be able to save the middle file, otherwise there should be a warning asking the user to confirm if the file with unresolved conflicts should really be saved. Moreover, there may be a "Resolve" button that will be enabled as soon as the counter reaches 0.

5) Missing features. I think there are at least 2 missing features: integration with VCS plugins (dbl click on conflicting file in VCS view should run auto merge) and buttons to jump to the next/previous conflict.

Cheers,
Piotr

Comment 11 Stephen Kennedy 2009-05-13 21:37:54 UTC

1) Ah thats interesting. I'm surprised that the Myers diff is slower because
thats the algoritm that gnu diff uses (albeit with major tweaks). And according
to http://selenic.com/pipermail/mercurial/2005-December/006065.html it's still
faster than a c implementation of the difflib algorithm (bdiff)

The one thing which attracted me about the Myers algorithm is that it looks
more friendly for incremental update. I've not seen any papers on it, but it
looks like a "greedy local snake update" algorithm would handle the most common
case for meld which is updating while interactively editing.

For this patch, diffutil already has logic to locate conflicts. Isn't it
enough to go through the list given by _merge_diffs in reverse order?


2) I really like the idea of providing an interface to explicitly confirm that
a conflicts have been resolved. One thing about the editability though -
sometimes a conflict will require manual editing which won't match either
theirs or mine. Won't that leave a conflict marker? I often apply the merged
changes into theirs or mine as a reminder to self that the conflict has been
solved.


One other implementation issue is that it would be great if to reduce the
number of "if merge_mode" blocks to the absolute minimum. It might be easier
to have a merge_output member instead of having to check len(files) and
merge_mode.

Comment 12 Piotr Piastucki 2009-05-14 05:40:42 UTC

Stephen,

>For this patch, diffutil already has logic to locate conflicts. Isn't it
>enough to go through the list given by _merge_diffs in reverse order?

IMHO, it is not good enough. As mentioned before the edit script generated by difflib.SequenceMatcher is far from optimal in many cases. As a result the number of changes is higher than it needs to be (i.e. the number of changes returned by Myers diff). Obviously it results in more conflicts. I'd rather wait e.g. 200ms more when diff is computed than solve these additional conflicts manually.

>Ah thats interesting. I'm surprised that the Myers diff is slower because
>thats the algoritm that gnu diff uses

Yes, there are some tweaks in gnu implementation. It may not yield the shortest edit script because of them sometimes, though it happens very rarely. For me the performance of Myers diff implementation included in this patch is satisfactory, but I can consider porting some of these tweaks from gnu diff if it is too slow. 
I cannot see any way Myers diff can be faster than difflib algorithm apart from situations when the two files are almost the same. If the number of differences is high Myers diff will need to perform much more iterations and string comparisons to find the middle snake. And if the files are completely different there is no way we can significantly speed up Myers diff.

>One thing about the editability though - sometimes a conflict will require  
>manual...

That's why I introduced the indicator or unresolved conflicts (it actually counts lines of code). The number of unresolved conflicts shown by this patch includes only the conflicts that could not be resolved automatically when the merged file was created. You're right that if there is an unresolved conflict that is replaced by a piece of code which does not match either theirs and mine it will be marked as a conflict, however, the number of unresolved conflicts will be lowered at the same time.

> if merge_mode

Yes, I fully agree this code may need some cleanup.

Comment 13 Piotr Piastucki 2009-05-14 07:25:29 UTC

See http://piastucki.bdl.pl/matchers2.html on how to fool difflib easily :)

>I've not seen any papers on it, but it
>looks like a "greedy local snake update" algorithm would handle

Currently I store the list of snakes locally inside a method, but the middle snake may be made turned into a class field and updated on text changes instead of recomputing the diff.

Comment 14 Piotr Piastucki 2009-07-01 07:23:10 UTC

Created attachment 137671 [details] [review]
auto merge v6

I have been testing and improving this patch for a couple of weeks and here are the results :) The auto merge logic is much smarter and more reliable then before. More info regarding the algorithm can be found here: http://piastucki.bdl.pl/meld/automerge.html

Regards,
Piotr

Comment 15 Piotr Piastucki 2009-07-05 15:51:32 UTC

Created attachment 137880 [details] [review]
auto merge v7

Another bunch of changes and improvements:
1) significant performance optimizations in MyersMatcher (pre-processing)
2) as suggested by Stephen, merge_mode flag has been removed and a new class called FileMerge has been added. A couple of methods have been introduced in FileDiff by extracting the existing code to allow FileMerge to override them:
- _get_custom_status_text()
- _get_custom_textbuffer()
- _after_files_loaded()
- _linkmap_draw_icon()
- _linkmap_process_event()

Comment 16 Kai Willadsen 2009-07-05 22:55:13 UTC

This feature is obviously great, but the patch is *huge* and touches lots of files. It would be a whole lot easier to merge this if the patch was split up and had commit messages. There are quite a lot of changes that appear unrelated to auto-merge itself. Some of these changes look fine and can probably be merged straight away, whereas other bits, like the status bar changes, are IMO just taping over existing problems and shouldn't go in in their current state.

As a start, I'd be happy to merge most of the diffutil.py changes if they were accompanied by real commit messages. Just commenting on the diffutil.py parts of the patch for the moment...

+    def _get_matcher(self, a, b):
+        return IncrementalSequenceMatcher(None, a, b)
+

The matcher class should probably be a class member variable, not an instance method.

+    def _auto_merge(self, using, texts):
+        l0, h0, l1, h1, l2, h2 = self._merge_blocks( using )
+        if h0-l0 == h2-l2 and texts[0][l0:h0] == texts[2][l2:h2]:
+            if l1 != h1 and l0 == h0:
+                tag = "delete"

This looks like a bugfix. If it is, then what does it fix, and can it be committed separately?

+        out0 = (tag, l1, h1, l0, h0)
+        out1 = (tag, l1, h1, l2, h2)
+        yield out0, out1

Why is _auto_merge a generator?

+                if seq0[0][LO] == seq1[0][LO]:
+                    if seq0[0][0] == 'insert':
+                        high_seq = 0
+                    elif seq1[0][0] == 'insert':
+                        high_seq = 1

Another bugfix? Also, the rest of the file uses double quotes for string literals.

+                if high_mark == other_diff[LO] and (high_diff[0] != 'insert' or other_diff[0] != 'insert'):
+                    break

Unless I've messed something up,
    if high_mark == other_diff[LO] and not (high_diff[0] == other_diff[0] == "insert"):
looks more natural to me. Is this another bugfix? Also, double quotes.

Unrelatedly, the webpage explaining the auto-merge stuff is really nice. When this feature goes in, I'd suggest that maybe that page could live somewhere under http://live.gnome.org/Meld.

Comment 17 Piotr Piastucki 2009-07-06 19:35:14 UTC

Created attachment 137932 [details] [review]
auto merge v8

No new functionality, just some refactoring.

Comment 18 Piotr Piastucki 2009-07-06 19:36:30 UTC

Created attachment 137933 [details]
auto merge v8 split into smaller patches

Comment 19 Piotr Piastucki 2009-07-06 19:39:32 UTC

Kai,

As you wish :)
Here is the patch divided into 8 smaller patches. Some explanation can be found below or (with some graphs) at http://piastucki.bdl.pl/meld/automerge_patches.html

BTW: I am not a Python developer (90% Java, 10% C), so do not hesitate to change anything in my code if you find it awkward.

0001-Improve-conflict-detection-for-insert-chunks.patch
Fix conflict detection between insert hunks and other hunks. A conflict was reported when replace or delete hunk started at the same position as insert. As for the change in FileDiff class, I do not see any reason why this code was there. Hunks should be highlighted according to the results returned by Differ as it is too late to change anything when it comes to drawing to the screen. BTW: IMHO, Blank lines should be removed before running diff algorithm not at highlighting.

0002-Improve-highlighting-of-conflicting-delete-hunks.patch
Change the way the same overlapping delete hunks are highlighted and make it consistent witht the way non-overlapping delete hunks are shown.

0003-Extract-methods-in-Differ-class.patch
Make room for extending Differ class needed by Merger class. _auto_merge() is a generator because Merger may return more than one chunk in a single call.

0004-Replace-get_iter_at_line-with-get_iter_at_line_or_eo.patch
Minor fix for incorrect highlighting in some rare cases (EOF). Feel free to skip this patch when in doubt.

0005-Introduce-methods-dummy-that-will-be-overriden-lat.patch
Now, add a bunch of methods that do not do anything useful in Differ, but they will be overridden in Merger to make the auto merge mode work.

0006-Extract-linkmap-drawing-methods.patch
Extract some methods based on code related to drawing linkmap. They will be used by Merger to change the look of linkmap in auto merge mode.

0007-Add-auto-merge-classes.patch
Just add all new classes.

0008-Accept-4-parameters-and-run-auto-merge-in-this-case.patch
Change command line arguments check to accept 4 files and run auto merge mode.

Comment 20 Kai Willadsen 2009-07-07 06:58:40 UTC

*** Bug 169233 has been marked as a duplicate of this bug. ***

Comment 21 Kai Willadsen 2009-07-07 07:21:41 UTC

The first two patches look good to me, so I've pushed them. The get_iter_at_line_or_eof() patch looks reasonable to me as well, but I don't know how to reproduce the bug. Do you have a testcase? I'm still looking at the rest of the patches.

Comment 22 Vincent Legoll 2009-07-07 07:59:58 UTC

Looking at the pictures from the kink you sent, I wonder why for the picture for the first patch miss the blue highlighting on line 8 in the first pane, before the line 6 & 7 deletion, this looks like a bug. Is that bug a regression or is it already present ?

Comment 23 Piotr Piastucki 2009-07-07 08:14:10 UTC

I do not quite see your point. Would you please clarify? The pictures called before and after refer to results before and after applying the patch. There is no difference between line 8 in left and middle pane so there is no highlighting.

Comment 24 Vincent Legoll 2009-07-07 08:19:30 UTC

Ah OK I thought the line could be highlighted blue with no inline diffs showing, but maybe it would be misleading, so it is fine as is.

Comment 25 Piotr Piastucki 2009-07-07 16:10:54 UTC

Created attachment 137980 [details] [review]
Fixed MyersMatcher optimizations

There was a bug in MyersMatcher introduced with performance improvements. I am attaching fixed version of 0007-Add-auto-merge-classes.patch.

Comment 26 Kai Willadsen 2009-07-08 02:12:18 UTC

Created attachment 138013 [details]
Rebased and slightly modified version of the v8 patchset

This archive contains the remaining patches in the series from attachment #137933 [details], including the updated version of MyersMatcher from attachment #137980 [details]. I've removed the _get_matcher method in favour of a class member, and updated the later patch accordingly.

Comment 27 Piotr Piastucki 2009-07-17 18:14:26 UTC

Created attachment 138628 [details]
Slightly modified v8.1 (bugfixes + cleanup)

A couple of minor fixes and improvements + some code clean-up

Comment 28 Kai Willadsen 2009-07-17 22:48:49 UTC

The lastest version doesn't apply against head. The first two patches have already been pushed; the fifth fails to apply. I'll push the get_iter_at_line_or_eof() patch if you can provide a test case.

Speaking more generally, I wonder if it would be reasonable to add Merger to diffutil.py, and move IncrementalSequenceMatcher to matchers.py.

Comment 29 Piotr Piastucki 2009-07-18 13:36:03 UTC

Created attachment 138667 [details]
Auto merge v9

I have removed obsolete patches and fixed the one that did not apply against head. 
Regarding get_iter_at_line_or_eof(), the issue is reproducible only if there are 2 replace chunks that are longer than 8000 characters and they end at EOF. I suppose it happens very rarely, so the patch is not high priority. 
The issue is fixed by the following line:
ends = [get_iter_at_line_or_eof(b, l) for b, l in zip(bufs, (c[2], c[4]))]
The other 2 lines modified by the patch do not fix anything, they are changed just to make the code use get_iter_at_line_or_eof() consistently everywhere.
See http://piastucki.bdl.pl/meld/automerge_patches.html for a screenshot.

Moving all kinds all SequenceMatchers to matchers.py module - good idea, actually that's why I named the module like this.
Adding Merger to diffutil.py - I am not sure, but it seems to be OK as long as the code in diffutil.py remains easy to read.

Comment 30 Jeremy Morton 2009-10-29 17:41:32 UTC

I'd be really keen to see this patch make its way in to Meld.  Meld is a great diff tool but it'd be great if it functioned as a merge tool, as well - this would make it a complete graphical SVN solution I could use with RapidSVN.

Any news on how long this might take to get in?  The sooner, the better. :-)

Comment 31 Kai Willadsen 2009-10-30 06:54:50 UTC

This dropped off my radar for a while, but I too would love to get it in. Other than necessary minor fixes, there are some questionable additions to FileDiff. It feels like FileDiff has been hacked up a bit to fit in things for the new FileMerge class, and I'd much rather that any odd workarounds went into FileMerge.

Specific things I'm not mad keen about include:
 * the _after_files_loaded() hook - it does nothing in FileDiff
 * the _get_custom_textbuffer() accessor - I'm not sure why this is even necessary
 * the extraction of paint_pixbuf_at() - I guess it's okay, but it should probably just be a top-level utility function
 * the status text changes - I'm working on a patch which changes the implementation of Meld's status bar so that adding stuff to the statusbar can be done in a less hacky way. However, these changes can just be left out and applied later anyway.

In the quick review I did of the new classes, they looked pretty good. The challenge is minimising the collateral damage associated with getting them in.

Comment 32 Piotr Piastucki 2009-10-30 17:09:19 UTC

Hi,
I think it should be up to meld developers how, when and if this functionality is included, so the integration provided in the patch has merely 2 goals:
- it should work
- it should make as few changes to the existing code as possible (e.g. avoid any changes but method extraction)
Of course, I will try to help if any additional info is needed.
* _after_files_loaded() - I think having empty methods is acceptable as long as  FileDiff is considered to be a base class that exposes some extension points to its children, otherwise it should be reworked
* _get_custom_textbuffer() - kind of a hack. I added the 4th hidden buffer that is returned by this method to re-use existing logic that can load 3 files and to avoid showing one of them (common ancestor) at the same time.
Currently I am trying to improve Myers matcher performance by removing some Python code (which is slow) and replacing it with operations on maps (which are fast).

Comment 33 Piotr Piastucki 2009-11-03 15:52:39 UTC

Created attachment 146847 [details]
Rebased patchset with improved matcher

I am attaching a rebased version of the patch. Moreover, it contains a new version of matchers.py which includes 4 implementations of Myers algorithm. By default the fastest one (based on http://research.janelia.org/myers/Papers/np_diff.pdf) is used. The remaining implementation may be removed, I left them for the ones who are curious :)

Comment 34 Piotr Piastucki 2009-11-04 14:15:10 UTC

Created attachment 146913 [details]
Rebased patchset with matcher improvements

This patchset contains an additional patch that removes all unneeded stuff from matchers.py and fixes a couple of issues I have spotted
 
0004-Enable-auto-merge-mode.patch needs to be reworked as it is just an ugly hack that allows users to pass 4 command line arguments.

BTW: I have been looking at PatienceSequenceMatcher (both Python and C implementations included in Bazaar) which is supposed to simplify merges by making sure that no unique matching lines are missed. After a couple of tests I think it looks quite interesting and using it instead of Myers sequence matcher may also be an option. I wonder what bzr users think about it.

Comment 35 Piotr Piastucki 2009-11-06 07:26:35 UTC

Created attachment 147079 [details]
Changes in integration with FileDiff

Here is another patch which tries to address some of the issues mentioned by Kai. Instead of introducing dummy methods in FileDiff, _set_files_internal() has been split.

Comment 36 Kai Willadsen 2009-11-06 07:35:40 UTC

I'll have a proper look when I have time (next week?) but from a quick glance this looks *much* better to me. Thanks for pushing through with this.

Comment 37 Piotr Piastucki 2009-11-09 11:33:07 UTC

Created attachment 147265 [details]
Fixed patch

This patchset includes an additional patch that fixes linkmap actions and Myers matcher optimisations.

Comment 38 Kai Willadsen 2009-11-21 23:33:52 UTC

I've just pushed the first two patches. I'm attaching an updated patch set with formatting fixes for the classes added in patch 1 (was patch 3). Patches 1, 3-6 should all be squashed into a single commit, but I'm giving them individually here for convenience.

There are PEP8-related formatting issues remaining in patch 2 (no spaces after commas) that need to be fixed. As you said above, this patch is also not really the right way to do it. I haven't thought about it much, but maybe a --merge command line flag would be appropriate?

I've read through the main patch in a cursory fashion, and it looks reasonable to me. However, the lack of a good commit message is a real problem for this patch. The patch adds three new classes in 619 lines of code, but the commit message is just "Add auto merge classes". Could I ask that you please write a reasonably detailed commit message explaining what was added, what it does, and why?

For a better explanation than I can give, see:
 http://lists.cairographics.org/archives/cairo/2008-September/015092.html

As an imperfect example, look at the commit message I wrote for my (still-not-applied) cursor tracking patch, which is *much* smaller than this addition:
 http://mail.gnome.org/archives/meld-list/2009-September/msg00010.html
(the big patch is 0002)

Otherwise, this patch set is basically ready to apply.

Comment 39 Kai Willadsen 2009-11-21 23:35:13 UTC

Created attachment 148251 [details]
Updated patch set

Comment 40 Piotr Piastucki 2009-11-25 21:46:58 UTC

Created attachment 148484 [details]
Updated patch with more elaborated commit message

I have combined patches 1, 3-6 together and added some explanation to the commit message. Please let me know if more info may be needed.

Comment 41 Kai Willadsen 2009-12-04 01:38:35 UTC

I've committed these patches now.

There are some remaining problems (see below) but they can be worked out
post-commit. The big thing that should be fixed before a release is figuring out
what the command-line interface should be; the four-argument version that's
there now works, but it's not good. It would also be nice to have a way of
starting an auto-merge from within Meld, but that's not essential.

Questions, things that would be nice to fix, and future options:

 - find_common_prefix/find_common_suffix seem unnecessarily complicated. Unless
   it's actually faster to do the binary split (is it? I can't see why it would
   be) this should almost be a copy of os.path.commonprefix.

 - In MyersSequenceMatcher.__init__, there is the comment:
      "fields needed by preprocessor so that preprocessing may shared by more
       than 1 LCS algorithm"
   but there is only one LCS algorithm present. Is this a left-over from
   experiments with PatienceSequenceMatcher?

 - There are two pre-processing optimisations in MyersSequenceMatcher, and
   switches to turn them on and off. Why would anyone want to turn them off?
   If they're effective optimisations, can the options simply be removed?

 - In MyersSequenceMatcher.preprocess, we have:
       self.lines_discarded = m - j > 10 or n - k > 10
   is the constant a threshold for whether it is worthwhile using the
   post-discard version? It would be nice to have an explanatory comment.

 - In FileMerge.__init__ takes a num_panes argument, but surely three is the
   only sensible value?

 - In FileMerge._merge_files, why do we create a second merge.Merger?

 - FileMerge.set_files looks like it deals with fewer than four files, but as
   far as I can see this shouldn't happen, and won't work even if it does.

 - The start of Merger._auto_merge duplicates Differ._auto_merge. Why not just
   call Differ._auto_merge first and then do a pass to resolve any 'conflict'
   blocks?

 - It would be nice to have some test files available. We don't currently have
   any regression tests or anything, but it would be reasonable to start
   keeping a few simple testcases, etc. A good start might be the example files
   you use for http://piastucki.bdl.pl/meld/automerge.html?

Anyway - this is all pretty nice, and should definitely be featured in the
release notes. Thanks for all the work.

Comment 42 Piotr Piastucki 2009-12-04 08:57:45 UTC

1) prefix/suffix matching - yes, it is much faster to do the binary split, for more details see http://neil.fraser.name/news/2007/10/09/. Basically, Python built-in functions are much faster than the code written in Python so it is a good idea to replace loops in the code with calls to built-in stuff. That's why I would like to find a good way to use maps more heavily in my implementation of Myers diff, however, I have not found it yet :)

2) fields needed by preprocessor - yes, this is a left-over from the experiments

3) I cannot see any reason to switch them off, the switches are left just in case the optimizations cause any problems

4) is the constant a threshold for whether it is worthwhile using the post-discard version? Yes, it is. This is kind of a heuristic based on the tests I ran and it definitely requires a good comment.

5) Yes, that's right. 

6) Hmmmm, good question. I remember that I had some issues when the highlighting code tried to call the main Merger instance while the files were being merged (AFAIR highlithing was broken). But there were a lot of changes in this code since then and maybe this is not the case any more...

7) FileMerge.set_files, it does happen when you click Refresh for instance and it actually seems to work.

8) Yes, calling Differ._auto_merge is a good idea.

Comment 43 Piotr Piastucki 2009-12-14 08:48:29 UTC

Created attachment 149683 [details] [review]
Add call to Differ_auto_merge from Merger

I am attaching a patch that addresses issue #8 by adding a proper call to Differ._auto_merge in Merger._auto_merge.

Comment 44 David Hostetler 2010-02-07 01:17:57 UTC

Can someone clarify whether this patch would allow support for 2-way auto merge?

The focus seems to have been for scm-style merging which takes 4 file arguments (3 inputs, one output).  But an equally valid need exists for 2-way auto merge (2 inputs, one output).

I use Gentoo, and it's package management system includes configuration file merging that uses a command format like the following:

sdiff -s -o %merged %orig %new

Using an alternative like xxdiff, the command becomes:

xxdiff --decision --show-merged-pane --merged-filename %merged %orig %new

Or with kdiff3, it's:

kdiff3 --auto -m -o %merged %orig %new

But since Meld doesn't provide any command-line options for output filename nor explicit auto-merge, it's unusable for this purpose.

Thanks!

Comment 45 Piotr Piastucki 2010-03-12 12:27:20 UTC

Created attachment 155959 [details] [review]
Refactor auto merge code

I am attaching an enhanced version of the previous patch. Apart from replacing some redundant code with a proper call to Differ._auto_merge it also extracts AutoMergeDiffer class from Merger class.

Comment 46 Kai Willadsen 2010-03-13 01:19:50 UTC

When invoking auto-merge mode with this patch applied, I get a traceback ending in:

+ Trace 220938

File "/home/foo/meld/meld/filemerge.py", line 78 in _merge_files
```
for panetext[1] in merger.merge_3_files():
```

AttributeError: 'Merger' object has no attribute 'merge_3_files'


I assume that this is supposed to be merge_3_way. Also, while the commit message itself is fine, please keep in mind line length; lines should be wrapped to 72 characters.

Comment 47 Piotr Piastucki 2010-03-13 22:32:04 UTC

Created attachment 156082 [details] [review]
Corrected refactoring patch

Here is a fixed version of the previous patch.

Comment 48 Kai Willadsen 2010-03-14 01:22:16 UTC

Looks good, thanks. Pushed.

Comment 49 Kai Willadsen 2010-03-22 07:35:19 UTC

Created attachment 156717 [details]
Two minor patches addressing issues 3 and 4 above

These patches should address issues 3 and 4 that I mentioned above. Piotr, could you please have a look and check that they're okay?

Comment 50 Piotr Piastucki 2010-03-23 07:41:44 UTC

Kai, both patches look fine.

Comment 51 Piotr Piastucki 2012-11-09 09:33:20 UTC

Implemented in meld 1.7