oneself
oneself

Reputation: 40301

CVS Merging Algorithm

What algorithm does CVS use when merging two branches (using the -j)?

Thanks

Upvotes: 2

Views: 2332

Answers (3)

Michael Krelin - hacker
Michael Krelin - hacker

Reputation: 143219

The CVS uses three revisions of file merging the differences between the two into the third. The actual merge doesn't use any information besides the three files checked out. You can see the details in CVS source code. The function in question is RCS_merge in src/rcscmds.c, which uses call_diff3 from diff/diff3.c (or something like that). You can, of course, satisfy your curiosity by looking up CVS sources yourself.

Upvotes: 0

olsner
olsner

Reputation: 991

There are two different forms of the "merge" command you can use, that do subtly different things:

  1. cvs up -j TAG
  2. cvs up -j TAG1 -j TAG2

The difference between the variants is how the "base" revision is selected, but the basic algorithm of either variant is that for each file, a diff between two revisions selected by CVS is applied on top of your current working copy.

In the first form, the merge base is the common ancestor of the given TAG and the working copy revision. So let's say your local revision is 1.38 (rev #38 on HEAD) and you're merging 1.34.4.2 (rev. 2 on branch 4 of rev #34 on HEAD) - the common ancestor will be 1.34. I believe this variant does a 3-way merge using the two diffs 1.34..1.38 and 1.34..1.34.4.2, producing conflicts where they mismatch.

In the second form, you're specifying the base revision yourself, so the end result is about the same as cvs diff -r TAG1 -r TAG2 | patch except for getting conflict markers from CVS.

Upvotes: 3

Ash
Ash

Reputation: 62135

I remember a reading a while ago that CVS merge actually uses the diff3 algorithm to perform merging.

This PDF article A Formal Investigation of Diff3, by Sanjeev Khanna, Keshav Kunal, and Benjamin C. Pierce from the Universtiy of Pennsylvania describes the diff3 algorithm in detail.

If focuses primarily on the properties of the merging algorithm itself, not on how it integrates with CVS.

In answer to your questions:

Tag, date awareness

From the CVS man page:

-j tag[:date] Merge in changes from revisions specified by tag or, when date is specified and tag is a branch tag, the version from the branch tag as it existed on date

2 or 3 way and text/binary awareness

diff3 defaults to a plain text diff. It compares (diffs) 3 versions of the file.

From the diff3 man page:

If `diff3' thinks that any of the files it is comparing is binary (a non-text file), it normally reports an error, because such comparisons are usually not useful. As with 'diff', you can force 'diff3' to consider all files to be text files and compare them line by line by using the '-a' or '--text' options.

Base version during comparison

The base version, according to the linked article, is the last common version (O) between the two current versions of the file (A and B). It first uses the 2 way diff algorithm to find the longest common subsequences between O and A, and O and B.

Then (quoted from the article) it:

... takes the regions where O differs from either A or B and coalesces the ones that overlap, leading to the alternating sequence of stable (all replicas equal) and unstable (one or both replicas changed) chunks shown in Figure 1(c).3 Finally, it examines what has changed in each chunk and decides what changes can be propagated, as shown in Figure 1(d)—here, the second chunk is changed only in A (by inserting 4, 5), so this change can be propagated to B, but the fourth chunk has changes in both A and B, so nothing can be propagated.

Upvotes: 6

Related Questions