Reputation: 21
I can use "graft" to copy the same changes I made in a particular commit from one branch to another. What I'd like to do is something similar, except I want to copy all the changes from a branch.
That is, I start with a branch I'll call branch A. I make a new branch off branch A called feat1, which is adding a new feature. I make several commits in feat1, then merge it back into branch A.
I want to merge all the changes made in the feat1 branch into another branch I'll call branch B. But I don't want to merge everything from branch A into branch B, just the changes made over the course of the feat1 branch. I think I can accomplish this using a series of grafts, one for each commit in my feat1 branch, but I think there should be a better way. Is there a standard way of accomplishing this?
I'm currently using Mercurial, although I'm planning on transitioning to git at some point, so I'd like to know how to do it in either system.
Upvotes: 1
Views: 989
Reputation: 487725
(Note: this is cross-posted to git and mercurial, so we need an answer that covers both. I apologize for how long this is, too, but there are a lot of concepts to understand. If this is TL;DR, skip down to the last section, Avoiding all the copying. But note that you will then have to read upwards to see what I am talking about.)
Both Git and Mercurial merge the same way.[1] In particular, branch names do not matter. What matters is not the names; what matters is the commit graph.
[1] There are many tiny differences and even a few relatively big ones, but what I mean here is that the overall idea is the same. Git also supports the concept of what Git calls a fast-forward merge, which is not a merge at all. This concept makes no sense in a Mercurial repository and simply cannot happen: Mercurial does not do fast-forwards. Running hg merge <commit-specifier> is roughly equivalent to running git merge --no-ff <commit-specifier>.
In both Git and Mercurial, each commit records, in some manner, the ID of its parent commit, or—in the case of a merge commit—its parents (exactly two for Mercurial, two or more for Git). Because each commit has a unique ID of its own, and each commit records its parent or parents, we can draw a graph—it starts out as a tree, which is simpler, and which I will illustrate here—of these commits:
A <--B <--C
Here we have a simple repository with only three commits in it. Commit C is the latest. It has a single parent B, so C remembers the ID of B. B, likewise, remembers the ID of A. (Note that A is unaware of B and B is unaware of C: the arrows go only backwards.)
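As a minimal sketch of the backwards-only links, the three-commit repository can be modeled as a mapping from each commit to its parents. The names parents and ancestors are made up for this illustration, and the single letters stand in for real hash IDs.

```python
# Toy model: each commit ID maps to the list of its parent IDs.
# The letters mirror the drawing; real commit IDs are hashes.
parents = {
    "A": [],      # root commit: no parent
    "B": ["A"],
    "C": ["B"],
}

def ancestors(commit):
    """Return the commit plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

# Starting from C we can reach everything; starting from A we reach only A,
# because no commit stores a link to its children.
print(sorted(ancestors("C")))
print(sorted(ancestors("A")))
```

Nothing in the mapping lets A find B or C, which is why both systems have to start from a later commit and walk backwards.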
Nothing about any commit can ever be changed (this is more true in Git, but pretty much true in Mercurial too). So we don't need to draw the arrows as arrows; we can just use connecting lines, as long as we remember that the VCS has to follow them in a backwards direction. This is handy because we might now decide to add a new commit D whose parent is B rather than C:
A--B--C
    \
     D
This is the point where Mercurial and Git tend to diverge: in Mercurial, we often achieve this result (on purpose) by putting commit D on a different branch. Mercurial records the actual name of the branch on which any commit was made, so from then on, Mercurial knows D is on branch dev or whatever. Git uses a completely different scheme: Git does not know or care where a commit was made; Git finds commits by having names remember a last commit, and working backwards the way both systems can, following the internal arrows connecting commits.
So, in Mercurial, we might say:
default:  A--B--C
              \
dev:           D
Each commit is on the branch that goes with the line we put the commit on. But in Git, we might switch to drawing them like this:
     C   <-- master
    /
A--B
    \
     D   <-- dev
That is, the name master identifies commit C, and the name dev identifies commit D. Commits A and B are now on both branches.
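The difference between the two schemes can be sketched with the same toy graph. This is only an illustration: git_branches, hg_branch_of, and git_branches_containing are hypothetical names, not anything either tool exposes.

```python
# Toy graph from the drawing: D and C both have B as their parent.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}

# Git-style: a branch name is just a pointer to one tip commit.
git_branches = {"master": "C", "dev": "D"}

# Mercurial-style: the branch name is recorded inside each commit.
hg_branch_of = {"A": "default", "B": "default", "C": "default", "D": "dev"}

def ancestors(commit):
    """Return the commit plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def git_branches_containing(commit):
    """A commit is 'on' every branch whose tip can reach it."""
    return sorted(name for name, tip in git_branches.items()
                  if commit in ancestors(tip))

print(git_branches_containing("B"))  # B is reachable from both tips
print(git_branches_containing("D"))  # D is reachable only from dev
print(hg_branch_of["B"])             # Mercurial: exactly one recorded name
```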
At this point, Mercurial users might declare that Git is just nuts, and many would agree with them. But Mercurial users are not off the hook here, because we can do this in Mercurial too: we can make commit D be on branch default, and yet still have B as its parent. That is:
          A--B--C
default:      \
               D
is valid in Mercurial too! And in modern Mercurial, we can set two bookmarks to remember commits C and D, and we have the same situation as in Git.
Hence, it's important in both systems to understand how the commit graph works. Because Mercurial's branch system is more rigid, you can often get away without this understanding—but eventually, you do need to know about it.
Now, let's look at your text description, and draw the graph, because it's the graph that matters.
I start with a branch I'll call branch A. I make a new branch off branch A called feat1, which is adding a new feature. I make several commits in feat1, then merge it back into branch A.
I want to merge all the changes made in the feat1 branch into another branch I'll call branch B. But I don't want to merge everything from branch A into branch B, just the changes made over the course of the feat1 branch. I think I can accomplish this using a series of grafts, one for each commit in my feat1 branch, but I think there should be a better way. Is there a standard way of accomplishing this?
There's a piece missing here because in Mercurial, you almost certainly really started with default, just as Git users start with master. Let's draw at least one commit on default, and then worry about branchA and feat1 and branchB.
default:  A
           \
branchA:    B--C--...--M
                \     /
feat1:           D---E
Commit M is your merge, made by running hg checkout branchA; hg merge feat1.
The way merge works is that it looks, not at the branch names, but rather at the commit graph. Before M exists, the graph reads:
...--C--...
      \
       D--E
I put in the ... here because there might be some commit or commits after C, such as F or F-G-H or whatever. Let's assume that there are. The effect on Mercurial is less important than it is on Git, because in Git, this forces Git to do a real merge rather than a fast-forward non-merge, but it's useful here to help illustrate how merge works. Let's use just one commit F:
...--C--F
      \
       D--E
To achieve a merge, both VCSes look at the graph at this point. They take the current commit (in this case, F) and the requested ("other" or --theirs) commit (in this case, E), and search backwards through the graph for the best common ancestor commit, also called the merge base. That commit is obvious from the drawing: it's commit C!
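The backwards search can be sketched on the toy graph as follows. This is a simplified model under stated assumptions, not how either tool is implemented: merge_bases is a hypothetical helper, and "best" is taken to mean a common ancestor that is not itself an ancestor of another common ancestor.

```python
# Graph just before the merge: F and D both follow C.
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "F": ["C"],              # current commit
    "D": ["C"], "E": ["D"],  # the commits on feat1
}

def ancestors(commit):
    """Return the commit plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def merge_bases(x, y):
    """Common ancestors of x and y that no other common ancestor descends from."""
    common = ancestors(x) & ancestors(y)
    return sorted(
        c for c in common
        if not any(c in ancestors(other) for other in common if other != c)
    )

# A, B, and C are all common ancestors of F and E, but only C is "best".
print(merge_bases("F", "E"))
```

The function returns a list because more complicated graphs (criss-cross merges) can have more than one best common ancestor; in this graph there is exactly one.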
So, at this point, both VCSes do the logical equivalent of:
- compare commit C vs commit F, to find out what we changed;
- compare commit C vs commit E, to find out what they changed;
- combine the two sets of changes, applying them to the contents of commit C;
- if all goes well, make the new merge commit M using F as the first parent, and E as the second parent:
...--C--F---M
      \    /
       D--E
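The steps above can be sketched as a whole-file three-way merge. This is a deliberate simplification: real merges also combine changes line by line within a file, whereas this hypothetical three_way_merge treats each file as a single value. The file names and contents are invented for the example.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots (dicts of path -> content) against a common base.

    Returns (merged snapshot, list of conflicting paths).
    A missing path means the file does not exist in that snapshot.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o      # both sides agree (or neither changed it)
        elif o == b:
            result = t      # only they changed it: take theirs
        elif t == b:
            result = o      # only we changed it: take ours
        else:
            conflicts.append(path)  # both changed it differently
            continue
        if result is not None:      # None means the file was deleted
            merged[path] = result
    return merged, conflicts

# Hypothetical contents for commits C (base), F (ours), and E (theirs).
commit_c = {"main.c": "v1", "util.c": "v1"}
commit_f = {"main.c": "v2", "util.c": "v1"}                      # we edited main.c
commit_e = {"main.c": "v1", "util.c": "v1", "feature.c": "new"}  # they added a file

merged, conflicts = three_way_merge(commit_c, commit_f, commit_e)
print(merged)
print(conflicts)
```

In this example the merged snapshot keeps our edit to main.c and their new feature.c, with no conflicts; that snapshot is what would be recorded as commit M.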
So, now that you have this, let's go on to the next part of your text:
I want to merge all the changes made in the feat1 branch into another branch I'll call branch B.
This new branch, branchB, does not spring up out of nowhere. You must create it. The method here is a little bit different in Git and Mercurial, but it ends up working out the same ... well, mostly the same.
In Git, you now pick one of the total set of commits that exist (A through F plus the merge M) and choose that to be the commit to which the name branchB will point. Then run git branch branchB <commit-specifier>, and the name branchB now points to that commit.
In Mercurial, you now check out some existing commit: hg update -r <rev> or similar. Pick one of the commits, perhaps A in which case we can just use hg update default. Then run hg branch branchB; hg commit to make a new commit, which creates branchB.
In both systems, a branch cannot exist if there are no commits on that branch; but in Git, any commit can be on many branches, so we can make branchB exist by pointing it to commit A. In Mercurial, a commit is only on one branch, so we need to make a new commit to cause branchB to come into existence. So the pictures differ a bit.
Here is the Git picture:
A   <-- master, branchB
 \
  B--C--F---M   <-- branchA
      \    /
       D--E   <-- feat1
and here is the one for Mercurial:
branchB:     N
            /
default:   A
            \
branchA:     B--C--F---M
                 \    /
feat1:            D--E
We can, at this point, make a gratuitous commit in Git, just to make the pictures match (except that since Git branch names move, we put them on the right with arrows pointing into the graph; Mercurial branch names are solidly fixed, so we can have them sit on the left and mark their commits forever, as long as our graph does not get very big anyway). Git will write the new commit N,[2] and move the current branch name to point to that new commit.
But I don't want to merge everything from branch A into branch B, just the changes made over the course of the feat1 branch. I think I can accomplish this using a series of grafts, one for each commit in my feat1 branch, but I think there should be a better way. Is there a standard way of accomplishing this?
In both Git and Mercurial, you must in fact copy the commits. The graph is the commits; the commits D and E that you made on feat1 are stuck where they are. It does not matter whether the branch names move (Git) or are solidly fixed (Mercurial), because the commits themselves are immutable.
In Mercurial, you are correct that hg graft copies commits. You can use a single command to copy all of the feat1 commits. Because commits are specific to one particular branch, it's easy to select all the feat1 commits: for example, with branchB checked out, hg graft -r "branch(feat1)".
In Git, the git cherry-pick command copies commits. You can use a single command to copy all of the commits you want here, too, but because commits D-E are on two branches (they're on both feat1 and branchA now), you need to use a more complex specifier. Git's convenient specifiers (maybe not that convenient) use graph operations: feat1~2..feat1 would specify both commits D and E, in this case, as would branchA^..feat1, so git cherry-pick feat1~2..feat1 copies both. Mercurial can do this too, but you don't need this as often in Mercurial.
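To see why both Git specifiers select D and E, here is a sketch of the two selection styles on the toy graph. The helpers first_parent and commit_range are hypothetical and only approximate what the ~, ^, and .. operators mean; they are not Git's implementation.

```python
# Graph after the merge M (first parent F, second parent E).
parents = {
    "A": [], "B": ["A"], "C": ["B"], "F": ["C"],
    "D": ["C"], "E": ["D"],
    "M": ["F", "E"],
}
git_branches = {"feat1": "E", "branchA": "M"}
hg_branch_of = {"A": "default", "B": "branchA", "C": "branchA",
                "F": "branchA", "M": "branchA", "D": "feat1", "E": "feat1"}

def ancestors(commit):
    """Return the commit plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def first_parent(commit, steps=1):
    """Model of name~steps (and name^ when steps is 1): follow first parents."""
    for _ in range(steps):
        commit = parents[commit][0]
    return commit

def commit_range(exclude, include):
    """Model of exclude..include: reachable from include but not from exclude."""
    return sorted(ancestors(include) - ancestors(exclude))

feat1, branch_a = git_branches["feat1"], git_branches["branchA"]
print(commit_range(first_parent(feat1, 2), feat1))   # feat1~2..feat1
print(commit_range(first_parent(branch_a), feat1))   # branchA^..feat1

# Mercurial-style selection: just filter on the recorded branch name.
print(sorted(c for c, name in hg_branch_of.items() if name == "feat1"))
```

All three selections come out as D and E. The sorted order here happens to match commit order only because of the letters chosen; the real tools apply the copies oldest-first based on the graph.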
[2] Note that we must git checkout branchB and perhaps use git commit --allow-empty. The --allow-empty flag tells Git that it should make the new commit even though the index (a Git-specific concept; Mercurial has no index) exactly matches the current commit. The git checkout step has the side effect of attaching the name HEAD to the branch name, so that Git knows which name is the current branch. (Mercurial records the current branch name in a hidden data structure called the dirstate that, unlike Git's index, you do not need to know about.)
Since your goal is to avoid using hg graft or git cherry-pick, let's consider how we could achieve this. Note that it's not always possible, and even when it is possible, it sometimes requires a lot of foresight and planning. But the key is this: Both Git and Mercurial use the merge base from the graph, to merge in the various features.
Let's take the final Mercurial graph from above, where the two feat1 commits will have to be grafted in order to have them affect commit N:
branchB:     N
            /
default:   A
            \
branchA:     B--C--F---M
                 \    /
feat1:            D--E
Now, suppose that, when we went to create the branch feat1, and then implement the feature, we knew in advance that we wanted to be able to just run hg update branchA; hg merge feat1; hg update branchB; hg merge feat1.
To make this work, we must have the first commit on feat1 come after an earlier commit in the graph. Specifically, it must come after some commit that is an ancestor of the tips of both branchA and branchB. Commit C is too far down the path towards branchA: it's not an ancestor of commit N on branchB.
The ideal commit is the one at the point where branchA and branchB first come back together as we work backwards from their tips. That is, it's the commit whose hash ID Git will spell out by running git merge-base branchA branchB: the most recent (in graph terms) commit that is an ancestor of the two tip commits.
Note that because we have not yet started on feat1
, what we must actually have at this point is just:
branchB:     N
            /
default:   A
            \
branchA:     B--C
Mercurial does not have a particularly convenient tool for finding the desired hash ID,[3] but because Mercurial commits are firmly stapled to their branches, it's usually tremendously obvious anyway. In this case, that's commit A, which is the last commit on default. So we can:
hg update default
hg branch feat1
and then write and commit our code:
branchB:     N
            /
default:   A
           |\
feat1:     | D--E
            \
branchA:     B--C
Meanwhile commit F went into branchA as before, so we hg update branchA and find that we now have:
branchB:     N
            /
default:   A
           |\
feat1:     | D--E
            \
branchA:     B--C--F
We run hg merge feat1 and get:
branchB:     N
            /
default:   A
           |\
feat1:     | D------E
            \        \
branchA:     B--C--F--M
where the first parent of M is F and the second parent is E, just as before. But now we can hg update branchB; hg merge feat1. The merge base of N and E is commit A; Mercurial compares A to N to see what we did, and A to E to see what they did. Mercurial builds the new merge commit O and commits it:
branchB:     N----------O
            /          /
default:   A          /
           |\        /
feat1:     | D------E
            \        \
branchA:     B--C--F--M
Aside from the fact that one uses different commands, and the branch labels themselves move about, the process is identical in Git.
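To check why the planning matters, here is a sketch comparing the two graph shapes. It approximates "what a merge brings in" as the commits reachable from the other tip but not from the current one. That is a simplification (a real merge combines diffs against the merge base rather than replaying commits), and brought_in_by_merge is a hypothetical helper.

```python
def ancestors(graph, commit):
    """Return the commit plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen

def brought_in_by_merge(graph, current, other):
    """Commits whose changes a merge of `other` into `current` would pull in."""
    return sorted(ancestors(graph, other) - ancestors(graph, current))

# Original shape: feat1 (D, E) forks from C, which is on branchA.
original = {"A": [], "N": ["A"], "B": ["A"], "C": ["B"], "F": ["C"],
            "D": ["C"], "E": ["D"], "M": ["F", "E"]}

# Planned shape: feat1 (D, E) forks from A, the merge base of branchA and branchB.
planned = {"A": [], "N": ["A"], "B": ["A"], "C": ["B"], "F": ["C"],
           "D": ["A"], "E": ["D"], "M": ["F", "E"]}

# Merging feat1 (tip E) into branchB (tip N):
print(brought_in_by_merge(original, "N", "E"))  # drags in B and C as well
print(brought_in_by_merge(planned, "N", "E"))   # only the feature commits

# Merging feat1 into branchA (tip F, before M exists) still works as planned:
print(brought_in_by_merge(planned, "F", "E"))
```

With the original shape, merging feat1 into branchB would also carry the branchA work from B and C, which is exactly what the question wants to avoid. With the planned shape, both merges carry only D and E.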
[3] You can find the commit using -r options and graph-oriented revision specifiers (revsets). You want the last commit, numerically speaking, that satisfies ancestors(tip-of-branchA) & ancestors(tip-of-branchB). Update to that commit, and then create the feat1 commits.
In Git, to find the hash, simply run git merge-base branchA branchB. Having found the hash, point a new branch name at that hash ID, check out that branch, and begin committing.
Upvotes: 6