I have a branch A with a lot of changes (including a lot of refactoring), so I decide to create a separate branch just for the refactoring. I branch out of A into a new branch B. I now have the same changes on A and B (compared to master).
I delete all of the new functionality on B since I want to only commit the refactoring. I commit my changes onto B and open a pull request. I then checkout A and pull B. Now all of the commits on B are applied to A, essentially deleting the new feature, only leaving the refactoring.
Why did this happen? I expected to have some merge conflicts and preserve the changes on both branches. Instead, branch B completely overwrote A.
Lasse V. Karlsen has already provided an answer in the form of a comment, but here is another way to look at it. You asked:
Why did pulling a child branch overwrite changes on parent branch?
and then described the setup (providing that description is a good thing) using phrases like "branching from a branch". The problem is that Git does not provide parent/child relationships between branches. By starting from a false assumption, you end up fighting with Git.
The key to understanding Git is to realize that branches are unimportant—at least, if by "branch" you mean "branch name". A branch name serves just one purpose: it finds some specific commit. What matters is not the branch name, but rather the commit itself. Unfortunately, the word branch is ambiguous: when we use the word to mean "some collection of commits", branches do matter, because it's the commits that matter. So branches don't matter, except when they do: not a good situation, but that's how things are.
Where we do find parent/child relationships is in the commits themselves. It's only there, though. The names we use to find these commits move about over time. I find this is best understood through drawings. Let's make some drawings:
"I have a branch A with a lot of changes (including a lot of refactoring), so I decide to create a separate branch just for the refactoring. I branch out of A into a new branch B. I now have the same changes on A and B (compared to master)."
I like to use single uppercase letters to draw the commits, so I'll rename these branches, calling them branch-A and branch-B. First, though, we need to draw some of the original commits on master.
To draw a commit, we write down its big ugly hash ID, and then draw one or more arrows coming out of the commit, pointing to the commit's parents. The hash IDs are very large (40 characters), unwieldy, and too difficult to use, so I replace them with symbols or letters; here I'll use letters.
Most commits have just one parent—the usual exception is a merge commit with two parents—so we'll start with this string of commits:
... <-F <-G <-H
Here, H stands in for the last commit in the chain. It points, via its saved parent hash ID, to earlier commit G, which in turn points to yet-earlier commit F, and so on.
To find commit H by its hash ID, Git stores the hash ID in a branch name, such as master. Getting a little lazy on purpose (because drawing arrows in text characters is hard), we add the name like this:
...--F--G--H <-- master
Since we're on branch master initially, we add HEAD in parentheses, "attached to" the branch, to indicate that:
...--F--G--H <-- master (HEAD)
The name master now points to H, which points back to G, and so on.
Now we create a new branch name. This name has to point to some existing commit. The commit we pick will be the last commit in this new branch. We'll pick commit H, because we'd like to start with all the same commits we have so far:
...--F--G--H <-- branch-A, master (HEAD)
This indicates that commits up through and including H are on both branches, that we're currently working with commit H, and that we're currently using the name master to find that commit. As soon as we run git checkout branch-A or git switch branch-A, we get instead:
...--F--G--H <-- branch-A (HEAD), master
Nothing else has changed, but HEAD is now attached to branch-A. We're still using commit H and commit H is still the last commit on both branches.
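A minimal sketch of this step, assuming a throwaway repository in a temporary directory (the demo identity and the empty commit standing in for H are made up for illustration). It only shows that creating a name and attaching HEAD to it leaves the commit untouched:

```shell
set -eu
repo=$(mktemp -d)                        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master  # name the initial branch master on any Git version
git config user.name "Demo User"         # hypothetical identity, local to this repository
git config user.email "demo@example.com"
git commit -q --allow-empty -m "H"       # stands in for commit H

git branch branch-A                      # new name pointing at the current commit
before=$(git rev-parse HEAD)
git checkout -q branch-A                 # attach HEAD to branch-A
after=$(git rev-parse HEAD)

git symbolic-ref --short HEAD            # prints: branch-A
git rev-parse master branch-A            # prints the same hash ID twice
```

Both names resolve to the same hash ID; only the name that HEAD is attached to differs.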
Now we make some new commits. For simplicity I'll just draw two:
...--G--H <-- master
         \
          I--J <-- branch-A (HEAD)
When we make a new commit, Git sets the new commit's parent to the current commit (I points back to H, for instance) and then writes the new commit's hash ID into the current branch name. So when we made I, the name branch-A automatically advanced to point to I. Then we made J and the name advanced again, giving us this result-so-far. Note that commits up through and including H are on both branches!
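To check this on a scratch repository, something like the following should work. It is only a sketch: the temporary directory, the demo identity, and the empty commits standing in for H, I and J are assumptions made for the example.

```shell
set -eu
repo=$(mktemp -d)                        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master  # name the initial branch master on any Git version
git config user.name "Demo User"         # hypothetical identity, local to this repository
git config user.email "demo@example.com"

git commit -q --allow-empty -m "H"
h=$(git rev-parse HEAD)
git checkout -q -b branch-A              # new name at H, with HEAD attached to it
git commit -q --allow-empty -m "I"
i=$(git rev-parse HEAD)
git commit -q --allow-empty -m "J"

git rev-parse master                     # still the hash ID of H: master did not move
git rev-parse branch-A~1                 # the hash ID of I, the parent of J
git rev-parse branch-A~2                 # the hash ID of H, the parent of I
git log --oneline --graph --all          # Git's own drawing of the same picture
```

Only the name that HEAD is attached to advances; master keeps pointing to H.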
We now make another branch name, branch-B, also pointing to the current commit, and switch to that branch name:
...--G--H <-- master
         \
          I--J <-- branch-A, branch-B (HEAD)
"I delete all of the new functionality on B since I want to only commit the refactoring."
Here, you make a new commit (let's call it K) that deletes new code:
...--G--H <-- master
         \
          I--J <-- branch-A
              \
               K <-- branch-B (HEAD)
Now, as far as Git is concerned, the only point of a branch name is to find some particular commit. (The commit itself then finds all previous commits.) So the name branch-A is pretty much irrelevant. We can redraw this drawing without that name, to get:
...--G--H <-- master
         \
          I--J--K <-- branch-B (HEAD)
Commit K takes out the new functionality, leaving only the refactoring. Since commits I-J-K are the ones "on" branch-B that aren't on master, the merge procedure to bring those into master gets you to the final state as represented in commit K. This can be a real merge (git merge --no-ff) or a fast-forward, not-actually-a-merge (git merge --ff-only).
If we use the latter and put the name branch-A back into the picture, we get:
...--G--H--I--J <-- branch-A
               \
                K <-- branch-B, master (HEAD)
If we use the former (a true merge) and again put the name branch-A back in the picture, we get:
...--G--H----------M <-- master (HEAD)
         \        /
          \      K <-- branch-B
           \    /
            I--J <-- branch-A
(I skipped the letter L just so I could use M for "m"erge-commit here).
Note that in both cases, we end up with branch-A (commits up through J) already included in the merge result (those commits are now "on" master).
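The true-merge case can be reproduced on a scratch repository. This is a rough sketch under assumptions: the file names base.txt, refactor.txt and feature.txt, the demo identity, and the temporary directory are all invented for the example.

```shell
set -eu
repo=$(mktemp -d)                        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master  # name the initial branch master on any Git version
git config user.name "Demo User"         # hypothetical identity, local to this repository
git config user.email "demo@example.com"

echo base > base.txt;         git add base.txt;     git commit -q -m "H"
git checkout -q -b branch-A
echo refactor > refactor.txt; git add refactor.txt; git commit -q -m "I"
echo feature > feature.txt;   git add feature.txt;  git commit -q -m "J"
git checkout -q -b branch-B              # branch-B starts at J, like branch-A
git rm -q feature.txt;                              git commit -q -m "K"

git checkout -q master
git merge -q --no-ff -m "M" branch-B     # true merge: M has parents H and K
m=$(git rev-parse HEAD)

if git merge-base --is-ancestor branch-A master; then
    echo "branch-A is already contained in master"
fi
git merge branch-A                       # nothing left to merge, master stays at M
ls                                       # feature.txt is absent, refactor.txt is present
```

Because J is reachable from M through K, Git considers branch-A merged, and the deletion recorded in K is what master ends up with.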
Had you created branch-B starting from commit H, you would first have:
...--G--H <-- master, branch-B (HEAD)
         \
          I--J <-- branch-A
You can then make commit K to produce:
          K <-- branch-B (HEAD)
         /
...--G--H <-- master
         \
          I--J <-- branch-A
If appropriate, you can create more commits:
          K--L <-- branch-B (HEAD)
         /
...--G--H <-- master
         \
          I--J <-- branch-A
Commits up through H are now on all three branches, but commits I-J are only on branch-A. This situation lasts unless and until we move the branch names around. The names can be adjusted whenever, and however, we want. The commits are frozen for all time: we can redraw them to put them in more convenient places for drawing arrows pointing to them, but the connection from K going backwards to H is fixed forever.
If we now check out master and merge branch-B, using a forced-real-merge so that I don't have to draw the fast-forward case, we get:
          K--L <-- branch-B
         /    \
...--G--H------M <-- master (HEAD)
         \
          I--J <-- branch-A
Since commit L can be found using commit M (which points backwards to both H and L), we can delete the name branch-B safely now:
          K--L
         /    \
...--G--H------M <-- master (HEAD)
         \
          I--J <-- branch-A
and we see that commits I-J are still not in the set of commits found by starting at M and working backwards. So they can still be merged. This merge cannot be a fast-forward, so it requires a new merge commit, which I'll call N:
          K--L
         /    \
...--G--H------M--N <-- master (HEAD)
         \       /
          I-----J <-- branch-A
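Here is a sketch of that alternative, with branch-B created from H instead of J. The file names, the demo identity, and the single commit K on branch-B (rather than K and L) are assumptions made to keep the example short.

```shell
set -eu
repo=$(mktemp -d)                        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master  # name the initial branch master on any Git version
git config user.name "Demo User"         # hypothetical identity, local to this repository
git config user.email "demo@example.com"

echo base > base.txt;         git add base.txt;     git commit -q -m "H"
git checkout -q -b branch-A
echo refactor > refactor.txt; git add refactor.txt; git commit -q -m "I"
echo feature > feature.txt;   git add feature.txt;  git commit -q -m "J"

git checkout -q master
git checkout -q -b branch-B              # this time branch-B starts at H
echo refactor > refactor.txt; git add refactor.txt; git commit -q -m "K"

git checkout -q master
git merge -q --no-ff -m "M" branch-B     # refactoring only
unmerged=$(git rev-list --count master..branch-A)
echo "commits on branch-A not yet in master: $unmerged"

git merge -q -m "N" branch-A             # cannot fast-forward, so this makes merge commit N
ls                                       # base.txt, feature.txt and refactor.txt
```

Since I and J are not reachable from M, merging branch-A afterwards brings the feature in rather than finding nothing to do.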
Let's assume you did a true merge, but have kept all your names around, and hence now have this:
...--G--H----------M <-- master (HEAD)
         \        /
          \      K <-- branch-B
           \    /
            I--J <-- branch-A
The problem here is that commits I-J are in fact merged, because M reaches back to K which reaches back to J. The code in those commits is gone because the J-to-K difference includes deleting that code. But if we make a new commit, or series of commits, that are copies of I and J as applied to M, we get something we can merge easily.
The command that copies commits one at a time is git cherry-pick. We can do the job this way. We first make a new branch name, e.g., fix, that points to commit M, and switch to it:
...--G--H----------M <-- fix (HEAD), master
         \        /
          \      K <-- branch-B
           \    /
            I--J <-- branch-A
Then we get the hash IDs of I and J, or use the relative syntax trick, to cherry-pick each commit one at a time. Since there are only two commits, we can run:
git cherry-pick branch-A~1
git cherry-pick branch-A
as our two cherry-pick commands. These may have merge conflicts. If so, you just need to fix them as you go. The result will be new commits that refer to each other, and to commit M, as their parents, and have as their snapshots the conflict-fixed snapshots you provide:
                     I'-J' <-- fix (HEAD)
                    /
...--G--H----------M <-- master
         \        /
          \      K <-- branch-B
           \    /
            I--J <-- branch-A
Here, I' is the copy Git made of commit I, and J' is the copy of J.
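The whole recovery can be tried end to end on a scratch repository. The sketch below assumes invented file names (feature_a.txt, feature_b.txt, refactor.txt) and a demo identity, and it is arranged so that both cherry-picks apply cleanly; real histories may stop for conflicts.

```shell
set -eu
repo=$(mktemp -d)                        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master  # name the initial branch master on any Git version
git config user.name "Demo User"         # hypothetical identity, local to this repository
git config user.email "demo@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "H"
git checkout -q -b branch-A
echo refactor > refactor.txt; echo a > feature_a.txt
git add refactor.txt feature_a.txt;     git commit -q -m "I"
echo b > feature_b.txt; git add feature_b.txt; git commit -q -m "J"
git checkout -q -b branch-B
git rm -q feature_a.txt feature_b.txt;  git commit -q -m "K"
git checkout -q master
git merge -q --no-ff -m "M" branch-B     # the feature is now gone from master

git checkout -q -b fix                   # new name pointing to M
git cherry-pick branch-A~1               # copy I, producing I'
git cherry-pick branch-A                 # copy J, producing J'

git log --oneline --graph --all
git diff --quiet fix branch-A && echo "fix has the same files as branch-A"
```

In this sketch the snapshot in J' matches the one in J, so fix carries both the refactoring and the feature on top of M.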
If there are many commits to copy, it's handy to be able to cherry-pick all of them in sequence. To do that, we need to give cherry-pick the right list of commits. That's a bit tricky, but the list ends at the commit identified by the name branch-A. We can use Git's two-dot syntax to construct an expression by which Git will find all the commits in this list, with:
git cherry-pick master~1..branch-A
The expression master~1 (which can also be written master~) means go back to the first parent of commit M, because master means commit M and the ~ suffix means step back one time, using the first parent for each step. This first parent notion is only meaningful for merge commits like commit M: other commits have only one parent (so that any parent is the first parent). Merge commits always have the commit that was the branch tip before the merge as their first parent, so that's why master~1 works here.
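To see which commits these expressions select before cherry-picking anything, git rev-parse and git rev-list can be run on a scratch repository. As a sketch, with empty commits standing in for H through K and a made-up demo identity:

```shell
set -eu
repo=$(mktemp -d)                        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master  # name the initial branch master on any Git version
git config user.name "Demo User"         # hypothetical identity, local to this repository
git config user.email "demo@example.com"

git commit -q --allow-empty -m "H"; h=$(git rev-parse HEAD)
git checkout -q -b branch-A
git commit -q --allow-empty -m "I"; i=$(git rev-parse HEAD)
git commit -q --allow-empty -m "J"; j=$(git rev-parse HEAD)
git checkout -q -b branch-B
git commit -q --allow-empty -m "K"; k=$(git rev-parse HEAD)
git checkout -q master
git merge -q --no-ff -m "M" branch-B     # M: first parent H, second parent K

git rev-parse master~1                   # first parent of M, i.e. H
git rev-parse master^2                   # second parent of M, i.e. K
picked=$(git rev-list --reverse master~1..branch-A)
echo "$picked"                           # I then J: the commits the range selects
```

Listing the range with git rev-list first is a cheap way to confirm that it names exactly the commits you mean to copy.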
There's a different problem—and a different graph—if Git did a fast-forward merge. Then, instead of:
...--G--H----------M <-- master (HEAD)
         \        /
          \      K <-- branch-B
           \    /
            I--J <-- branch-A
we have a graph best drawn like this:
...--G--H--I--J <-- branch-A
               \
                K <-- branch-B, master (HEAD)
Now there's no easy way to find the hash of commit H, other than to run git log and look for it. So in this situation, the all-in-one cherry-pick command you'd need, whether or not you make a fix branch (I would advise making one), would be:
git cherry-pick <hash-of-H>..branch-A
The end result of this, assuming you create a new name fix and check it out, is:
...--G--H--I--J <-- branch-A
               \
                K <-- branch-B, master
                 \
                  I'-J' <-- fix (HEAD)
which allows the name master to be fast-forwarded to commit J', once you're sure that you have all your fixes in J' as compared to K. If you like fast-forwards, do that; if you prefer true merges, do one of those; either way, you now have the updates you wanted, brought in via new commits that are, in effect, copies of the earlier commits.
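The fast-forward variant might look like this on a scratch repository. It is a sketch only: the file names and demo identity are invented, the hash of H is saved in a shell variable when the commit is made (in a real repository you would look it up with git log), and the commits are arranged so the cherry-picks apply without conflicts.

```shell
set -eu
repo=$(mktemp -d)                        # throwaway repository; the path is arbitrary
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master  # name the initial branch master on any Git version
git config user.name "Demo User"         # hypothetical identity, local to this repository
git config user.email "demo@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "H"
h=$(git rev-parse HEAD)                  # remember the hash ID of H
git checkout -q -b branch-A
echo refactor > refactor.txt; echo a > feature_a.txt
git add refactor.txt feature_a.txt;     git commit -q -m "I"
echo b > feature_b.txt; git add feature_b.txt; git commit -q -m "J"
git checkout -q -b branch-B
git rm -q feature_a.txt feature_b.txt;  git commit -q -m "K"
git checkout -q master
git merge -q --ff-only branch-B          # fast-forward: master now names K

git checkout -q -b fix                   # new name pointing to K
git cherry-pick "$h"..branch-A           # copies I and J as I' and J'

git checkout -q master
git merge -q --ff-only fix               # fast-forward master to J'
ls                                       # the feature files are back
```

After the last step master and fix name the same commit, and its files match those at the tip of branch-A.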