How do I get a git conflict between a fork and the master branch

Question

I have forked a project, in which I added and modified files.

After a while, I realised that I should update a few configs from the master branch. I tried to git-pull the master and other things, and I ended up losing a few files (somehow).

Now I have deleted the whole directory (probably not very smart) and I'm trying to load my fork from the last commit as follows:

> git clone [URL of the project]
> git checkout [code of the last commit in my fork]
> git pull origin master

and all it says is:

* branch            master     -> FETCH_HEAD
Already up-to-date.

This does nothing and doesn't apply any changes. I would expect conflicts in some files, so how do I get to the point where the two versions (my fork and the master) show conflicts (typically with something like >>>>>>>> HEAD ...)?

torek · Accepted Answer

There's a sort of basic problem here: it's not branches that result in conflicts. Or, perhaps slightly better: branches are necessary, but not sufficient. (But this also depends on how we define branch.)

To have a conflict in Git, you must:

perform a three-way merge, using git merge¹
using two input commits (which will come from branches, but it's not the branches that matter here)
whose commit graph results in Git finding a third merge base commit that is different from the other two inputs, and, last
where both of those commits change the same lines of the same file(s).²
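The four conditions above can be reproduced in a scratch repository. This is only a minimal sketch: the file name config.txt, its contents, and the "Example" identity are placeholders invented for illustration, and the repository lives in a temporary directory.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d) && cd "$repo"      # throwaway directory
git init -q

# Merge base: the one commit both branches share.
echo "color = red" > config.txt
git add config.txt
git commit -q -m "base"
git branch branch2                   # branch2 starts at the merge base
git checkout -q -b branch1           # so does branch1

# Each branch changes the same line in a different way.
echo "color = green" > config.txt
git commit -q -am "branch1: green"
git checkout -q branch2
echo "color = blue" > config.txt
git commit -q -am "branch2: blue"

# Three-way merge of the two tips; git merge exits non-zero on a conflict.
git checkout -q branch1
merge_status=0
git merge branch2 >/dev/null 2>&1 || merge_status=$?
cat config.txt                       # now contains the conflict markers
```

If either branch had left that line alone, the same merge would have completed without stopping.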

In all of this, the term fork is entirely irrelevant. A fork is merely a clone with some special semantics provided by whatever hosting provider has given you a clicky-web-interface "fork" button, or web-accessible "fork" API. If you forked a repository, you cloned it. If you cloned your clone, you've cloned it again. The important thing is which commits you have, by their hash IDs—which are their true names—even though you find those commits by branch names.

Every commit remembers the hash ID of its immediate parent,³ which results in a backwards-pointing chain:

... <-F <-G <-H

where H stands in for some commit hash ID. Commit H remembers the hash ID of its predecessor commit. Rather than use a big ugly hash ID we just use another single letter to stand in for it, so we call that G. So commit H points to (holds the hash ID of) commit G. Commit G in turn points to F, which continues to point backwards.
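To see the backwards-pointing chain for yourself, something like the following should work. It is a sketch using three empty commits whose messages F, G and H stand in for the letters above; the identity and the temporary directory are placeholders.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d) && cd "$repo"      # throwaway directory
git init -q

# Three empty commits standing in for F, G and H.
git commit -q --allow-empty -m "F"
git commit -q --allow-empty -m "G"
g_hash=$(git rev-parse HEAD)         # remember G's hash ID
git commit -q --allow-empty -m "H"

# The raw commit object for H has a "parent" line that names G.
h_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
echo "G is:        $g_hash"
echo "H's parent:  $h_parent"

# Walk the chain backwards from H; F, the first commit, has no parent.
git --no-pager log --format='%s: commit %h, parent %p'
```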

Having made a clone, or just a bunch of commits in your own repository—all that matters are commits—you now have something more like this:

          I--J   <-- branch1
         /
...--G--H
         \
          K--L   <-- branch2

The branch names, branch1 and branch2 here, hold the hash ID of the last commit in the branch. Git updated them automatically, e.g., when you used git checkout on the branch and made new commits, so that as you made the new commits, the branch grew and the name continued to point to the last commit. So now they point to commits J and L respectively.
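A small sketch of a branch name moving as commits are added, again in a scratch repository with placeholder identity and empty commits labelled H and I:

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d) && cd "$repo"      # throwaway directory
git init -q

git commit -q --allow-empty -m "H"
git checkout -q -b branch1
before=$(git rev-parse branch1)      # branch1 names commit H

git commit -q --allow-empty -m "I"
after=$(git rev-parse branch1)       # branch1 now names commit I
head=$(git rev-parse HEAD)

echo "branch1 before: $before"
echo "branch1 after:  $after"
```

The name moved without any explicit command to move it; making the commit while on branch1 was enough.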

If you now run git checkout branch1 and git merge branch2, Git will treat commits J and L as the two inputs that you provide, and will, on its own, find the best shared commit (the merge base), which in this case is obviously commit H, the most recent commit that is on both branches. Git will then:

compare all the files in H to all the files in J—git diff --find-renames H J—to see what you changed, and
compare all the files in H to all the files in L—git diff --find-renames H L—to see what they changed.

The merge operation tries to combine those two sets of changes, applying the combined changes to the files in H, so as to make a new set of files that can be used for the snapshot in the new merge commit.
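You can run the same two comparisons by hand before merging. The sketch below assumes a scratch repository with two placeholder files, a.txt and b.txt, and uses git merge-base to find the commit that Git would pick as H.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d) && cd "$repo"      # throwaway directory
git init -q

printf 'one\n' > a.txt
printf 'two\n' > b.txt
git add a.txt b.txt
git commit -q -m "H"
h_hash=$(git rev-parse HEAD)
git branch branch2
git checkout -q -b branch1
printf 'one, edited on branch1\n' > a.txt
git commit -q -am "J"
git checkout -q branch2
printf 'two, edited on branch2\n' > b.txt
git commit -q -am "L"

# The merge base, and what each side changed relative to it.
base=$(git merge-base branch1 branch2)
ours=$(git diff --find-renames --name-only "$base" branch1)
theirs=$(git diff --find-renames --name-only "$base" branch2)
echo "merge base:         $base"
echo "changed on branch1: $ours"
echo "changed on branch2: $theirs"
```

Here the two sides touch different files, so combining the two sets of changes cannot conflict.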

If this combining succeeds, Git will make new merge commit M on its own:

          I--J
         /    \
...--G--H      M   <-- branch1 (HEAD)
         \    /
          K--L   <-- branch2

HEAD here indicates that this is the branch we're on, and since we're on branch1, that's the name that Git updates to hold the hash ID of the new commit M. Commit M has a snapshot made by applying the combined changes to the snapshot in H.
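A sketch of the successful case, with placeholder files a.txt and b.txt in a scratch repository: the two branches change different files, so Git makes M on its own, and M records both J and L as parents.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d) && cd "$repo"      # throwaway directory
git init -q

echo "one" > a.txt
echo "two" > b.txt
git add a.txt b.txt
git commit -q -m "H"
git branch branch2
git checkout -q -b branch1
echo "one (branch1)" > a.txt
git commit -q -am "J"
j_hash=$(git rev-parse HEAD)
git checkout -q branch2
echo "two (branch2)" > b.txt
git commit -q -am "L"
l_hash=$(git rev-parse HEAD)

# No overlapping changes, so Git commits M without stopping.
git checkout -q branch1
git merge -q --no-edit branch2
first_parent=$(git rev-parse HEAD^1)     # J, the commit we were on
second_parent=$(git rev-parse HEAD^2)    # L, the commit we merged
git --no-pager log --graph --oneline --all
```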

If there are merge conflicts, Git stops without making M at all. Git leaves a mess for you to clean up.⁴ You must do your own combining and then tell Git (with git add) that you've successfully fixed all the conflicts and produced the correct merge result. You then use git merge --continue or git commit (either one) to finish the merge and make commit M from the snapshot that you built.
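Cleaning up after a stopped merge might look like the following sketch. The file config.txt and the resolved value "teal" are invented for illustration; in a real merge you would edit the file to whatever the correct combined result is.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d) && cd "$repo"      # throwaway directory
git init -q

# Two branches that change the same line differently.
echo "color = red" > config.txt
git add config.txt
git commit -q -m "base"
git branch branch2
git checkout -q -b branch1
echo "color = green" > config.txt
git commit -q -am "branch1: green"
git checkout -q branch2
echo "color = blue" > config.txt
git commit -q -am "branch2: blue"
git checkout -q branch1

git merge branch2 >/dev/null 2>&1 || true   # stops with a conflict
git status --short                   # "UU config.txt": unmerged on both sides

# Do the combining by hand, then tell Git the file is resolved.
echo "color = teal" > config.txt
git add config.txt
git commit -q --no-edit              # finishes the merge, creating M
```

Using git merge --continue instead of the final git commit should give the same result, though it may open an editor for the commit message.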

¹Git also uses three-way merge to implement git cherry-pick and git revert, as well as a few other special cases, so you can get merge conflicts without using git merge. But git merge is the usual source and is what you're describing, so you can think of this as "when you run git merge" without too much loss of accuracy.

Note that git pull means run git fetch, then run a second Git command, typically git merge. So git pull is just fetch + merge.
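The two steps can be run separately, which makes it easier to see what each one does. This sketch assumes two scratch repositories, named upstream and clone here purely for illustration; because the clone has no commits of its own, the merge step is a simple fast-forward.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
work=$(mktemp -d)                    # throwaway directory

git init -q "$work/upstream"
cd "$work/upstream"
git commit -q --allow-empty -m "first"
branch=$(git symbolic-ref --short HEAD)   # default branch name varies

git clone -q "$work/upstream" "$work/clone"
git commit -q --allow-empty -m "second"   # upstream moves ahead
upstream_tip=$(git rev-parse HEAD)

cd "$work/clone"
git fetch -q origin "$branch"        # step 1: obtain the new commit
git merge -q FETCH_HEAD              # step 2: merge what was fetched
clone_tip=$(git rev-parse HEAD)
echo "upstream: $upstream_tip"
echo "clone:    $clone_tip"
```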

²There are some sticky details on what "the same" means, both for files—which can be renamed—and for lines, which might merely abut rather than overlap. But again, you can just think of it as "different change to same lines of same file" = "conflict" and be close enough.

³For merge commits, the commit remembers all of its parents—usually just two—and hence points back not only to the commit on the branch you were on when you made the merge, but also to the commit that was the tip of the branch that you merged, at the time you made the merge.

⁴The messy merge state occupies the index, which we haven't described here at all. Git also leaves you its best attempt at combining the conflicting changes in files in the work-tree, and leaves information about the merge recorded for git merge --continue, which runs git commit, or for git commit, to pick up to make the merge commit.

You can back out of everything using git merge --abort, which puts everything back to the way it was before you started the git merge.
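A sketch of backing out, using the same kind of scratch repository with a placeholder config.txt: after git merge --abort, the branch, the index, and the work-tree file are all as they were before the merge started.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d) && cd "$repo"      # throwaway directory
git init -q

echo "color = red" > config.txt
git add config.txt
git commit -q -m "base"
git branch branch2
git checkout -q -b branch1
echo "color = green" > config.txt
git commit -q -am "branch1: green"
git checkout -q branch2
echo "color = blue" > config.txt
git commit -q -am "branch2: blue"
git checkout -q branch1

before=$(git rev-parse HEAD)
git merge branch2 >/dev/null 2>&1 || true   # stops with a conflict
git merge --abort                    # discard the half-finished merge
after=$(git rev-parse HEAD)
cat config.txt                       # back to branch1's version
```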
