NewUser

Reputation: 19

Why no merge conflict when merging file that was changed in git?

I am completely new to using git as a version control system for development and am still in the process of learning it. I am confused about merge conflicts during pull requests. As per my understanding a merge conflict generally occurs when the same line of a file is different in two different branches. I have a git repository with the following tree structure: I created a "dev" branch from "master", and then "D1" and "D2" branches from "dev": master->dev->(D1,D2). In the "dev" branch I have a file named "file1.txt" whose first line is 1111111111111, and in the "D1" branch the same file "file1.txt" has AAAAAAAAAAAAAAA as its first line. I created a pull request to merge the "D1" branch into "dev" and expected a merge conflict, since the first line of "file1.txt" differs between the two branches.

But git shows an "Able to merge" message, indicating no merge conflict, and the "dev" branch changes are overwritten by the "D1" branch changes. Any idea what I am missing?

Upvotes: 0

Views: 5622

Answers (2)

torek

Reputation: 489998

There is a lot to learn here. Some (well, at least one) have referred to Git's learning curve as a "learning wall". Adding web services like GitHub atop Git actually makes it harder, in my opinion: it's no longer possible to tell whether something is in Git or provided by GitHub. Pull requests fall into the GitHub-provided category, but they build on Git's merge verb (combined with the fact that GitHub stores both your repository and someone else's repository).

Let's put all that aside for a while, and start with Git itself. The key to understanding whether, and when, there will be a merge conflict in some file is hidden behind some graph theory.

What's in a commit

First, let's mention that commits hold files. More precisely, each commit holds a snapshot of all of your files, as of the time you made that commit. But each commit also has extra information about the commit, such as who made it, when, and why (a log message).

In Git, every commit is uniquely identified by its hash ID. This is a big, ugly, mostly unreadable, entirely-useless-to-humans string of characters such as 5d826e972970a784bd7a7bdf587512510097b8c7 (an actual commit in the Git repository for Git). These things are how Git finds the commits, though except for copy-paste or link purposes, I don't recommend that you use them much. :-)

Still, it's important to know that this is how Git does it, because almost every commit also lists the raw hash ID of at least one parent commit. The parent is the commit that comes before this commit. The one commit that is sure not to list a parent is the very first commit ever made, which of course has no previous commit.

What this all means is that starting from the last commit on any branch, Git can work backwards to previous commits. Git also keeps a table mapping human-readable branch names like master to raw commit hash IDs. To add a new commit to a branch:

  1. Git gets the current commit hash ID from the branch name.
  2. Git puts that into a new commit, as the new commit's parent. Git also puts in the new snapshot, of course, and your name and so on. This produces a new, unique, big ugly hash ID.
  3. Git then writes the new commit's hash ID into the branch name (master, in this example).

When something holds a commit hash ID, we say that thing points to the commit. In this way, a branch name always points to the last commit on the branch. The last commit points back to its predecessor, which points back another step, and so on, so Git can work backwards from there. This stepping-back, one commit at a time, is how Git holds the history of everything you—and everyone else—ever did: each commit records a snapshot in time, and each commit (except the first) has a "previously, things looked like ..." history link.
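The name-to-hash table and the parent link can be inspected directly. The following is only a minimal sketch: the temporary repository, the file name file.txt, and the user name and e-mail are placeholders invented for the illustration.

```shell
set -e
# Throwaway repository; every name below is a placeholder for this sketch.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called "master"
git config user.name "Example User"
git config user.email "user@example.com"

echo "first" > file.txt
git add file.txt
git commit -q -m "first commit"
first=$(git rev-parse master)       # the hash ID the name "master" holds right now

echo "second" >> file.txt
git commit -q -a -m "second commit"
second=$(git rev-parse master)      # the name now holds the new commit's hash ID

parent=$(git rev-parse "master^")   # the parent hash recorded inside the new commit
git cat-file -p master              # raw commit: tree, parent, author, committer, message
echo "parent of $second is $parent"
```

The parent printed at the end is the hash that master held before the second commit, which is the backwards link described above.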

Drawing the graph

When you make a new branch, Git makes a new table entry so that both branches point to the same commit. For drawing purposes, let me use single uppercase letters for each commit, instead of a big ugly hash ID. Then we might have this:

... <-F <-G <-H   <-- master, develop (HEAD)

Once a commit is made, nothing inside it can ever change. So H always points back to G, and so on. (The branch names can and do change all the time, usually to accommodate new commits.) So again, for drawing purposes in StackOverflow text, I'll leave out the internal arrows, and keep only the branch-name arrows:

...--F--G--H   <-- master, develop (HEAD)

The name HEAD is attached to one of the branch names. That's how Git knows which branch we're using, because when we make a new commit Git has to update this branch-name. Let's make one new commit now, and call its hash I:

...--F--G--H   <-- master
            \
             I   <-- develop (HEAD)

Now let's switch back to master and make a new commit there, which we can call J:

             J   <-- master (HEAD)
            /
...--F--G--H
            \
             I   <-- develop

Note that commits H and earlier are on both branches.
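To reproduce a graph of this shape locally, something like the sketch below should do. The file names and commit messages are made up for the example, and the commit messages H, I and J merely stand in for the real hash IDs.

```shell
set -e
# Throwaway repository; file names and identities are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called "master"
git config user.name "Example User"
git config user.email "user@example.com"

echo "shared" > shared.txt
git add shared.txt
git commit -q -m "H"
h=$(git rev-parse HEAD)

git checkout -q -b develop                # new name, still pointing at commit H
echo "develop work" > develop.txt
git add develop.txt
git commit -q -m "I"

git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "J"

git log --graph --oneline --all           # draws the fork at H
git branch --contains "$h"                # lists both develop and master
```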

Merge as a verb

If we now ask Git to merge develop (i.e., commit I) back into master (i.e., commit J), Git will find the best common ancestor commit. Loosely speaking, this is the first shared commit Git can find by working backwards from both branch tips. In this case, that's quite obviously commit H.

Having found the common ancestor commit—which Git calls the merge base—Git now has to figure out what we changed, and what they changed. That is, Git has to diff commit H, the merge base, against the two branch tips: --ours, which is commit J, and --theirs, which is commit I. This actually requires two separate git diff commands:

git diff --find-renames <hash-of-H> <hash-of-J>   # what we changed
git diff --find-renames <hash-of-H> <hash-of-I>   # what they changed

Git can now combine these two sets of changes. We touched some file(s), and they touched some file(s). When we and they touched the same files, we changed some line(s) from the merge base to our commit, and they changed some line(s) from the merge base to their commit.

A conflict occurs if Git decides that we touched the same lines, but made different changes to those lines. Same here is mostly obvious—if I changed line 17 and they changed line 17, we obviously changed the same lines. However, same can be a little odd: if we both added different text at the end of the file, for instance, Git doesn't know which order to put them in, so that is also a conflict.

But as long as we touched different lines, or different files, there is no conflict: Git applies both sets of changes to each merge base file, to get the merged result. Git can then make a new commit from the combined changes.

The combining process is called a three-way merge. See also Why is a 3-way merge advantageous over a 2-way merge?
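Here is a rough sketch of the non-conflicting case, assuming a five-line file (file.txt, a name invented for the example) where one side edits the first line and the other side edits the last line. git merge-base finds commit H, the two diffs show each side's changes, and the merge combines them.

```shell
set -e
# Throwaway repository; file names, contents and identities are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

printf 'one\ntwo\nthree\nfour\nfive\n' > file.txt
git add file.txt
git commit -q -m "H"
h=$(git rev-parse HEAD)

git checkout -q -b develop
printf 'ONE (develop)\ntwo\nthree\nfour\nfive\n' > file.txt   # they change line 1
git commit -q -a -m "I"

git checkout -q master
printf 'one\ntwo\nthree\nfour\nFIVE (master)\n' > file.txt    # we change line 5
git commit -q -a -m "J"

base=$(git merge-base master develop)          # the best common ancestor: H
git diff --find-renames "$base" master         # what we changed
git diff --find-renames "$base" develop        # what they changed

git merge -q -m "K" develop                    # different lines, so no conflict
cat file.txt                                   # contains both edits
```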

Merge as an adjective or noun

This new commit has a peculiar property; let's show it by drawing the result:

             J
            / \
...--F--G--H   K   <-- master (HEAD)
            \ /
             I   <-- develop

What's happened here is that new commit K remembers two previous commits: first, its own previous commit was J. That's K's first parent. But K has a second parent, too: it also remembers that it came from I. This allows Git to do a lot less work in a future merge, as it changes which commit will be a merge base later. (This is the most complicated bit of the graph theory, which we'll just skip over. 😀)

The process of combining changes and making a new commit is the verb form of the word merge, i.e., to merge. The commit made by this process uses the same word as an adjective: K is a merge commit. Git and Git's users often shorten this to say K is a merge, which uses the word merge as a noun.

A merge commit is simply any commit with at least two parents. The git merge command often makes such commits. (But not always! This is yet another part of Git's steep learning curve. For now, let's just ignore that, because GitHub tries, semi-successfully, to hide this from you.) In the end, then, when you have a merge (noun), it was made by the process of merging (verb). The process uses three inputs (the merge base and two branch tips) to produce the combined changes. The git merge command then makes the noun-form result.
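The two parents of a merge commit can be checked with git rev-list --parents. This is a small sketch on an invented repository, where the commit messages H, I, J and K stand in for real hash IDs.

```shell
set -e
# Throwaway repository; file names and identities are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "shared" > shared.txt
git add shared.txt
git commit -q -m "H"

git checkout -q -b develop
echo "develop work" > develop.txt
git add develop.txt
git commit -q -m "I"
i=$(git rev-parse develop)

git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "J"
j=$(git rev-parse master)

git merge -q -m "K" develop                 # makes merge commit K on master

git rev-list --parents -n 1 master          # K's hash followed by its two parents
first_parent=$(git rev-parse "master^1")    # J: where master was before the merge
second_parent=$(git rev-parse "master^2")   # I: the commit that was merged in
echo "first parent: $first_parent"
echo "second parent: $second_parent"
```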

It's very important to grok this merge process, which mostly just takes time and practice. The reason is that it's not just git merge that does the merge-as-a-verb action: many other Git commands do it as well; they just don't make merge commits when they are done. Every time you run git rebase, git cherry-pick, or git revert, you will be using Git's merge machinery. Even the seemingly simple (deceptive, I think) git stash uses Git's merge machinery. When Git gets it right on its own, which is most of the time, you don't have to think about it, but when it goes wrong, you need to know what to do.

(With Git, I find it's also helpful to set the configuration option merge.conflictStyle to diff3, so that when Git leaves a confused mess of a conflicted merge for you to fix up, it shows the input merge base along with the two branch tips. Other people like to use fancy window-based merge tools instead, so this is something of a matter of taste.)
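To see what the diff3 style adds, here is a minimal sketch that forces a conflict on purpose. The repository, file name and line contents are invented for the example, and the setting is applied to this one repository only (add --global to git config to apply it everywhere).

```shell
set -e
# Throwaway repository; file names, contents and identities are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git config merge.conflictStyle diff3        # show the merge base in conflict markers

echo "line from base" > file.txt
git add file.txt
git commit -q -m "H"

git checkout -q -b develop
echo "line from develop" > file.txt         # they change the line
git commit -q -a -m "I"

git checkout -q master
echo "line from master" > file.txt          # we change the same line differently
git commit -q -a -m "J"

# git merge exits non-zero on conflict, so capture the outcome instead of aborting.
if git merge -q -m "K" develop; then result=clean; else result=conflict; fi
echo "merge result: $result"
cat file.txt                                # conflicted file, including a ||||||| base section
```

With the default style, the section between the ||||||| and ======= markers (the merge base's version of the line) would be missing.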

On GitHub pull requests

Now that we have the above as background, let's look at how GitHub's "make a pull request" button really works. In order to tell you whether your branch can be merged with someone else's branch, what GitHub does is to actually do the merge, as a test merge.[1] This test merge that GitHub does is not on any branch at all, but it still goes through the same general idea. If there are no merge conflicts, the test succeeds and GitHub says "able to merge". If there are conflicts, GitHub throws away the test merge (it can't complete it any more than Git itself can) and tells you that there are conflicts.

If there are conflicts, it is, of course, up to you to figure out what to do about it. Now you'll need to understand the three-way merge process described above, which requires understanding the graph—or, if not really understanding all the graph theory, at least having a good idea of what a merge base is. You can find the base, compare it to the two tips, and see how the conflict came about. (Or, set merge.conflictStyle to diff3, and Git will leave the source of the conflict in the work-tree, where you can edit it directly.)


[1] I'm skipping over some important aspects of how Git transfers commits from one repository to another here. GitHub, as I mentioned, has both your and their repository on the same website, so they can and do cheat here: they don't have to transfer anything anywhere, because they have all of it. Similarly, they don't actually run git merge, because that requires a work-tree: the repositories on GitHub are all so-called bare repositories, with no work-trees. But all of these are finicky details that spoil the nice overview idea of "GitHub runs a test merge": they do actually run a test merge. They just have a lot of sideways tricky bits they have to use to achieve it.

It's also worth mentioning, in this footnote, that GitHub does the equivalent of git merge --no-ff here. There is no option to do the equivalent of git merge --ff-only or git merge without any fast-forward control knob.

For pull request N, the pull-requested commit itself is fetch-able under the reference refs/pull/N/head. If the test merge succeeds, it's under the reference refs/pull/N/merge.

Upvotes: 7

bananaspy

Reputation: 551

As per my understanding a merge conflict generally occurs when the same line of a file is different in two different branches

That's not quite right. Git detects a conflict when you change the same line of a file in two different branches and then try to merge them together.

In your example you have changed the first line of a file in branch D1 but didn't touch it in the dev branch, so the merging process looks like "take the first-line change from branch D1 and apply it to the first line in branch dev".

In contrast, if you (or somebody else) change the first line of the file in branch dev after creating D1, and then you change that line in D1, the merge process will look like: "there was a change to the first line in branch D1 and also in branch dev; which change should I consider to be the main one?" This is what a merge conflict looks like.
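Both cases can be tried locally. The sketch below rebuilds the layout from the question (master, dev, D1, D2) in a throwaway repository. The second line of file1.txt, the BBBBBBBBBB text, and the choice to put the competing edit on D2 are assumptions added for the illustration, and --no-ff is used so that the first merge makes a real merge commit rather than a fast-forward.

```shell
set -e
# Throwaway repository; identities and the extra file content are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# master -> dev -> (D1, D2), as in the question
printf '1111111111111\nsecond line\n' > file1.txt
git add file1.txt
git commit -q -m "add file1.txt"
git checkout -q -b dev
git branch D1
git branch D2

# Case 1: only D1 changes the first line; dev does not touch the file.
git checkout -q D1
printf 'AAAAAAAAAAAAAAA\nsecond line\n' > file1.txt
git commit -q -a -m "D1: change first line"
git checkout -q dev
if git merge -q --no-ff -m "merge D1" D1; then first=clean; else first=conflict; fi
merged_line=$(head -n 1 file1.txt)
echo "first line on dev after merging D1: $merged_line"

# Case 2: D2 changes the same line, and dev's copy has also changed since D2 forked.
git checkout -q D2
printf 'BBBBBBBBBB\nsecond line\n' > file1.txt
git commit -q -a -m "D2: change first line"
git checkout -q dev
if git merge -q --no-ff -m "merge D2" D2; then second=clean; else second=conflict; fi
echo "D1 into dev: $first, D2 into dev: $second"
```

In the first merge only one side changed the line relative to the merge base, so Git simply takes that change. In the second, both sides changed it, so Git stops with a conflict.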

Upvotes: 7
