Lucas Lima

Reputation: 902

Understanding git conflicts

I think I might be misunderstanding something about how git handles things, and, therefore, I'm facing some rather annoying conflicts.

I start a new branch A, from master, and start creating new files. Eventually, from branch A, I create branch B, and start working on that as well. As branch A continues to be developed, B needs the changes made in branch A, so I merge A into B, and continue working on both of them.

Time goes on, branch A continues to be developed, until it is merged into master. At this point, I think that, now that it has been merged into master, I can simply merge master into B, and get all changes from A, and from everyone else as well.

Problem is, now that I'm trying to do that, I'm getting multiple conflicts of "Both added", "Both modified", and such - but on files I haven't changed. I totally understand the conflicts in the files I changed - I caused them, I know it full well.

Thinking my explanation might get confusing, I ventured into Google Slides and created this "amazing" drawing to illustrate my scenario (in which the arrows represent merges, as in "merged master into A", unless they point to the first commit in the branch - in which case they mean "branched off from here"; that is, pretty standard git notation, apart from the arrow direction - not sure here):


Regarding files other people changed, I don't understand why they are listed as conflicts. That is, there is a file which had already been changed by the time I created branch A, and that same file had been changed again by the time I tried to merge master into B. Sure, branch B still has a super old version of that given file, but, still, shouldn't git recognize that they have not been changed in my commits, and just overwrite them, or whatever it does? What am I missing here?

EDIT: just to clarify, I'm the sole contributor of both branches A and B.

Upvotes: 3

Views: 519

Answers (2)

matt

Reputation: 534893

shouldn't git recognize that they have not been changed in my commits, and just overwrite them?

No.

It doesn't matter that you didn't change the file. There is nothing in git's head about who created the commits or who gets priority. Neither master nor b is "better" in some way.

The point is only that the file as known to master and the file as known to b differ in a way that cannot be automatically reconciled.

Git can automatically reconcile some differences, obviously; for instance, if a file is different in one branch because line 1 was changed and different in the other branch because line 1000 was changed. But sometimes it just throws up its hands and lets you sort it out. That's not because of who changed the file, it's because of the nature of the differences.
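To make the "line 1 versus line 1000" case concrete, here is a rough Python sketch of a three-way merge over lines. It is only an illustration under a strong assumption: all three versions have the same number of lines, so only in-place edits are handled. Real Git works on diff hunks, copes with inserted and deleted lines, and also treats changes that touch each other as conflicts. The function name merge_lines is made up for this example.

```python
def merge_lines(base, ours, theirs):
    """Three-way merge of equal-length line lists (simplifying assumption)."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (unchanged, or the same change)
            merged.append(o)
        elif o == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "ours" changed this line
            merged.append(o)
        else:             # both changed it, differently: cannot reconcile
            merged.append(None)
            conflicts.append(i)
    return merged, conflicts


base = ["one", "two", "three"]

# Changes on different lines combine cleanly.
print(merge_lines(base, ["ONE", "two", "three"], ["one", "two", "THREE"]))
# (['ONE', 'two', 'THREE'], [])

# Different changes to the same line are a conflict, whoever made them.
print(merge_lines(base, ["ONE", "two", "three"], ["uno", "two", "three"]))
# ([None, 'two', 'three'], [0])
```

Note that nothing in the function asks who wrote which version; it only compares the three states.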

As for what the differences are, I think it will help to read https://stackoverflow.com/a/56053884/341994, which is the best essay I have ever read on what a merge is and how it works. It's all about the LCA (the lowest common ancestor, i.e. the merge base) and the diffs from it to the ends of the branches. In your diagram, the LCA is the second commit from the far left, all the way back at the point where A was first branched off. All the differences between that commit and the end of master, and all the differences between that commit and the end of b, must be accounted for in order to merge. Well, git is telling you that for this file it can't do that. And if you think about how the file has changed from there to the end of master and from there to the end of b, you will see why.

It doesn't matter who changed the files or how you draw the diagram. It doesn't matter that there is a path that describes (in your mind) the history of what happened. All that matters is three states of things: the state of things at the time of that very early commit, the state of things at the end of master, and the state of things at the end of b.

Upvotes: 1

torek

Reputation: 487735

Your graph drawing is very pretty, but it's rather misleading.

Let's start with a question: which branch are the blue commits on?

This is a trick question. They're not on a branch, they're on several branches, plural. If you say they're on branch A, well, that's true, but they're also on master and most of them are also on branch B.

In your drawing, there are no identifiers for each commit. This makes them hard to talk about: I could say "the second-from-left grey commit" or "the rightmost blue commit", but that's kind of unwieldy, so let me redraw the graph as text, using single uppercase letters in each commit. I also won't use arrows here as they're too hard to do in text.

A--B--C--D--E--F--G--H--I--J--K--L   <-- master
    \        \            /
     M--N--O--P--Q---R---S   <-- branch-A
         \        \
          T--U--V--W--X--Y--Å--Ø--Z   <-- branch-B

That is, the tip commit of master is commit L. The tip commit of branch-A is commit S. The tip commit of branch-B is commit Z.

In Git, a branch name, like master, always points to one single commit. You get your choice of which commit; the commit you select this way is the tip commit of the branch. Any earlier commit that is reachable from this tip is also on the branch. So by starting at L and working backwards (leftwards, in these drawings), following the arrows from commit to commit,[1] we go from L to K, then from K to J. But J is a merge commit: it has two parents, not just one. So from J we go to both S and I. From those two, we go to both H and R, and on to G and Q, and F and P, and E. There are two ways to reach E but we only visit it once anyway. From here we can go on to O and D, N and C, M and B, and A. So that list of commits is the set of commits that is on master.

(Note that we cannot go from Q to W, though we can go the other way, from W to Q: all the arrows are one way, pointing backwards.)

The set of commits on branch-A starts with S and goes backwards to R, then Q, then P, then E and O and D and N and so on. All of these commits are also on master.

The set of commits on branch-B starts at Z, moves back through Ø and Å and Y and X and W, then picks up both Q and V, then P and U, then E and O and T, and so on. In the graph as drawn above, there are thus nine commits on branch-B that are not on any other branch: T, U, V, W, X, Y, Å, Ø, and Z.
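If it helps, the "walk backwards from the tip" rule can be sketched in a few lines of Python. The PARENTS table below is just my transcription of the text graph above (each commit mapped to its parent or parents), and reachable is a made-up helper name, not anything Git provides.

```python
# Parent links transcribed from the text graph: commit -> list of parents.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"], "F": ["E"],
    "G": ["F"], "H": ["G"], "I": ["H"], "J": ["I", "S"], "K": ["J"],
    "L": ["K"],
    "M": ["B"], "N": ["M"], "O": ["N"], "P": ["O", "E"], "Q": ["P"],
    "R": ["Q"], "S": ["R"],
    "T": ["N"], "U": ["T"], "V": ["U"], "W": ["V", "Q"], "X": ["W"],
    "Y": ["X"], "Å": ["Y"], "Ø": ["Å"], "Z": ["Ø"],
}
TIPS = {"master": "L", "branch-A": "S", "branch-B": "Z"}


def reachable(tip):
    """All commits reachable from tip by following parent links backwards."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:      # visit each commit once, like E above
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


on = {name: reachable(tip) for name, tip in TIPS.items()}

# Every commit on branch-A is also on master.
print(on["branch-A"] <= on["master"])              # True

# Arrows are one-way: W can reach Q, but Q cannot reach W.
print("Q" in reachable("W"), "W" in reachable("Q"))  # True False

# Commits that are on branch-B and on no other branch.
only_b = on["branch-B"] - on["master"] - on["branch-A"]
print(sorted(only_b))
```

A branch here is nothing more than a name plus whatever that name's tip commit can reach.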


[1] Technically, all of Git's internal arrows point backwards. So you've drawn your arrows backwards by drawing them forwards. 😀 This situation arises because commits are fully read-only: a parent can't know in advance what hash IDs its eventual children might have, but a child commit does know, in advance / at creation time, what parent hash IDs that child has. So the arrows must point backwards.

(In order to move forwards, you tell Git where you want to end up, and then it moves backwards from there, to make sure that it can end up there from some other starting point.)
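A toy Python model of why the links can only point backwards: if a commit's ID is a hash over its contents, and those contents include the parent IDs, then the parent's ID has to exist before the child can be made. This is a deliberately reduced sketch. Real Git hashes the whole commit object (tree, author, committer, message and so on), and commit_id is a hypothetical helper, not Git's actual format.

```python
import hashlib


def commit_id(message, parent_ids):
    """Toy commit ID: a SHA-1 over the parent IDs plus the message."""
    lines = [f"parent {p}" for p in parent_ids] + ["", message]
    return hashlib.sha1("\n".join(lines).encode("utf-8")).hexdigest()


root = commit_id("first commit", [])
child = commit_id("second commit", [root])   # needs root's ID up front
orphan = commit_id("second commit", [])      # same message, no parent

# The parent list is part of what gets hashed, so the IDs differ.
print(child != orphan)   # True
```

Going the other way is impossible in this model: to store the child's ID inside the parent you would have to change the parent's contents, which would change the parent's ID, which would change the child's ID in turn.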


How git merge works, greatly abbreviated

When you run:

git checkout master
git merge branch-B

Git must find the merge base commit of commits L and Z. To do that, it works backwards, as most Git operations do, from these branch tip commits, to find the best shared commit: a commit that is reachable from both branch tips, and hence on both branches, but is "closest to the tips" as it were. In this case, though it's perhaps not immediately obvious, that's commit Q. Start at L, go back through K to J and down to S and then R and then Q. Meanwhile, start at Z, go back to W and up to Q. Commits P, E, O, and so on are all also on both branches, but commit Q is "better" because it is the last such commit: it is a descendant of all of those commits.
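Here is a small Python sketch of that search, using my transcription of the text graph above. It takes the brute-force route: collect the commits reachable from both tips, then drop every one that is an ancestor of another common commit. Git's real algorithm is more efficient and can return more than one merge base in criss-cross histories; the names ancestors and merge_bases are made up for this example.

```python
# Parent links transcribed from the text graph: commit -> list of parents.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"], "F": ["E"],
    "G": ["F"], "H": ["G"], "I": ["H"], "J": ["I", "S"], "K": ["J"],
    "L": ["K"],
    "M": ["B"], "N": ["M"], "O": ["N"], "P": ["O", "E"], "Q": ["P"],
    "R": ["Q"], "S": ["R"],
    "T": ["N"], "U": ["T"], "V": ["U"], "W": ["V", "Q"], "X": ["W"],
    "Y": ["X"], "Å": ["Y"], "Ø": ["Å"], "Z": ["Ø"],
}


def ancestors(tip):
    """tip itself plus everything reachable by walking parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def merge_bases(a, b):
    """Common commits that are not an ancestor of another common commit."""
    common = ancestors(a) & ancestors(b)
    return {c for c in common
            if not any(c != d and c in ancestors(d) for d in common)}


print(merge_bases("L", "Z"))   # {'Q'}
```

On a real repository, git merge-base master branch-B prints the hash of the commit Git picks.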

Git will now run two git diff commands internally. This will compare the merge base—commit Q—vs the two branch tips:

git diff --find-renames <hash-of-Q> <hash-of-L>   # what we changed
git diff --find-renames <hash-of-Q> <hash-of-Z>   # what they changed

In these two diff listings, if some file appears to be newly-created in both branch tips, you will get an add/add conflict for that file. The file wasn't in Q, but it is in L and Z.

For files that are in all three commits, Git will attempt to combine any changes shown in the two sets of diffs. Where these changes do not overlap (and don't "touch at the edges" either), Git can combine them. Where they do overlap but those overlaps are exactly the same—cover the same original lines in Q, and make the same changes—Git can combine that by just taking one copy of the change. All other overlaps result in merge conflicts.
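The file-level part of this can be sketched in Python as a comparison of three snapshots: the merge base, our tip, and their tip. This is a simplification under stated assumptions: each snapshot is a plain dict of path to content, rename detection is ignored, and a file that changed differently on both sides is reported as a conflict straight away, whereas Git would first attempt the line-by-line combination described above. The function name three_way and the conflict labels are illustrative only.

```python
def three_way(base, ours, theirs):
    """Combine two snapshots against their merge base, file by file."""
    merged, conflicts = {}, {}
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:                    # same outcome on both sides
            result = o
        elif o == b:                  # only their side touched it
            result = t
        elif t == b:                  # only our side touched it
            result = o
        else:                         # both sides differ from base and each other
            if b is None:
                conflicts[path] = "both added"
            elif o is None or t is None:
                conflicts[path] = "modify/delete"
            else:
                conflicts[path] = "both modified"
            continue
        if result is not None:        # None means the file is deleted
            merged[path] = result
    return merged, conflicts


base = {"README": "v1", "lib.py": "old"}
ours = {"README": "v1", "lib.py": "ours", "new.txt": "x"}
theirs = {"README": "v2", "lib.py": "theirs", "new.txt": "y"}

print(three_way(base, ours, theirs))
# ({'README': 'v2'}, {'lib.py': 'both modified', 'new.txt': 'both added'})
```

Only the three snapshots matter here. The commits in between, and who made them, never enter into it.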

Your job at this point is to resolve any and all conflicts, any way you like. For instance, in the add/add whole-file conflicts, if the two files match, just pick either one's content. If not, combine the two files somehow. When you're done, write the final contents for each conflicted file into the index using git add. This marks the index conflict as "resolved", and you can now run git merge --continue or git commit to complete the merge.[2]


[2] git merge --continue checks to make sure you're finishing a merge. If not, it errors out. If you are, it just runs git commit, so there's no real difference either way, unless you aren't actually finishing a conflicted merge.

Upvotes: 4
