Reputation: 1960

git pull and resolve conflict issue

I had a normal merge conflict (after a pull) in one file, which I fixed. I changed the file, added it with git add, and committed in order to push. I then discovered that when I pushed, 5 other files were pushed along with my changed file. What happened? If I check with gitk, my commit has only that file; if I check in the GitHub web interface instead, I see 6 files instead of 1 (that is, also the other 5 files that I downloaded when I pulled but neither touched nor added). Locally, git show --stat commitID shows the same thing as the web page. Can anybody tell me why that merge added files instead of merging silently? I thought the pull command merged them, and that only the snapshot with my modification (1 file) would then be pushed. And why does gitk show one thing and git show --stat another? (Maybe some gitk parameters?)

Upvotes: 0

Answers (1)

torek

Reputation: 487725

To address what you've added in a comment:

My concern is...when I commit something, I commit files in the stage.

That's correct. But:

To add files in the stage I need to "git add" them.

This is not quite right. We'll come back to this in a moment.

When I pull (that is a fetch + merge), is there some part of merge in which files are added to the stage?

Yes, there is.

I can explain in that way. My fault is that I didn't do "git status" before adding my fix, but I thought the stage was empty.

Let's take a close look at this now. What exactly is this staging area? Note that it goes by three different names: the staging area, which is the name you have been using; the index; and sometimes the cache (so you'll see, for instance, git rm --cached and git diff --no-index or git diff --cached to control whether or not the index / cache / staging-area gets involved). I'll use the word index here, to avoid implying too much about it: the name staging area (which is in many ways better) implies something about how the index is used to stage files, while the name cache implies something about how the index is used to speed up various operations.

The way I like to define the index is that it is something you use to build up the next commit you will make. This refers to its role in staging files, but reminds you why you're staging files too: because you intend to make new commits. There's no point in making new ones if they are exactly the same as the old ones. So at some level, we know we're changing the data stored in the index, and then we'll use git commit to turn that into a new snapshot.

The first curious thing about the index is that it is never¹ empty. Instead, the index very commonly matches the current commit. Whenever you git checkout some commit (perhaps by branch-name), or run git commit to make a new one, the index and the current commit match each other afterward. What's in the index is what's also committed.

It's at this point that you would have to run git commit --allow-empty to create a new commit. That's not because the index itself is empty; instead, it's because the difference between the index and the current commit is empty.
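Here is a minimal sketch of that state, assuming a throwaway repository in a temporary directory (the file name and the identity settings are made up for illustration). Right after a commit, the index still lists the file, yet nothing is staged relative to the current commit:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

echo "hello" > file.txt                    # hypothetical file
git add file.txt
git commit -q -m "first commit"

# The index still has an entry for file.txt: it is not empty.
index_entries=$(git ls-files --stage | wc -l | tr -d ' ')
echo "index entries: $index_entries"

# But the index and HEAD agree, so there is nothing new to commit.
if git diff --cached --quiet; then
  staged_state="index matches HEAD"
else
  staged_state="index differs from HEAD"
fi
echo "$staged_state"
```

A plain git commit at this point would report that there is nothing to commit, unless you pass --allow-empty.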

When you run git add, what you are doing is copying the work-tree file into the index. If that's a new file, it's now in the index when it was not there before. If it's a changed file, what you just did was to replace the data stored in the index with new data. Either way, the index version now matches the work-tree version. (Exercise: does the index version now match the committed version? What happens if the work-tree version matches the committed version?)
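One way to watch the copy happen is to compare blob hashes before and after git add. This is only a sketch in a scratch repository; the file name and identity settings are assumptions for illustration:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

echo "hello" > file.txt                    # hypothetical file
git add file.txt
git commit -q -m "first commit"

echo "changed" > file.txt                  # edit the work-tree copy only
index_before=$(git rev-parse :file.txt)    # blob currently in the index
head_hash=$(git rev-parse HEAD:file.txt)   # blob in the current commit

git add file.txt                           # copy work-tree file into the index
index_after=$(git rev-parse :file.txt)
worktree_hash=$(git hash-object file.txt)

echo "index before add: $index_before"
echo "index after add:  $index_after"
echo "work-tree file:   $worktree_hash"
echo "committed file:   $head_hash"
```

Before the git add, the index entry still equals the committed blob; afterward it equals the work-tree file instead.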

The second curious thing about the index is what happens during git merge. There are multiple cases that git merge handles, and some of them are easy and don't do this; but one of them is a true merge. In a true merge, Git has two specific commits—I like to call them L and R, for left and right, or local and remote, or --ours and --theirs—that share a common merge base commit B. Git finds what happened since the common base by, in effect, running:

git diff --find-renames B L   # what did we change in --ours?
git diff --find-renames B R   # what did they change in --theirs?
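To try those two diffs on real commits, you can ask Git for the merge base with git merge-base. The sketch below builds a tiny B, L, R history in a scratch repository; the branch name side and the file names are invented for the example:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

# Commit B: the shared starting point.
echo "base" > shared.txt
git add shared.txt
git commit -q -m "B: base"
ours=$(git symbolic-ref --short HEAD)      # whatever the default branch is called

# Commit R on a side branch: "they" add one file.
git checkout -q -b side
echo "theirs" > from-them.txt
git add from-them.txt
git commit -q -m "R: their work"

# Commit L back on our branch: we add a different file.
git checkout -q "$ours"
echo "ours" > from-us.txt
git add from-us.txt
git commit -q -m "L: our work"

B=$(git merge-base HEAD side)
our_changes=$(git diff --find-renames --name-only "$B" HEAD)
their_changes=$(git diff --find-renames --name-only "$B" side)
echo "merge base:   $B"
echo "we changed:   $our_changes"
echo "they changed: $their_changes"
```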

Git will then combine the changes, by, in effect, filling up the index with three versions of each file: one from B, one from L, and one from R. These three go into "slots" within the index entry for each file. Normally, in the not-in-the-middle-of-a-merge case, only slot zero is occupied. When we're merging, slot zero is booted out and the three versions go into slots 1-3.

Git can then compare the three versions easily, and decide which one to take (if only one of --ours vs --theirs changed the file vs the base) or how to combine the changes (if both --ours and --theirs changed it). If Git can combine the changes—well, if it thinks it can—it does so, and tosses out these higher numbered slot entries entirely: the combined version goes into slot zero as usual, and that file is resolved. If Git doesn't think it can combine the changes, it writes its best guess, with conflict markers, to the work-tree, and then leaves the higher-numbered slots in place in the index.² You then have to do the resolving yourself (using the work-tree file, or the three index slot entries, or whatever it is you like), and git add the correct result, whatever that is. The git add step will then wipe out the slot 1-3 entries and write the slot zero entry.
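The slots are visible with git ls-files --unmerged while a conflict is pending. The following sketch forces a conflict in a scratch repository (the branch name side, the file name, and the contents are all made up) and then resolves it:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

echo "base version" > file.txt
git add file.txt
git commit -q -m "B"
ours=$(git symbolic-ref --short HEAD)

git checkout -q -b side
echo "their version" > file.txt
git commit -q -am "R"

git checkout -q "$ours"
echo "our version" > file.txt
git commit -q -am "L"

# Both sides changed the same line, so this merge stops with a conflict.
git merge side >/dev/null 2>&1 || true

# Third column of the listing is the slot (stage) number.
stages_during=$(git ls-files --unmerged | awk '{printf "%s", $3}')
slot1=$(git show :1:file.txt)   # merge base
slot2=$(git show :2:file.txt)   # --ours
slot3=$(git show :3:file.txt)   # --theirs
echo "slots while conflicted: $stages_during"

# Resolve by hand, then git add: slots 1-3 go away, slot 0 comes back.
echo "resolved version" > file.txt
git add file.txt
stage_after=$(git ls-files --stage file.txt | awk '{print $3}')
unmerged_after=$(git ls-files --unmerged | wc -l | tr -d ' ')
echo "slot after git add: $stage_after"
```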

At the end of all of this, we're back to the more normal case. The index now contains, in the normal slot zero (no conflicts) entry for each file, a copy of every file as it should appear in the new merge commit. So now we, or Git, can run git commit to make the commit with the final result.

If we are doing a true merge, and there are files in R that are in neither L nor in the merge base B, the merge result for that file is "take the new file". So this case is easy, and git merge adds the file to slot zero without ever really mentioning it.
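This is, roughly, the situation from the question. The sketch below recreates it in a scratch repository, with one conflicting file plus one file that exists only on the other side (the branch name side and both file names are hypothetical):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example User"

echo "original" > file.txt
git add file.txt
git commit -q -m "B"
ours=$(git symbolic-ref --short HEAD)

# Their side edits file.txt and also adds a file we have never seen.
git checkout -q -b side
echo "their edit" > file.txt
echo "brand new" > new.txt
git add file.txt new.txt
git commit -q -m "R"

git checkout -q "$ours"
echo "our edit" > file.txt
git commit -q -am "L"

git merge side >/dev/null 2>&1 || true     # stops on the file.txt conflict

# new.txt is already staged, although we never ran git add on it.
staged_by_merge=$(git diff --cached --name-only --diff-filter=A)
echo "staged by the merge itself: $staged_by_merge"

echo "resolved" > file.txt
git add file.txt                           # the only git add we run
git commit -q -m "merge side"

# Compared with our previous commit, the merge commit changes both files.
vs_first_parent=$(git diff --name-only 'HEAD^1' HEAD)
echo "changed relative to first parent:"
echo "$vs_first_parent"
git show --stat --format=%s HEAD
```

As far as I know, gitk displays a merge as a combined diff, which lists only files whose merged result differs from every parent. A file taken unchanged from one side would not show up there, which would account for the difference between the two views.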

¹Well, there is one obvious case where the index is totally empty, which is when there are no files at all. That's the initial state in a new, fresh, totally-empty repository, for instance. Note that in this case, you don't have any branches either. It is a peculiar state: you're on branch master but branch master does not exist yet! The index is truly empty, and there are no commits at all.
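That peculiar state is easy to inspect in a freshly initialized scratch repository. This is a sketch only; the branch name you see depends on your Git configuration:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# No files staged at all: the index really is empty here.
index_entries=$(git ls-files --stage | wc -l | tr -d ' ')

# HEAD names a branch, but that branch does not exist yet.
current_branch=$(git symbolic-ref --short HEAD)
branch_count=$(git branch --list | wc -l | tr -d ' ')

if git rev-parse --verify -q HEAD >/dev/null; then
  has_commit=yes
else
  has_commit=no
fi

echo "index entries: $index_entries"
echo "on branch: $current_branch (existing branches: $branch_count)"
echo "any commits: $has_commit"
```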

²This general description is not quite how it happens internally, where it is all optimized to make easy cases go fast. But it is the logical process. For the optimized process, see the git read-tree documentation.

Upvotes: 3
