git checkout --merge/--ours/--theirs seem to be doing the same (wrong?) thing?

Question

I'm trying to merge from another branch (it's an orphaned branch, if that matters). When I do a:

git merge <branch>

It appears to merge correctly. However, if I do a:

git checkout --merge <branch> -- <paths>

Most if not all of the changes on the current branch get wiped out. It doesn't matter if I use --merge, --ours or --theirs, the results are the same.

I would have expected that the checkout when using the --merge flag would do the same thing as merge, except only for the files specified.

What's going on? Is there something I'm not understanding?

torek · Accepted Answer

TL;DR

See the git merge-file command, which does allow you to do what you want.

Long

The -m or --merge flag to git checkout has multiple different meanings.

When used with:

git checkout -m <branch>

it has the meaning you want, more or less; the problem is that it applies to all paths.

When used with:

git checkout -m [--] <paths>

it has a different meaning: it means that Git should, for each path named on the command line, re-create merge conflicts in the work-tree copy of a file that has (or had) multiple higher-stage index entries.

There's a more fundamental issue here. Part of this is just tricky phraseology—we all say "changes in the work-tree", for instance—but another part lies in how to think about what Git does:

... most if not all of the changes on the current branch get wiped out

This suggests that you're thinking about what's in the work-tree copy of each file as changes, and that's not actually the case. Git doesn't store changes anywhere,¹ and the work-tree copies of files are largely just for you to use as needed: Git mostly uses snapshots, with files stored in what I like to call a freeze-dried format, in blob objects that are associated with commits, and in the index.

There is a notion of current branch and also current commit, but the branch is just a name (stored in HEAD), while the commit is a commit object, identified by its hash ID (stored in the branch name), permanent (mostly) and immutable (entirely). The commit contains—indirectly—a full snapshot of every source file. The index, which is also a crucial thing in Git, stores a snapshot as well, but unlike the commits, what's in the index is mutable.
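The name-to-hash-to-snapshot chain can be inspected directly. The following is a minimal sketch in a throwaway repository; the branch name main, the file name file.txt, and the identity settings are assumptions made only for illustration.

```shell
#!/bin/sh
# Sketch: HEAD holds a branch name, the branch name holds a commit hash,
# and the commit leads to a full snapshot. All names here are illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # use a known branch name
git config user.name "Example"
git config user.email "example@example.com"

echo "hello" > file.txt
git add file.txt
git commit -q -m "first snapshot"

head_target=$(git symbolic-ref HEAD)                 # branch name stored in HEAD
commit_id=$(git rev-parse main)                      # hash ID stored in the branch name
object_type=$(git cat-file -t "$commit_id")          # kind of object the hash names
tree_listing=$(git ls-tree -r --name-only "$commit_id")  # files in the snapshot

echo "HEAD   -> $head_target"
echo "main   -> $commit_id"
echo "type   -> $object_type"
echo "files  -> $tree_listing"
```

Nothing in this output is a change: the commit lists whole files, not differences.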

Meanwhile, each commit stores the hash ID of some set of parent commits—usually exactly one such commit. When you have Git show you some commit, Git actually extracts all the files from both the parent and the commit itself,² then compares (all the files in) the two commits and shows you what's different. So when you look at a commit, it appears to have changes.

Git does the same trick with the index: it compares the current commit vs the index, showing you the differences and calling those changes staged for commit. Then it compares the index—which is essentially the snapshot that you're proposing would be the next commit, if you ran git commit right now—to the work-tree. Whatever is different between the index and work-tree, Git shows those differences, calling those changes not staged for commit. But in all three sets of files—committed files, files in the index, and files in the work-tree—what's actually there is not changes but rather snapshots.
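To make the three snapshots concrete, here is a small sketch in a throwaway repository (the file name a.txt and its contents are invented for illustration). The same file holds three different contents at once: one in the commit, one in the index, one in the work-tree. The two diffs are computed by comparing neighbouring snapshots.

```shell
#!/bin/sh
# Sketch: one file, three snapshots (commit, index, work-tree).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf 'one\n' > a.txt
git add a.txt
git commit -q -m "initial"     # committed copy: "one"

printf 'two\n' > a.txt
git add a.txt                  # index copy: "two"
printf 'three\n' > a.txt       # work-tree copy: "three"

committed=$(git show HEAD:a.txt)           # read from the current commit
indexed=$(git show :a.txt)                 # read from the index (stage 0)
worktree=$(cat a.txt)                      # read from the work-tree
staged=$(git diff --cached --name-only)    # HEAD vs index
unstaged=$(git diff --name-only)           # index vs work-tree

echo "commit: $committed, index: $indexed, work-tree: $worktree"
echo "staged for commit: $staged"
echo "not staged for commit: $unstaged"
```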

What git checkout generally does—there are a bunch of exceptions because git checkout is really multiple different commands all crammed into one user-facing verb—is to extract files from the commit snapshot, writing those files into the index (so that the index and the commit match) and then writing the index copies to the work-tree (so that the index and work-tree match). But before doing any of that, it first checks to make sure that you won't lose any unsaved work, by comparing the current commit to the index, and the index to the work-tree: if these two don't match, there's something git checkout would clobber.

As soon as you use the git checkout -- <paths> mode, though, you're actually switching to an entirely different back-end operation. This operation starts not with a commit, but with the index. The files were, some time in the past, copied from a commit to the index, so the index has some set of files. That set may have been updated since the last normal checkout or hard reset or whatever: every git add means copy a file from the work-tree into the index, and if the work-tree file didn't match the index copy, well, now it does, so the set of files in the index has changed. The index may even have non-zero stage entries, which represent ongoing merge conflicts from an incomplete git merge. In this case, the index essentially stores not one but three freeze-dried copies of some files, from the three inputs to an earlier git merge operation.³ But, one way or another, this kind of git checkout doesn't go back to a commit at all: it just takes files from the index and writes them, or for -m re-merges them, and clobbers whatever is in the work-tree. It does so without first asking whether that's OK.⁴
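The index-based behaviour can be reproduced in a throwaway repository. This sketch deliberately creates a conflict, so the index holds three staged copies of one file, then shows that --ours, --theirs and -m each overwrite the work-tree copy from those index entries. The branch names main and other and the file f.txt are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: path-mode checkout reads from the index, not from a commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

printf 'base\n' > f.txt
git add f.txt
git commit -q -m "base"

git checkout -q -b other
printf 'theirs\n' > f.txt
git commit -q -am "change on other"

git checkout -q main
printf 'ours\n' > f.txt
git commit -q -am "change on main"

# The merge is expected to stop with a conflict, hence "|| true".
git merge other >/dev/null 2>&1 || true

# Unmerged entries: stage 1 (base), stage 2 (ours), stage 3 (theirs).
stages=$(git ls-files -u | wc -l | tr -d ' ')

printf 'my hand edit\n' > f.txt      # unsaved work in the work-tree

git checkout --ours -- f.txt         # overwrites f.txt from stage 2
ours_copy=$(cat f.txt)

git checkout --theirs -- f.txt       # overwrites f.txt from stage 3
theirs_copy=$(cat f.txt)

git checkout -m -- f.txt             # re-creates the conflicted file
if grep -q '^=======$' f.txt; then remerged=yes; else remerged=no; fi

echo "stages: $stages, ours: $ours_copy, theirs: $theirs_copy, conflict markers: $remerged"
```

The hand edit is gone after the first checkout, and none of the three commands asked before replacing it.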

(Edit: there's also git checkout --patch, but this actually invokes a third mode. The patch operation, which compares two versions of a file and lets you select parts of this diff to apply to one of the two versions, is actually handled by a Perl program that runs git diff between the two versions. This implements git checkout --patch, git add --patch, git stash --patch, and git reset --patch.)

Anyway, the bottom line is that git checkout -m -- path does not do what you wanted. You can get what you want, but not using git checkout. Instead, what you need to do is extract the three input files you wanted to pass to git merge—put these three files anywhere; they need not even be in the work-tree for the repository itself—and then run the git merge-file command on them.
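As a rough sketch of that procedure, the script below builds a throwaway repository with two diverging branches, extracts the three versions of one file into a temporary directory, and merges them with git merge-file. The branch names main and other, the file f.txt, and the temporary file names are all assumptions for illustration. The example also assumes the branches share a merge base, which an orphan branch does not; in that case an empty file could stand in as the base version, though that is likely to produce conflicts wherever both sides have content.

```shell
#!/bin/sh
# Sketch: merge a single file by hand with git merge-file.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > f.txt
git add f.txt
git commit -q -m "base"

git checkout -q -b other
printf 'line 1\nline 2\nline 3\nline 4\nline 5 (other)\n' > f.txt
git commit -q -am "change last line on other"

git checkout -q main
printf 'line 1 (main)\nline 2\nline 3\nline 4\nline 5\n' > f.txt
git commit -q -am "change first line on main"

# Extract the three inputs; they need not live inside the work-tree.
tmp=$(mktemp -d)
base=$(git merge-base main other)
git show "$base:f.txt" > "$tmp/base"
git show main:f.txt    > "$tmp/ours"
git show other:f.txt   > "$tmp/theirs"

# merge-file writes the result into its first argument and exits
# non-zero if conflicts remain.
git merge-file "$tmp/ours" "$tmp/base" "$tmp/theirs"

cp "$tmp/ours" f.txt                 # put the merged result in the work-tree
merged=$(cat f.txt)
echo "$merged"
```

Only f.txt is touched; every other file in the work-tree and index stays as it was, which is the per-file merge the question was after.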

¹Well, except if you store the output of git diff, or, as a special case, each of the parts of a saved merge conflict from git rerere, but all of those are below the normal level of visibility.

²Due to the internal freeze-dried file format, Git doesn't actually have to bother extracting identical files, only those that differ in at least one bit.

³Technically, it's up to three entries per file. In cases such as a modify/delete conflict, you'll have just two entries for some file, for instance. Also, when you finish resolving a merge conflict and git add the file, the higher stage entries vanish. However, until you commit, those higher stage entries are stored in a secret, invisible index entry of type "REUC", specifically so that you can use git checkout -m to get the conflict back. There is no way to see or save this invisible entry, one of several flaws in the current index format.

⁴From a user-friendliness design perspective, this is particularly bad, because the other form of git checkout is very careful not to lose work.
