Reputation: 1041
With the following git history:
$ git log --oneline --graph --decorate
* d19c1fb (HEAD -> feature) Merge branch 'master' into feature
|\
| * 0a97b90 (master) remove d.txt
| * dc0227b append b to b.txt and create d.txt
| * a7536e4 Add a to txt (2)
* | e97dc11 append b to b.txt and create d.txt
|/
* 48d6625 Add a to a.txt
* 7ffa8cb Initial commit (a.txt, b.txt, c.txt)
dc0227b
is a cherry-pick of e97dc11
.
Why does git revert 0a97b90 do nothing when HEAD is on feature? Its only output is: nothing to commit, working tree clean.
Some more context:
A single commit on the branch feature added the file d.txt and modified b.txt. On master I need the modification to b.txt, but I do not want the file d.txt. So I did the following procedure:
Edit:
On the branch feature, ls does list the file d.txt
. Then I guess the surprising point is that merging master into feature did not actually delete d.txt
...
The original question changes a bit, but there is still definitely something I am missing here.
Upvotes: 0
Views: 69
Reputation: 487755
Consider the following history:
          C--D--E   <-- branch1
         /
...--o--B
         \
          F--G--H   <-- branch2
where each letter stands in for a commit. Newer commits are towards the right, so that E
comes after D
, and so on. (Side note: internally, Git works backwards: it starts from E
, then moves back to D
, then to C
and on to B
. Or, if starting at H
, Git starts there, moves back to G
, then to F
, and then to B
. This isn't directly relevant here; it's just useful in working with Git in general.)
You can now run git checkout branch1; git merge branch2 or git checkout branch2; git merge branch1. Either merge operation does more or less the same thing. There are two differences: the big one is which branch name gets updated at the end, and the other is a little harder to describe.
In Git, every commit holds a snapshot. That is, commit B
—the one that's on both branches, and is the merge base of this upcoming merge operation—has a complete snapshot of all of the files that are in B
at all. The same is true for commits C
, D
, and E
, and for commits F
, G
, and H
. The only way to see what changed in some commit is to compare it to some previous commit.
For instance, we can pick commit C
out of the pile, and compare it to commit B
. If B
and C
have the same set of files, but one of the files in C
has different contents than that same file in B
, we must have changed that file in C
. So we'll often say that C
(or whoever made C
) "changed" the file—but in fact, C simply has the file. The change is only observable by comparing against B
.
If we compare B
vs F
, and F
has a file that B
does not have at all, we might say that whoever made F
added this new file. But in fact, F
just has files. We only get "added" by comparing to B
.
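The snapshot idea can be checked in a throwaway repository. This is only a minimal sketch: the file names f1.txt and f2.txt, the commit messages, and the placeholder identity are assumptions made for illustration and are not part of the question.

```shell
#!/bin/sh
# Sketch: commits store full snapshots; "added" or "changed" only appears
# once two commits are compared. All names here are illustrative.
set -e
repo=$(mktemp -d)                 # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"                # placeholder identity
git config user.email "example@example.com"

echo one > f1.txt
git add f1.txt
git commit -q -m "B: has f1.txt"
B=$(git rev-parse HEAD)

echo two > f2.txt
git add f2.txt
git commit -q -m "F: has f1.txt and f2.txt"
F=$(git rev-parse HEAD)

# Each commit lists its complete set of files, not a delta.
files_in_B=$(git ls-tree -r --name-only "$B")
files_in_F=$(git ls-tree -r --name-only "$F")

# The "A" (added) status exists only relative to the commit we compare against.
b_vs_f=$(git diff --name-status "$B" "$F")

echo "files in B:"; echo "$files_in_B"
echo "files in F:"; echo "$files_in_F"
echo "B vs F:";     echo "$b_vs_f"
```

Swapping the two arguments to git diff reports the same file as deleted instead, which is another way to see that the "change" belongs to the comparison and not to the commit.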
This same idea holds for D and E and G and H. To say "file f1.txt changed in, say, H", we have to pick some other commit first. Then we can compare commit ___ (fill in the blank) to commit H. Which commit should we pick? (I bet you know which one to pick! But you do have to pick one.)
Many people expect Git to handle merge by looking at every commit. But it doesn't. Let's say we run:
git checkout branch2
so that we ask Git to start by filling in our index and work-tree from commit H. That way, we can see all the files that are in the snapshot in H. To remember which branch we're on, we'll update our drawing and attach the special name HEAD, in all uppercase like this,¹ to the name branch2:
          C--D--E   <-- branch1
         /
...--o--B
         \
          F--G--H   <-- branch2 (HEAD)
In any case, now we'll run git merge branch1
. Git will use the name branch1
to find commit E
. The name points directly to commit E
, so that's easy. Then, Git will use the internal, backwards-pointing arrows connecting these commits (I've drawn them as lines instead of arrows because it's hard to draw good arrows on StackOverflow) to work backwards from both H
and E
and will find commit B
. This commit is the merge base of the merge.
These are now the three inputs to the merge operation:
- the merge base commit B;
- the --ours commit H; and
- the --theirs commit E.
Git does not look at the intermediate commits.² It simply does two straight comparisons:
- git diff --find-renames hash-of-B hash-of-H: this tells Git what we changed, ignoring commits F and G entirely, to turn the snapshot in B into the snapshot in H.
- git diff --find-renames hash-of-B hash-of-E: this tells Git what they changed, ignoring commits C and D entirely, to turn the snapshot in B into the snapshot in E.
Git now combines these two sets of changes into one larger pile of combined changes. Then it extracts the files from B (not from E or H) and applies the combined changes to those files. Whatever comes out of applying the combined changes is the merge result.
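The two comparisons can be reproduced for a history shaped like the one in the question. The following is a hedged sketch, not the asker's actual repository: the file contents and the placeholder identity are assumptions, and the commit hashes will differ from 48d6625, e97dc11 and so on.

```shell
#!/bin/sh
# Sketch: rebuild a history shaped like the question's, then run the two
# base-vs-tip diffs that the merge relies on. File contents are assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # use the branch name from the question
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

echo a > a.txt; echo b > b.txt; echo c > c.txt
git add .
git commit -q -m "Initial commit (a.txt, b.txt, c.txt)"
echo a >> a.txt
git commit -q -a -m "Add a to a.txt"          # this becomes the merge base

git checkout -q -b feature
echo b >> b.txt; echo d > d.txt
git add .
git commit -q -m "append b to b.txt and create d.txt"

git checkout -q master
echo a >> a.txt
git commit -q -a -m "Add a to txt (2)"
git cherry-pick feature >/dev/null            # copy the feature commit onto master
git rm -q d.txt
git commit -q -m "remove d.txt"

git checkout -q feature
base=$(git merge-base feature master)

# What "we" (feature) changed since the base, and what "they" (master) changed.
ours=$(git diff --find-renames --name-status "$base" feature)
theirs=$(git diff --find-renames --name-status "$base" master)
echo "base vs feature:"; echo "$ours"
echo "base vs master:";  echo "$theirs"

git merge -q --no-edit master
if [ -f d.txt ]; then d_after_merge=present; else d_after_merge=absent; fi
echo "d.txt after merge: $d_after_merge"
```

In this sketch d.txt shows up as added in the base-vs-feature diff and does not appear in the base-vs-master diff at all, because the add and the later delete on master cancel out between the two endpoints. The combined changes therefore keep d.txt.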
If all goes well—if Git is able to combine the B
-vs-H
changes with the B
-vs-E
changes on its own—Git now makes a new commit from the result. The new commit has two parents, instead of the usual one. The first parent is the commit we're using right now, i.e., commit H
. The second parent is the commit we selected to merge, i.e., E
. Git then updates whichever branch we have checked out so that the name points to the new commit.
The result is this:
          C--D--E   <-- branch1
         /       \
...--o--B         I   <-- branch2 (HEAD)
         \       /
          F--G--H
with the first parent of merge commit I
being H
, and the second parent of merge commit I
being E
. Merge commit I
has a snapshot, just like any commit. It doesn't have changes, just a snapshot.
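The parent order can be confirmed with git rev-parse and the ^1 and ^2 suffixes. This is a small sketch under assumed names: the branch names follow the drawing above, while the file names and the placeholder identity are made up for illustration, and the intermediate commits are left out.

```shell
#!/bin/sh
# Sketch: after "git checkout branch2; git merge branch1", the merge commit's
# first parent is the old branch2 tip (H) and its second parent is branch1 (E).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch2     # start on branch2, as in the drawing
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "B"

git checkout -q -b branch1
echo e > e.txt
git add e.txt
git commit -q -m "E"
E=$(git rev-parse HEAD)

git checkout -q branch2
echo h > h.txt
git add h.txt
git commit -q -m "H"
H=$(git rev-parse HEAD)

git merge -q --no-edit branch1               # creates merge commit I
I=$(git rev-parse HEAD)

first_parent=$(git rev-parse "$I^1")
second_parent=$(git rev-parse "$I^2")
echo "I^1 = $first_parent (H = $H)"
echo "I^2 = $second_parent (E = $E)"
```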
We can ask Git to compare commit I
to some previous commit. Which previous commit do you choose? Remember, you can only choose one previous commit. You can run git diff hash1 hash2
or git diff hash branch2
, because the name branch2
selects commit I
now. But you pick one hash ID (the hash ID of B, or C, or E or F or whatever you like) and Git compares the snapshot in that commit to the snapshot in merge commit I.
Pick any two commits and compare them, and you'll get a diff. The result of the diff clearly depends on which two commits you pick. When you have an ordinary non-merge commit, there's one obvious commit to pick, but with a merge commit, there are two obvious commits to pick, and you only get one at a time.³
¹ Frequently, on Windows and macOS, you can get away with typing head in lowercase. This is something of an accident of the implementation. It generally does not work at all on Linux, and it does not work correctly on these other systems if you start using git worktree add, so try to avoid this habit. If typing HEAD in all caps is annoying, consider using the special symbol @, which Git internally translates into its own special name HEAD.
(I find I typo HEAD all the time as HAED so I probably should use @ myself.)
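A quick way to convince yourself that @ and HEAD name the same commit is to resolve both. This is a minimal sketch in a throwaway repository with a placeholder identity and an empty commit, all of which are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: "@" on its own resolves to the same commit as "HEAD".
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"
git commit -q --allow-empty -m "initial"     # any commit will do

head_hash=$(git rev-parse HEAD)
at_hash=$(git rev-parse @)
echo "HEAD = $head_hash"
echo "@    = $at_hash"
```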
² Even if it did, you would usually get the same result. The cases where you wouldn't get the same result are interesting, but mostly involve repeated name changes while files evolve over many commits. That's not what happens here.
The case of "add a file, but then delete it again" makes it pretty clear that the add step was a mistake and should be ignored. It is true that there's an "add" on one "leg" of the merge without a delete, and an add-and-delete on the other "leg" of the merge. But that just suggests that the mistake is only on the one side. The other side added and kept the file—so the merge should add and keep the file, and that's what Git ends up doing when it combines the changes.
Nonetheless, going commit-by-commit would at least give Git the ability to see the add-and-delete on the one particular leg. That would enable the algorithm to treat this specially, e.g., by declaring a conflict. But Git doesn't go commit-by-commit, so it can't see this at all!
³ Technically, Git can pick all of the parents, producing what Git calls a combined diff. This is very different from the way Git does the merge, though. A combined diff, such as that produced by git show of the merge commit, skips diffing files whenever the merge commit's copy of some file matches any of its parents' copy of that same file. Only if the merged copy is different from all parents will this combined diff show something, and even then, it will omit some of the differences.
What this means is that you often need to run two git diff commands to really inspect a merge commit. First, you diff the commit against its first parent, to see what changes came in from the --theirs side of the merge. Then you diff the commit against its second parent, to see what changes came in from the --ours side. The git show command has the -m flag to help do this automatically. A few files will have changes from both sides of the merge, and those are the only files that the combined diff might show at all.
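The two per-parent diffs can be sketched as follows. The repository layout mirrors the B, E, H, I drawing from the main text with the intermediate commits left out; the file names e.txt and h.txt and the placeholder identity are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: inspect a merge commit by diffing it against each parent in turn.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch2
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "B"

git checkout -q -b branch1
echo e > e.txt
git add e.txt
git commit -q -m "E"

git checkout -q branch2
echo h > h.txt
git add h.txt
git commit -q -m "H"

git merge -q --no-edit branch1
I=$(git rev-parse HEAD)

# Against the first parent (H): what the --theirs side (branch1) brought in.
from_theirs=$(git diff --name-status "$I^1" "$I")
# Against the second parent (E): what the --ours side (branch2) brought in.
from_ours=$(git diff --name-status "$I^2" "$I")

echo "vs first parent:";  echo "$from_theirs"
echo "vs second parent:"; echo "$from_ours"

# git show -m prints one diff per parent in a single command.
git show -m --name-status "$I"
```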
Note that git log -p normally does not even try to show a merge this way. Adding -m, or -c or --cc, will make git log -p show merges, using either the split-into-multiple-diffs method (-m) or one of the combined diff algorithms (-c and --cc).
There are two different combined diff algorithms. The default for git show
is --cc
. I always got them mixed up until I used this as a mnemonic: one C = one hyphen; two Cs = two hyphens. What's the precise difference in their output? That, I still can't explain properly. I don't know what the Git authors had in mind here. I use -m
when I need diffs I can really use.
Upvotes: 1
Reputation: 60245
You don't say what's in d.txt at the feature tip, but from git revert's behavior I think it's a pretty safe bet that d.txt in e97d and d.txt in dc02 are identical. The revert found nothing to do because reverting 0a97b90 should make d.txt look like it does in dc02, the commit just before the deletion. It already looks like that, so there is nothing to commit. Everything still looks exactly as it did on checkout.
edit:
I guess the surprising point is that merging master into feature did not actually delete d.txt
Why should it? d.txt doesn't exist at the merge base and doesn't exist at the master tip, so merging from that history has, in total, zero effect on d.txt.
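Both points can be tried out on a rebuilt copy of the history. This is a hedged sketch rather than the asker's real repository: the file contents and the placeholder identity are assumptions, master stands in for 0a97b90, and the hashes will differ.

```shell
#!/bin/sh
# Sketch: after merging master into feature, reverting the "remove d.txt"
# commit finds nothing to do, because d.txt is already identical on feature.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"               # placeholder identity
git config user.email "example@example.com"

echo a > a.txt; echo b > b.txt; echo c > c.txt
git add .
git commit -q -m "Initial commit (a.txt, b.txt, c.txt)"
echo a >> a.txt
git commit -q -a -m "Add a to a.txt"

git checkout -q -b feature
echo b >> b.txt; echo d > d.txt
git add .
git commit -q -m "append b to b.txt and create d.txt"

git checkout -q master
echo a >> a.txt
git commit -q -a -m "Add a to txt (2)"
git cherry-pick feature >/dev/null
git rm -q d.txt
git commit -q -m "remove d.txt"

git checkout -q feature
git merge -q --no-edit master

# Blob IDs of d.txt in the feature commit and in the cherry-picked copy.
d_in_feature=$(git rev-parse "HEAD^1:d.txt")
d_in_pick=$(git rev-parse "master^:d.txt")

head_before=$(git rev-parse HEAD)
# The revert may exit non-zero when there is nothing to commit, so guard it.
revert_output=$(git revert --no-edit master 2>&1) || true
head_after=$(git rev-parse HEAD)

echo "$revert_output"
echo "d.txt identical in both commits: $d_in_feature = $d_in_pick"
echo "HEAD before/after revert: $head_before / $head_after"
```

If the two blob IDs match and HEAD does not move, the revert had no difference to record, which is consistent with the nothing to commit message reported in the question.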
Upvotes: 1