Number945

Reputation: 4940

When are 2 patches considered equal in Git?

My question arose when I read the following in the git rebase documentation:

If the upstream branch already contains a change you have made (e.g., because you mailed a patch which was applied upstream), then that commit will be skipped. For example, running git rebase master on the following history (in which A' and A introduce the same set of changes, but have different committer information):

      A---B---C topic
     /
D---E---A'---F master

will result in:

               B'---C' topic
              /
D---E---A'---F master

One way is to compare the patch IDs using git patch-id, but that is not what I want.

Suppose I have 2 branches, topic and master, and I change only one file on them.

Inserted 2  ->  T2     M2 <--  Inserted 2 in new line
                |      |       
Inserted 1  ->  T1     M3 <-- Inserted 3 in new line
                  \   /
                   \ /
                    * <--  Contents similar here 

Now at T2 and M2, the patches are not considered the same, even though both add 2 on the same new line in their version of the file (I found this using git patch-id). This finding surprised me. I thought two patches would be the same if the same content is added on the same line in 2 different versions of a file.

This made me think that a patch also depends on the previous commit, the one I am applying the patch to. So, when we say (patch1 on some branch) = (patch2 on some other branch), do their ancestors also need to be the same? If yes, we can apply this recursively and the 2 branches will come out identical, which is illogical.

So, my question is: when do we say that 2 patches are equal (not considering the patch ID)?

Use this script to reproduce the above locally:

#!/bin/bash

git init .
echo "10" >> 1.txt && git add . && git commit -m "1"

# Add 2 commits to master
echo "3" >> 1.txt && git commit -am "m3"
echo "2" >> 1.txt && git commit -am "m2"


#checkout topic branch
git checkout -b topic HEAD~2
echo "1" >> 1.txt && git commit -am "t1"
echo "2" >> 1.txt && git commit -am "t2"

#Show graph
git log --oneline --all --decorate --graph

Upvotes: 3

Views: 215

Answers (1)

torek

Reputation: 488233

So, when we say (patch1 on some branch) = (patch2 on some other branch) , then their ancestors also need to be same?

Not for git rebase, no. Rebase uses the same computation as git patch-id, which is nominally the result of hashing the stripped-down diff text (line numbers and whitespace removed). [1]

The git rev-list command also does this. See its --left-right, --right-only, --cherry-mark, and --cherry-pick options, which must be used with the symmetric difference three-dot notation commit selectors.
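
As a rough sketch of those options in action, the script below builds a throwaway repository in which master contains a cherry-picked copy of a topic commit, then asks git rev-list which commits it treats as equivalent. The file names, commit messages, and the demo identity are made up for illustration; only the git commands and options are real.

```shell
#!/bin/bash
# Sketch: see which commits git rev-list treats as patch-equivalent.
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name used below
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "10" > 1.txt && git add 1.txt && git commit -q -m "base"

# topic: commit A changes 1.txt, commit B adds an unrelated file
git checkout -q -b topic
echo "1" >> 1.txt && git commit -q -am "A"
echo "b" > b.txt && git add b.txt && git commit -q -m "B"

# master: one unrelated commit, then a copy of A (different hash, same diff)
git checkout -q master
echo "e" > e.txt && git add e.txt && git commit -q -m "E"
git cherry-pick topic~1 > /dev/null

# '=' marks patch-equivalent commits; '<' and '>' mark the side of the rest
git rev-list --left-right --cherry-mark --oneline master...topic

# Commits on topic that have no patch-equivalent commit on master
new_on_topic=$(git rev-list --right-only --cherry-pick master...topic)
echo "only on topic: $(git log -1 --format=%s "$new_on_topic")"
```

In this setup only B should be reported as unique to topic, since A and its cherry-picked copy produce the same diff text.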

In fact, git rebase uses git rev-list to do the work. In the old days, when git rebase was mostly shell scripts, it was easy to see how this was done. Now it's all built as C code, so instead of running git rev-list, it has the same bits of git rev-list compiled in.

... thought patch will be same if same contents on same line ...

No, the line numbers are removed. This is on purpose: a patch might, for instance, be as simple as replacing a call that passes false with one that passes true, which to Git is:

-    foo(false)
+    foo(true)

(with, in the case of git diff, some surrounding context; it's not clear whether the patch ID includes the context, but I would assume that it does). Suppose this fix is accepted upstream while you're working on a feature that may or may not be related to the fix, but upstream, that call to foo, which was on line 42, is now on line 47 because five unrelated lines were added well above this point.
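
Here is a minimal sketch of that situation, assuming a made-up file code.txt and a demo identity: the same one-line fix is committed on topic and on master, but on master five unrelated lines were first added at the top of the file, so the hunk starts five lines later. The unrelated lines are far enough away that the surrounding context of the fix is identical on both sides.

```shell
#!/bin/bash
# Sketch: the same fix at different line numbers gets the same patch ID.
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name used below
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Base file: ten filler lines, the call, ten more filler lines
{ seq 1 10; echo "    foo(false)"; seq 11 20; } > code.txt
git add code.txt && git commit -q -m "base"

# topic: apply the fix while the call is on line 11
git checkout -q -b topic
sed 's/foo(false)/foo(true)/' code.txt > tmp && mv tmp code.txt
git commit -q -am "fix on topic"

# master: add five unrelated lines at the top, then make the same fix
git checkout -q master
{ printf 'new %s\n' 1 2 3 4 5; cat code.txt; } > tmp && mv tmp code.txt
git commit -q -am "five unrelated lines"
sed 's/foo(false)/foo(true)/' code.txt > tmp && mv tmp code.txt
git commit -q -am "fix on master"

# git patch-id prints "<patch-id> <commit-id>"; keep only the patch ID
id_topic=$(git show topic | git patch-id | cut -d' ' -f1)
id_master=$(git show master | git patch-id | cut -d' ' -f1)
echo "topic:  $id_topic"
echo "master: $id_master"
```

The two commits have different hashes and different hunk headers, yet the two patch IDs printed should be identical.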

Rebase should, and does, omit this patch now that it exists in the upstream to which you are rebasing. It determines this by doing a --left-right pass over the symmetric difference of the upstream argument to rebase and HEAD. All the left-side commits have their patch IDs calculated. All the right-side commits have their patch IDs calculated. If the patch IDs match, the commit is considered a duplicate and is elided from the set of commits to copy.
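
To watch the elision happen, the following sketch recreates the history from the git rebase documentation quoted in the question (A, B on topic; A', F on master) in a throwaway repository and then rebases. File names and the demo identity are assumptions for illustration only.

```shell
#!/bin/bash
# Sketch: rebase skips a commit whose change is already upstream.
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master   # force the branch name used below
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "10" > 1.txt && git add 1.txt && git commit -q -m "D"

# topic: A changes 1.txt, B adds an unrelated file
git checkout -q -b topic
echo "1" >> 1.txt && git commit -q -am "A"
echo "b" > b.txt && git add b.txt && git commit -q -m "B"

# master: A' makes the same change as A, then F adds something unrelated
git checkout -q master
echo "1" >> 1.txt && git commit -q -am "A applied upstream"
echo "f" > f.txt && git add f.txt && git commit -q -m "F"

# Rebase topic onto master; A should be dropped, B copied as B'
git rebase -q master topic

copied=$(git log --format=%s master..topic)
echo "commits copied by rebase: $copied"
```

After the rebase, master..topic should contain only the copy of B.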


[1] In Git 2.39, the patch ID computation code changed, partly to fix some bugs and partly to allow retaining indentation-related white space. See the new --verbatim option in particular, and the detail in this answer from VonC.

Upvotes: 4
