Hedley

Reputation: 1092

Git merge fails when file has been moved in both branches

I'm performing a git merge where a number of files have been moved, in the same way, in both branches of the merge. To my surprise, for about 10 files git has failed to find the version of the file in the branch I am merging from.

For example, a file starts off at /path/file.txt. In branch 1 the file is modified, then moved to /path/newpath/file.txt. In branch 2 the file is moved to the same path, /path/newpath/file.txt. When I merge, I would expect git to be able to deal with this. However, it reports a merge conflict, saying the files have been deleted from branch 1. I have three questions:

  1. Why does git fail to find these files? Naively I would expect git to simply say "get the version of this file path in the merge branch", but it presumably isn't doing this? It looks to me like the files involved have changed quite a lot, which leads me to suspect that git is comparing the file contents to establish whether they are merge candidates, as suggested by this SO question: "git merge with renamed files". But surely files with the same path and name should be considered merge candidates, even if their contents have changed quite a bit?
  2. Is there some config I can change to tell git to consider files with the same name and path as merge candidates? The SO question I referenced suggested that you can set the file similarity percentage, but this seems an oblique way to achieve what I want.
  3. I can make this merge work by either manual or scripted update of my git index, to tell git the object hash of the file in the other branch. As per the script on the other SO question, it would look something like this:

    FILE_PATH="path/to/file.txt"

    # Each record is "<mode> <sha1> <stage><TAB><path>": the separator before the path must be a tab.
    git update-index --index-info <<EOI
    000000 0000000000000000000000000000000000000000 0	$FILE_PATH
    100644 e87d02f423c3a66da62ddc10b359314b34a556e3 2	$FILE_PATH
    100644 0ddb2a448cb9cca97834df78ae00e213ecd9dd71 3	$FILE_PATH
    EOI

If I do so, do I also need to tell git about the common ancestor object, i.e. use update-index to add an entry for stage 1 in addition to the records for stages 2 and 3? The reason I'm confused is that, in this specific case, the common ancestor has a different path from both of the other versions. Even if I were to update the index to include it, how would that tell git that it was the ancestor? What would tie that stage 1 entry to the stage 2 and 3 entries, given that the file path would be different?

Upvotes: 3

Views: 249

Answers (1)

torek

Reputation: 488183

The index holds what you intend to commit, so the path in the index should be whatever will be in the merge commit. However, there are multiple paths here, so see the answer to #3 below.

Answering your questions in order:

  1. "surely files with the same path and name should be considered as merge candidates": not necessarily, because of rename detection.

    Git starts the merge by identifying the merge base (an actual commit or, in some rare cases, a virtual commit that the "recursive merge" trick makes by doing another merge). It then diffs the merge base (the entire tree) against the two commits in question: HEAD, the commit/branch you're on when you run git merge, and MERGE_HEAD, the commit you're asking to merge in. This produces two separate diffs, each of which does its own separate rename detection.

    In this particular case, path_a was renamed to path_b in both branches, but only one rename was actually detected. If neither rename had been detected, you'd have a "create/create conflict", where git thinks path_b was independently created in both branches. If both identical renames had been detected, git would just merge "from path_a in base, to path_b in both HEAD and MERGE_HEAD". So we can conclude that git only half-succeeded here.

    Since git failed to notice that path_b in one of the two diffs was actually renamed from path_a, it decides that in that diff, you simply removed path_a and wrote a completely different path_b. It can't merge any changes because path_b in HEAD is not related to path_b in MERGE_HEAD.

    The index-update question you linked to shows how to tell git that, no, in fact both path_bs are related and the merge-base version is simply named path_a in both cases. Then the final "git checkout -m" creates the appropriate merge version (perhaps with merge conflicts).

  2. The -X rename-threshold=<n> merge option is all there is for this, for now. Lowering the threshold makes git accept less-similar files as renames. Following the links in the question you linked to gets you a script that tells git "ex post facto", as it were, about any rename(s) it missed.

  3. This is where I'd have to experiment a bit. The index entry does tell git the SHA-1 of the common ancestor and that goes in "index slot 1". The path that goes with it is, in one sense, entirely irrelevant: to do the merge, git needs only the file contents (for all three versions: base aka stage 1, HEAD aka stage 2, and MERGE_HEAD aka stage 3). It seems likely, both based on the index format and on the script (https://gist.github.com/tvogel/894374), that the path you need here is not the original path but rather the new commit's path—but I don't know that for sure.
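One way to run that experiment is in a throwaway repository. The following is a minimal sketch, not a definitive recipe. Everything in it is invented for the demo: the file contents, the branch name `theirs` and the helper `move_with_content`. Unlike the question, the `theirs` side also edits the last line, purely so the three-way merge has something to combine. It assumes that the three stage entries must share one path for `git checkout -m` to combine them, so the base blob (read from the old path) is recorded under the new path. It also deliberately skips `git merge` itself, since the conflict that merge reports may vary between git versions.

```shell
#!/bin/sh
# Sketch only: hypothetical repo, contents and helper names.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.invalid

OLD_PATH=path/file.txt            # path_a in the answer
FILE_PATH=path/newpath/file.txt   # path_b in the answer

# Hypothetical helper: drop OLD_PATH, write stdin to FILE_PATH, commit as "$1".
move_with_content() {
  git rm -q "$OLD_PATH"
  mkdir -p "$(dirname "$FILE_PATH")"
  cat > "$FILE_PATH"
  git add "$FILE_PATH"
  git commit -q -m "$1"
}

# Merge base: ten lines at the old path.
mkdir -p "$(dirname "$OLD_PATH")"
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > "$OLD_PATH"
git add "$OLD_PATH"
git commit -q -m base
git branch theirs

# HEAD side: rewrite most of the file, then move it (too dissimilar to count as a rename).
{ printf 'rewritten %s\n' 1 2 3 4 5 6 7 8; printf 'line %s\n' 9 10; } |
  move_with_content ours

# Other side: move the file and change only the last line (still similar enough).
git checkout -q theirs
{ printf 'line %s\n' 1 2 3 4 5 6 7 8 9; echo 'line 10 changed in theirs'; } |
  move_with_content theirs
git checkout -q -

# The two diffs described above, each with its own rename detection.
base=$(git merge-base HEAD theirs)
git diff -M --name-status "$base" HEAD     # should list a delete plus an add: rename missed
git diff -M --name-status "$base" theirs   # should list a rename

# Put all three versions under the NEW path: stage 1 = base, 2 = HEAD, 3 = theirs.
zero=0000000000000000000000000000000000000000
printf '%s %s %s\t%s\n' \
  000000 "$zero" 0 "$FILE_PATH" \
  100644 "$(git rev-parse "$base:$OLD_PATH")" 1 "$FILE_PATH" \
  100644 "$(git rev-parse "HEAD:$FILE_PATH")" 2 "$FILE_PATH" \
  100644 "$(git rev-parse "theirs:$FILE_PATH")" 3 "$FILE_PATH" |
  git update-index --index-info

# Recreate the merge for that path from the three index stages.
git checkout -m -- "$FILE_PATH"
cat "$FILE_PATH"
```

If the merged file looks right, `git add path/newpath/file.txt` records it as resolved. Leaving out the stage 1 line is a quick way to see what the ancestor entry contributes.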

Upvotes: 1
