Marcus Leon

Reputation: 56699

git checkout --ours when file spec includes deleted file

When we merge we keep the local version of our Maven pom.xml files:

git merge origin/remote_branch
git checkout --ours **/pom.xml pom.xml
git add **/pom.xml pom.xml
git commit -m "Merge"

This works great except if a pom.xml file has been removed in the local branch. After running command #2 above we get an error:

d:\code>git checkout --ours **/pom.xml pom.xml
error: path 'blah/pom.xml' does not have our version

... and after this error, the next command (#3, git add **/pom.xml pom.xml) effectively adds the remote pom.xml files, which is exactly what we don't want.

How can we update our script to handle this?

Upvotes: 9

Views: 10578

Answers (2)

Gabriel Staples

Reputation: 52945

How to solve the error "error: path 'some/file' does not have our version" after running the command git checkout --ours **/some_file2.xml some_file2.xml.

1.A. As a human, here are the steps

Let's assume you ran the following, as I explain and recommend in Who is "us"/"ours" and "them"/"theirs" according to Git?:

git checkout --ours -- path/to/some/dir

...and it didn't work! It didn't do anything. Instead, it output these errors:

error: path 'path/to/some/dir/file1.cpp' does not have our version
error: path 'path/to/some/dir/file2.cpp' does not have our version
error: path 'path/to/some/dir/file3.cpp' does not have our version

The problem is that these files were deleted on our side, so we must git rm each of them manually to force our working tree (working file system) to match our side for these files:

git rm path/to/some/dir/file1.cpp
git rm path/to/some/dir/file2.cpp
git rm path/to/some/dir/file3.cpp

# OR (same thing)
git rm path/to/some/dir/file1.cpp path/to/some/dir/file2.cpp \
path/to/some/dir/file3.cpp

Now, re-run your checkout --ours command and it will work just fine:

git checkout --ours -- path/to/some/dir

Works! Done.

1.B. To script the above process, it's a little harder, but here is how

Let's script that stuff above. There are undoubtedly many ways to do this, but here's the easiest way I could find:

# 1. attempt to run `git checkout --ours` the first time,
# collecting any filenames which errored out, if any, and 
# `git rm` them all.
git checkout --ours -- path/to/some/dir \
|& gawk '{ print $3 }' | xargs git rm

# 2. Now run it again. If it worked the first time above already, 
# no big deal--running it again causes no problems. If it failed
# above though, the above command just ran `git rm` on all those
# failed files, so now this time it will succeed!
git checkout --ours -- path/to/some/dir

Done! You could also store the output from the first attempt into a file as well, of course, and only run the 2nd attempt if the first attempt failed (meaning the output is not an empty string), but I'll leave that up to you.

Sample output: By git rming your deleted files, you'll see the following output (the first line here contains the actual command after the $ char):

$ git checkout --ours -- path/to/some/dir |& gawk '{ print $3 }' | xargs git rm
path/to/some/dir/file1.cpp: needs merge
path/to/some/dir/file2.cpp: needs merge
path/to/some/dir/file3.cpp: needs merge
rm 'path/to/some/dir/file1.cpp'
rm 'path/to/some/dir/file2.cpp'
rm 'path/to/some/dir/file3.cpp'

Explanation of git checkout --ours -- path/to/some/dir |& gawk '{ print $3 }' | xargs git rm:

  1. git checkout --ours -- path/to/some/dir accepts all the merge conflicts from the --ours side (read more in my answer here: Who is "us" and who is "them" according to Git?).
  2. |& pipes both stderr and stdout, since any error messages printed by the git command go to stderr, and those are what we need to pipe.
  3. gawk '{ print $3 }' prints only the 3rd whitespace-separated field of each line, which means it captures the 'path/to/some/dir/file1.cpp' part of error: path 'path/to/some/dir/file1.cpp' does not have our version, for instance. The single quotes are captured too, but xargs strips them in the next step.
  4. | xargs git rm pipes all of those files to git rm to "git remove" them.
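A slightly more defensive variant of the pipeline above, as a sketch only. The name missing_ours_paths is a hypothetical helper, not part of Git, and it assumes Git's messages are printed in English. It keeps only the "does not have our version" lines, so any other line git checkout prints is ignored, and paths containing spaces stay intact:

```shell
# Hypothetical helper: read `git checkout --ours` messages on stdin and
# print, one per line, every path reported as having no "ours" version.
# Assumption: the messages are in English (run git with LC_ALL=C if unsure).
missing_ours_paths() {
    sed -n "s/^error: path '\(.*\)' does not have our version\$/\1/p"
}

# Usage sketch, assuming a conflicted merge and the top of the work tree:
#   LC_ALL=C git checkout --ours -- path/to/some/dir 2>&1 |
#       missing_ours_paths |
#       while IFS= read -r path; do git rm -- "$path"; done
#   git checkout --ours -- path/to/some/dir
```

Because the while loop runs git rm only when a path was actually extracted, nothing is executed if the first checkout succeeded without errors.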

2. Finishing up

And now you can add these auto-fixed-up files and continue the process:

git add path/to/some/dir 
git status 

# Use the appropriate one of these based on whatever operation 
# you were in at the time when the conflicts happened.
git merge --continue 
git rebase --continue
git revert --continue
git cherry-pick --continue
# etc.

References:

  1. For awk/gawk:
    1. My git-diffn.sh "git diff with line numbers" script (I can never remember awk syntax so I just look at previous known examples, including my own).
    2. https://en.wikipedia.org/wiki/AWK
    3. Official GNU AWK user guide
  2. Using | xargs git rm: Git rm several files?
  3. Using |& to pipe both stdout and stderr: Piping both stdout and stderr in bash?
  4. Why use 'git rm' to remove a file instead of 'rm'?

Upvotes: 20

torek

Reputation: 489045

First:

git merge origin/remote_branch

should probably read git merge --no-commit, to make sure Git does not commit these changes if there are no merge conflicts; otherwise your next steps do not make much sense. Note that there will be no merge conflicts at all if the --theirs commit has changed some pom.xml files and you have not changed them, or if Git thinks it successfully merged your changes and theirs. (If you want to use theirs in one of these cases, that's also a bit tricky, but you seem to want to use the --ours versions always.)

Next:

git checkout --ours **/pom.xml pom.xml

This relies on your shell (presumably bash or similar) to expand ** the way you want; you might want to quote the asterisks, and make Git do the glob expansion. This could affect your particular case though, and I'm not sure how Git handles this during a merge conflict, so before you do anything like that, you would want to experiment carefully.

This works great except if a pom.xml file has been removed in the local branch. After running command #2 above we get an error:

d:\code>git checkout --ours **/pom.xml pom.xml
error: path 'blah/pom.xml' does not have our version

Right: for this case, if you want to keep the deleted file deleted, you need to override Git's default action of choosing to keep their version in the index and work-tree.

Let's jump into the Git-specific part of all of this, the index. Remember, Git's index is where you build the next commit you will make. During a merge, it's also where you resolve conflicts.

Entries in the index during a merge

In the normal (non-merging) cases, the index has one entry for every tracked file. If file F is in the current (HEAD) commit and the work-tree, the index has an entry for F. Initially this index entry version matches the HEAD version. You modify the file in the work-tree, then git add the work-tree version to copy it into the index in place of the HEAD version; and then the next git commit will save the index version.

During a conflicted merge, where file F has a conflict, the index has up to three entries for F instead of the usual one. These entries go in slots number 1, 2, and 3. (Slot zero is reserved for the normal, not-conflicted entry.) Slot 1 is for the merge base version. Slot 2 is for --ours, and slot 3 is for --theirs, and you can just use these names for 2 and 3, but there's no name for slot 1.

A merge conflict occurs when:

  • the same line(s) were modified in ours and theirs, with respect to the base version (this is a modify/modify conflict), or
  • there is no base version, just ours and theirs (this is a create/create conflict), or
  • we removed the file and they changed something, even just the name (this is a delete/modify or delete/rename conflict), or
  • they removed the file and we changed something: this is a modify/delete or rename/delete conflict, the same as the previous case with the partners swapped around.

For the modify/modify conflict, all three slots are populated. For the other three types of conflict, one slot is empty: the merge base slot is empty (create/create), or --ours is empty (delete/X), or --theirs is empty (X/delete).

The git checkout --ours step fails when the --ours slot is empty. It succeeds when the --ours slot is non-empty: it extracts the --ours version into the work-tree.

Git's default action on any delete/X or X/delete conflict is to leave, in the work-tree, whichever version survived. That is, if it's slot 3 (theirs) that's empty, the work-tree file matches the slot 2 entry, but if it's slot 2 (ours) that's empty, the work-tree file matches the slot 3 entry.

You could choose to handle this by scanning for empty "slot 2"s and git rming the file for this case:

git ls-files --stage | fancy-script-or-program

If you write this as, say, a Python program, use git ls-files -z --stage to make it easily machine-parseable. You could even stop using git checkout --ours at all, and stop depending on shell or Git globbing, and code the rules for resolving pom.xml files entirely in the script.
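As a rough illustration of the "scan for empty slot 2" idea, here is a minimal shell sketch rather than the Python program suggested above. The name poms_deleted_by_us is hypothetical. It assumes the plain (non -z) output of git ls-files --stage, one "<mode> <hash> <stage><TAB><path>" line per entry, so unusual file names that Git prints in quoted form would need extra care:

```shell
# Hypothetical helper: read `git ls-files --stage` output on stdin and print
# every pom.xml path that is in conflict (has a stage 1-3 entry) but has no
# stage 2 ("ours") entry, i.e. a file that was deleted on our side.
poms_deleted_by_us() {
    awk -F '\t' '
        {
            split($1, meta, " ")            # meta[1]=mode, meta[2]=hash, meta[3]=stage
            path = $2
            n = split(path, parts, "/")     # parts[n] is the base name
            if (parts[n] != "pom.xml" || meta[3] == 0) next
            if (!(path in seen)) { seen[path] = 1; order[++count] = path }
            if (meta[3] == 2) ours[path] = 1
        }
        END {
            for (i = 1; i <= count; i++)
                if (!(order[i] in ours)) print order[i]
        }
    '
}

# Usage sketch, during a conflicted merge:
#   git ls-files --stage | poms_deleted_by_us |
#       while IFS= read -r path; do git rm -- "$path"; done
```

After the git rm step, git checkout --ours on the remaining pom.xml paths should no longer hit an empty "ours" slot.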

Essentially, you might read through the entire index, looking for files whose base-name (everything after the final /) matches pom.xml:

  • If there is a stage-zero entry, Git thinks it resolved the file correctly. Compare the hash ID with the one in the HEAD commit, because Git may not have actually resolved the file correctly after all; in this case, replace the index blob hash with the one from the HEAD commit. See the git update-index documentation for details. You should be able to use --cacheinfo, although I have not tested this with unmerged index entries.

  • Otherwise, there are stage 1, 2, and/or 3 entries. If there is a stage 2 entry, use it as the resolution, i.e., feed it to git update-index as above. If there is no stage 2 entry, use git update-index to remove the entries (using 0 for the mode, and anything, including the all-zeros hash, for the hash; the hash is irrelevant if the mode is 0).

Once you have done this with all the pom.xml paths, any remaining non-zero stage index entries indicate a merge conflict you should pass back to your user. Otherwise, you may be ready to commit.
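The unmerged half of these rules can be sketched in shell as a filter that turns git ls-files --stage output into input for git update-index --index-info. This is an untested-against-real-merges sketch under several assumptions: plan_pom_resolution is a hypothetical name, the repository uses 40-character SHA-1 hashes, the stage-zero comparison against HEAD is left out, and only the index is touched, so the work-tree still has to be brought in line separately:

```shell
# Hypothetical helper: read `git ls-files --stage` output on stdin and print
# `git update-index --index-info` input that resolves each conflicted
# pom.xml to its stage 2 ("ours") entry, or removes it if stage 2 is empty.
plan_pom_resolution() {
    awk -F '\t' '
        {
            split($1, meta, " ")            # meta[1]=mode, meta[2]=hash, meta[3]=stage
            path = $2
            n = split(path, parts, "/")     # parts[n] is the base name
            if (parts[n] != "pom.xml" || meta[3] == 0) next
            if (!(path in seen)) { seen[path] = 1; order[++count] = path }
            if (meta[3] == 2) { mode[path] = meta[1]; hash[path] = meta[2] }
        }
        END {
            zero = sprintf("%040d", 0)      # all-zeros hash (SHA-1 length assumed)
            for (i = 1; i <= count; i++) {
                p = order[i]
                # Mode 0 removes the existing entries for this path.
                printf "0 %s\t%s\n", zero, p
                # If "ours" exists, re-add it as the resolved stage 0 entry.
                if (p in hash) printf "%s %s 0\t%s\n", mode[p], hash[p], p
            }
        }
    '
}

# Usage sketch, during a conflicted merge:
#   git ls-files --stage | plan_pom_resolution | git update-index --index-info
```

Printing the plan first and inspecting it before piping it into git update-index is a cheap way to check what would change.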

(A quick scan of http://gitpython.readthedocs.io/en/stable/reference.html#module-git.index.base suggests that this could be done fairly easily in GitPython, but I have no experience with using it.)

Final caveat: I have no experience at all with Maven, but I gather that pom.xml files are XML files that control various things and that Git merges poorly (the last is true of pretty much all XML files). It's not at all clear to me that just using the "ours" version is correct, though.

Upvotes: 1
