Hexdoll

Reputation: 2146

Record file copy operation with Git

When I move a file in Git using git mv, the status shows that the file has been renamed, and even if I alter some portions, Git still considers it to be almost the same file (which is good, because it lets me follow its history).

When I copy a file the original file has some history I'd like to associate with the new copy.

I have tried moving the file and then re-checking it out in the original location, but once the file has been moved, Git won't let me check out the original location.

I have tried doing a filesystem copy and then adding the file, but Git lists it as a new file.

Is there any way to make git record a file copy operation in a similar way to how it records a file rename/move where the history can be traced back to the original file?

Upvotes: 196

Views: 60318

Answers (3)

Robert Pollak

Reputation: 4165

If for some reason (e.g. using gitk) you cannot turn on copy detection as in Jakub Narębski's answer, you can force Git to detect the history of the copied file in three commits:

  • Instead of copying, switch to a new branch and move the file to its new location there.
  • Re-add the original file there.
  • Merge the new branch to the original branch with the no-fast-forward option --no-ff.

Credits to Raymond Chen. What follows is his procedure. Say the file is named OriginalFileName.cpp, and you want the duplicate to be named DuplicateFileName.cpp:

fileOriginal=OriginalFileName.cpp
fileDuplicate=DuplicateFileName.cpp
branchName=duplicate-OriginalFileName

echo "$fileOriginal, $fileDuplicate, $branchName" # review of defined names

git checkout -b "$branchName" # create and switch to branch

git mv "$fileOriginal" "$fileDuplicate" # make the duplicate
git commit -m "Duplicate $fileOriginal to $fileDuplicate"

git checkout HEAD~ -- "$fileOriginal" # bring back the original
git commit -m "Restore duplicated $fileOriginal"

git checkout - # switch back to source branch
git merge --no-ff "$branchName" -m "Merge branch $branchName" # merge dup into source branch

Note that this can be executed on Windows in Git Bash.
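One way to check the result is to replay the procedure in a throwaway repository and inspect the history of the duplicate. The following is a minimal sketch, not part of Raymond Chen's procedure: the file names orig.txt and copy.txt, the branch name dup-branch, and the demo identity are assumptions made purely for illustration.

```shell
#!/usr/bin/env bash
# Throwaway repository; every name below is hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'line 1\nline 2\nline 3\n' > orig.txt
git add orig.txt
git commit -q -m "Add orig.txt"

# The procedure from above: move on a branch, restore, merge with --no-ff.
git checkout -q -b dup-branch
git mv orig.txt copy.txt
git commit -q -m "Duplicate orig.txt to copy.txt"
git checkout -q HEAD~ -- orig.txt
git commit -q -m "Restore duplicated orig.txt"
git checkout -q -
git merge -q --no-ff dup-branch -m "Merge branch dup-branch"

# Inspect the duplicate: its log and blame should reach back to the
# commit that first added orig.txt.
follow_log="$(git log --follow --format=%s -- copy.txt)"
blame_out="$(git blame -s copy.txt)"
echo "$follow_log"
echo "$blame_out"
```

If the procedure worked, both files exist on the source branch, and the log and blame output for copy.txt refer to the commit that originally added orig.txt.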


2020-05-19: The above solution has the advantages of not changing the log of the original file, not creating a merge conflict, and being shorter. The former solution had four commits:

  • Instead of copying, switch to a new branch and move the file to its new location there.
  • Switch to the original branch and rename the file.
  • Merge the new branch into the original branch, resolving the trivial conflict by keeping both files.
  • Restore the original filename in a separate commit.

(Solution taken from https://stackoverflow.com/a/44036771/1389680.)

Upvotes: 153

zedd45

Reputation: 2171

This builds on the answer from Robert.

For my use case, I needed to move several directories from one implementation to another (with all that entails for file include paths, unit tests, etc), and I found it challenging & time consuming to move each individual file.

My solution includes prompts for the origin & destination paths.

My solution also deletes the temporary branch that was created for this purpose (if the script runs to completion).

Caveats:

  1. The script will attempt to make a new directory for the input you provide for the second prompt (the new destination).
  2. Both this and the original solution merge history into the CURRENT BRANCH. I suggest that you start with a new branch, or at least run git stash push (git stash save is deprecated) if you have any local modifications.

#!/usr/bin/env bash
branchName=chore/temp/duplicate-file-history-by-script
currentBranchName="$(git branch --show-current)"

function copy_git_history() {
    targetToCopy=$1
    newDestination=$2

    echo "copying $targetToCopy to $newDestination and restoring its history"

    git mv "$targetToCopy" "$newDestination"
    git commit -m "duplicating $targetToCopy to $newDestination to retain git history"

    # "--" separates the revision from the path
    git checkout HEAD~ -- "$targetToCopy"
    git commit -m "restoring moved file $targetToCopy to its original location"
}

### USER PROMPTS ###

echo "proceeding to copy files to current branch.  Please make sure you are prepared to have the current git branch modified: $currentBranchName"
# spacing to make things easier to read
printf "\n"

echo "Please enter the path to the file(s) you wish to duplicate, relative to $PWD"
read -r originalFileLoc

echo "Please enter the new path where you wish to copy the original file(s)"
read -r newFileLoc

### END: USER PROMPTS ###

# create the new branch to store the changes
git checkout -b "$branchName"

# create the duplicate file(s)
if [[ -d "$originalFileLoc" ]]
then
    echo "copying files from $originalFileLoc to $newFileLoc"
    mkdir -p "$newFileLoc"

    # glob directly so that paths containing spaces are not word-split
    for file in "$originalFileLoc"/*
    do
      copy_git_history "$file" "$newFileLoc"
    done
else
  copy_git_history "$originalFileLoc" "$newFileLoc"
fi

# switch back to source branch
git checkout -
# merge the history back into the source branch to retain both copies
git merge --no-ff "$branchName" -m "Merging file history for copying $originalFileLoc to $newFileLoc"

# delete the branch we created for history tracking purposes
git branch -D "$branchName"

Upvotes: 2

Jakub Narębski

Reputation: 323792

Git does neither rename tracking nor copy tracking, which means it does not record renames or copies. What it does instead is rename and copy detection. You can request rename detection in git diff (and git show) with the -M option. Use -C instead to additionally detect copies among the files changed in the same commit, and -C -C for more expensive copy detection among all files. See the git-diff manpage.

-C -C implies -C, and -C implies -M.

-M is a shortcut for --find-renames, -C means --find-copies and -C -C can also be spelled out as --find-copies-harder.
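To illustrate the difference between -C and --find-copies-harder, here is a minimal sketch that commits a plain filesystem copy in a throwaway repository and then inspects that commit with both options. The file names orig.txt and copy.txt and the demo identity are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Throwaway repository; every name below is hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'alpha\nbeta\ngamma\n' > orig.txt
git add orig.txt
git commit -q -m "Add orig.txt"

# Plain filesystem copy, committed without touching orig.txt.
cp orig.txt copy.txt
git add copy.txt
git commit -q -m "Copy orig.txt to copy.txt"

# -C only considers files modified in the same commit as copy sources,
# so the unmodified orig.txt is not a candidate here.
with_c="$(git show --format= --name-status -C HEAD)"

# Adding --find-copies-harder also inspects unmodified files.
with_harder="$(git show --format= --name-status -C --find-copies-harder HEAD)"

echo "-C:                      $with_c"
echo "-C --find-copies-harder: $with_harder"
```

With -C alone the commit should list copy.txt as an added file (status A), while the harder variant should report it as a copy of orig.txt (status C with a similarity score).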

You can also configure git to always do rename detection by setting diff.renames to a boolean true value (e.g. true or 1), and you can request git to do copy detection too by setting it to copy or copies. See the git-config manpage.
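As a small sketch of the configuration route, the following sets diff.renames to copies in a throwaway repository and then runs git show without any -C flag. Because this setting corresponds to plain -C, the example modifies the original in the same commit so that it qualifies as a copy source. The file names and demo identity are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Throwaway repository; every name below is hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'alpha\nbeta\ngamma\n' > orig.txt
git add orig.txt
git commit -q -m "Add orig.txt"

# Copy the file and change the original in the same commit.
cp orig.txt copy.txt
echo 'delta' >> orig.txt
git add orig.txt copy.txt
git commit -q -m "Copy orig.txt and extend the original"

# Repository-local setting; no -C is passed on the command line below.
git config diff.renames copies
status_out="$(git show --format= --name-status HEAD)"
echo "$status_out"
```

The output should contain a C line for copy.txt alongside the M line for orig.txt. Use git config --global instead if you want the setting to apply to all of your repositories.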

Check also the -l option to git diff and the related config variable diff.renameLimit.


Note that git log <pathspec> works differently in Git: here <pathspec> is a set of path limiters, where a path can also be a (sub)directory name. Git filters and simplifies the history by those paths before rename and copy detection come into play. If you want to follow renames and copies, use git log --follow <filename> (which is currently somewhat limited and works only for a single file).
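The effect of path limiting versus --follow can be seen in a short sketch. It renames a file in a throwaway repository and counts the commits each form of git log reports. The file names old-name.txt and new-name.txt and the demo identity are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Throwaway repository; every name below is hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'alpha\nbeta\ngamma\n' > old-name.txt
git add old-name.txt
git commit -q -m "Add old-name.txt"

git mv old-name.txt new-name.txt
git commit -q -m "Rename old-name.txt to new-name.txt"

# Path-limited log: only commits that touch the path new-name.txt.
plain_count="$(git log --format=%s -- new-name.txt | grep -c '')"

# --follow continues across the rename into the history of old-name.txt.
follow_count="$(git log --follow --format=%s -- new-name.txt | grep -c '')"

echo "without --follow: $plain_count commit(s)"
echo "with --follow:    $follow_count commit(s)"
```

Without --follow, the history stops at the rename commit; with it, the commit that added the file under its old name is listed as well.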

Upvotes: 128
