Reputation: 422
I have a repo with lots of files that are no longer in the working directory: files that have been added and removed over the months and years of the repository's life.
I would like to make a file listing all the files that are stored in the commit history but are no longer required, including their locations, i.e.:
/web/scripts/index.php
/sql/tables.sql
...
Then I would like a command that runs through that file and removes the files referenced in it from the commit history completely, something like what git rm --cached does, but for a list of files.
Upvotes: 4
Views: 362
Reputation: 2666
Adding onto @David's answer: if you want to be extra careful and make sure you aren't deleting any files that were re-added later in the history, use the following block of commands instead of git delete $(git log --all --pretty=format: --name-only --diff-filter=D) (consider adding it as a function in your .bashrc):
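# Paths tracked in the current HEAD
current=($(git ls-files))
# Every path deleted at some point on any ref, deduplicated
tracked=($(git log --all --pretty=format: --name-only --diff-filter=D | sort -u))
deleted=()
resurrected=()
for file in "${tracked[@]}"; do
    # A deleted path that is tracked again was re-added later, so keep it
    if [[ " ${current[*]} " =~ " $file " ]]; then
        resurrected+=("$file")
    else
        deleted+=("$file")
    fi
done
echo "Deleted: ${deleted[*]}"
echo "Resurrected: ${resurrected[*]}"
# The arrays are built by word splitting, so paths containing whitespace are not handled
git delete "${deleted[@]}"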
Upvotes: 0
Reputation: 17343
Alias David Underhill's script, then run (with caution):
$ git delete `git log --all --pretty=format: --name-only --diff-filter=D`
David Underhill's command uses filter-branch to rewrite the history of your repository, removing all history of a given file path. Because every affected commit is rewritten, its hash changes.
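Since the old objects are gone for good once they have been pruned, one precaution is to take a full copy of the repository before running anything. A minimal sketch, assuming a hypothetical helper named backup_repo (not part of David Underhill's script) and a backup destination of your choosing:

```shell
# Hypothetical helper (not part of David Underhill's script): make a full
# mirror copy of a repository before rewriting its history.
# Usage: backup_repo /path/to/repo /path/to/backup.git
backup_repo() {
    src=$1
    dest=$2
    # Refuse to overwrite an existing backup
    if [ -e "$dest" ]; then
        echo "Error: $dest already exists" >&2
        return 1
    fi
    # --mirror copies every branch and tag, not just the checked-out branch
    git clone --quiet --mirror "$src" "$dest"
}
```

If the rewrite goes wrong, the untouched history is still available in the mirror.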
The script, in its entirety (source):
#!/bin/bash
set -o errexit
# Author: David Underhill
# Script to permanently delete files/folders from your git repository. To use
# it, cd to your repository's root and then run the script with a list of paths
# you want to delete, e.g., git-delete-history path1 path2
if [ $# -eq 0 ]; then
    exit 0
fi
# make sure we're at the root of git repo
if [ ! -d .git ]; then
    echo "Error: must run this script from the root of a git repository"
    exit 1
fi
# remove all paths passed as arguments from the history of the repo
files=$@
git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch $files" HEAD
# remove the temporary history git-filter-branch otherwise leaves behind for a long time
rm -rf .git/refs/original/ && git reflog expire --all && git gc --aggressive --prune
Save this script to a location on your hard drive (e.g. /path/to/deletion_script.sh), and make sure it's executable (chmod +x /path/to/deletion_script.sh).
Then alias the command:
$ git config --global alias.delete '!/path/to/deletion_script.sh'
To get a sorted list of all deleted files:
$ git log --all --pretty=format: --name-only --diff-filter=D | sort -u
With a list of deleted files, it's just a matter of hooking up git delete to process each file in the list:
$ git delete `git log --all --pretty=format: --name-only --diff-filter=D`
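If you would rather review the list before anything is rewritten, as the question asks, the same two steps can go through a file. This is only a sketch: the helper names list_deleted_paths and delete_paths_from_file and the file name deleted_files.txt are made up for illustration, and it assumes the git delete alias defined above and one path per line.

```shell
# Hypothetical helper: print every path that was deleted in any commit on
# any ref, one per line, sorted and deduplicated.
list_deleted_paths() {
    git log --all --pretty=format: --name-only --diff-filter=D |
        sed '/^$/d' |
        sort -u
}

# Hypothetical helper: read a reviewed list (one path per line) and hand all
# paths to the `git delete` alias in a single invocation.
delete_paths_from_file() {
    list_file=$1
    # Collect the non-empty lines as positional parameters
    set --
    while IFS= read -r line; do
        if [ -n "$line" ]; then
            set -- "$@" "$line"
        fi
    done < "$list_file"
    if [ "$#" -eq 0 ]; then
        echo "Nothing to delete"
        return 0
    fi
    git delete "$@"
}
```

Typical use would be list_deleted_paths > deleted_files.txt, then editing the file by hand, then delete_paths_from_file deleted_files.txt. Note that the deletion script itself joins its arguments into one string, so paths containing spaces would still need extra care there.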
Make a dummy repository with additions, renamings, and deletions:
mkdir test_repo
cd test_repo/
git init
echo "Dummy content" >> stays.txt
git add stays.txt && git commit -m "First file, will stay"
echo "Rename content" >> will_rename.txt
git add will_rename.txt && git commit -m "Going to rename"
echo "Delete this file" >> will_delete.txt
git add will_delete.txt && git commit -m "Delete this file"
git mv will_rename.txt renamed.txt && git commit -m "File renamed"
git rm will_delete.txt && git commit -m "File deleted"
Inspect the history:
$ git whatchanged --oneline
d768c58 File deleted
:100644 000000 7a4187c... 0000000... D will_delete.txt
96aadf0 File renamed
:000000 100644 0000000... 94a12c7... A renamed.txt
:100644 000000 94a12c7... 0000000... D will_rename.txt
3ba05fa Delete this file
:000000 100644 0000000... 7a4187c... A will_delete.txt
c88850a Going to rename
:000000 100644 0000000... 94a12c7... A will_rename.txt
6db6015 First file, will stay
:000000 100644 0000000... f3ae800... A stays.txt
Delete old files:
$ git delete `git log --all --pretty=format: --name-only --diff-filter=D`
Rewrite 8c2009db5ac05b27cd065482da94dec717f5ef4a (8/9)rm 'will_delete.txt'
Rewrite e1348d588597f2f6dd63cade081e0fbdf8692c74 (9/9)
Ref 'refs/heads/master' was rewritten
Counting objects: 27, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (22/22), done.
Writing objects: 100% (27/27), done.
Total 27 (delta 12), reused 10 (delta 0)
Inspect the repository now. Notice that the deleted files have been removed from the history, and the renamed file appears as if it had been added under its new name from the start.
c800020 File renamed
:000000 100644 0000000... 94a12c7... A renamed.txt
0a729d7 First file, will stay
:000000 100644 0000000... f3ae800... A stays.txt
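To check the result without reading the log by eye, you can ask git whether any commit still touches a given path. A rough sketch, with hypothetical helper names path_in_history and report_remaining:

```shell
# Hypothetical helper: succeed if any commit on any ref still touches the path.
path_in_history() {
    [ -n "$(git log --all --format=%H -- "$1")" ]
}

# Hypothetical helper: print each given path that still appears in history
# and return non-zero if at least one does.
report_remaining() {
    remaining=0
    for path in "$@"; do
        if path_in_history "$path"; then
            echo "still in history: $path"
            remaining=1
        fi
    done
    return "$remaining"
}
```

For the dummy repository, report_remaining will_delete.txt will_rename.txt should print nothing after a successful rewrite. Run it after the cleanup step, since the backup refs that filter-branch keeps under refs/original would otherwise still be counted by --all.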
Upvotes: 3