Reputation: 1272
I'm trying to clean up a Git repository of LaTeX code that contains the generated PDF files, because these files have caused the repo to balloon to 300 MB.
Adapting the answer to "How to remove file from Git history?", I tried the following command:
git filter-branch -f --index-filter 'git rm --cached --ignore-unmatch *.pdf' HEAD
This reduced the size a little, but not as much as I'd hoped. When I then run the script from the answer to "How to find/identify large commits in git history?" to find which files contribute to the size, it still shows several PDF files. However, if I run the script from "Which commit has this blob?", it cannot find any commit that contains those files.
I have removed all branches except the local branch. I have not pushed the changes to the remote.
Is there any reason these files would still persist in the history somewhere? What other things can I try?
Upvotes: 0
Views: 194
Reputation: 51850
You may still have those blobs on disk simply because the garbage collector hasn't collected them yet.
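To see what might still be holding on to the old blobs, something like the following can help. It is only a rough sketch: report_leftovers is a made-up helper name, and it assumes a Git recent enough to support git -C and count-objects -H.

```shell
# Hypothetical helper (the name is made up for illustration):
# show how much the object store holds and what may still pin old blobs.
report_leftovers() {
    repo="${1:-.}"
    # Loose and packed object sizes, in human-readable units
    git -C "$repo" count-objects -vH
    # Backup refs written by filter-branch (prints nothing if there are none)
    git -C "$repo" for-each-ref --format='%(refname)' refs/original/
    # Objects that no ref points to anymore, ignoring the reflog
    git -C "$repo" fsck --unreachable --no-reflogs
}
```

Run it as report_leftovers myrepodir. Anything listed under refs/original/ or reported as unreachable is a candidate for what still takes up space.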
Try cloning your local repo, and check the size of the .git/ directory in that new clone:
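# --no-local forces the regular Git transport, so only reachable objects are copied
# (a plain local-path clone copies or hardlinks everything under .git/objects)
git clone --no-local myrepodir myclone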
cd myclone
du -sh .git
# you can then remove that clone:
cd ..
rm -rf myclone
This will be a more accurate view of how much data would be pushed or cloned.
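If you are 100% positive that the content after your filter-branch action is the content you want to keep, and you don't mind losing the refs/original/ backups and your reflog (no more undos, and stashes kept in the reflog are dropped), remove those first (git update-ref -d on each ref under refs/original/, then git reflog expire --expire=now --all); otherwise the old commits stay referenced and gc will keep them. After that you can run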
git gc --aggressive --prune=now
See also git help gc for more details on what may be retained on your disk.
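For reference, here is a minimal sketch of the whole cleanup, modelled on the shrinking checklist in git help filter-branch. The function name shrink_repo is hypothetical, and the sketch assumes you really want to throw away the filter-branch backups and every reflog entry; there is no undo afterwards.

```shell
# Hypothetical helper (the name is made up for illustration).
# Irreversible: discards filter-branch backups, all reflog entries,
# and every object that becomes unreachable as a result.
shrink_repo() {
    repo="${1:-.}"
    # 1. Delete the backup refs that filter-branch left under refs/original/
    git -C "$repo" for-each-ref --format='delete %(refname)' refs/original/ |
        git -C "$repo" update-ref --stdin
    # 2. Expire all reflog entries so the old commits are no longer referenced
    git -C "$repo" reflog expire --expire=now --all
    # 3. Repack and prune everything that is now unreachable
    git -C "$repo" gc --aggressive --prune=now
}
```

Call it as shrink_repo myrepodir, then check du -sh myrepodir/.git again.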
Upvotes: 1