Elliot Blackburn

Reputation: 4164

Trying to remove files from a super messy repository but they don't seem to be removed

We've gone back to an old project to do some updates, and until now no one had mentioned how messy the repository is. It's a very old one with a great many commits, and it seems it hasn't been very well managed.

Long story short, the .git directory is now 4.02 GB (yes, that's right, GB) in size. I'm trying to blast through and remove all of the old files that should never have been tracked in the first place (I can see some .ipa and .swf files straight away that don't need to be there).

I've used a little shell script that Atlassian recommends in their Maintaining a Git Repository guide. It lists the top 10 offenders for me, which is very helpful, but I'm having a tough time removing the files from the history.

I've tried running git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch <file/dir>' HEAD, as everyone seems to suggest, and then running a garbage collection. I've avoided an aggressive GC because the repository is so big that it takes a few hours to run. However, after running git gc --prune=now, nothing seems to change: the same files come up when I rerun my script to list the biggest files.

What command(s) should I use to remove a previously committed file from all of the history in my repository to help reduce its size?

Upvotes: 1

Views: 117

Answers (2)

Andrew C

Reputation: 14823

You are missing these two steps from the "Checklist for Shrinking a Repository" in the git filter-branch documentation:

Remove the original refs backed up by git-filter-branch: say git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d.

Expire all reflogs with git reflog expire --expire=now --all.

Also, I would avoid git gc --aggressive; instead, use:

git repack -ad --depth=250 --window=250
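
Put together, the cleanup after a filter-branch run might look like the sketch below. It assumes the rewrite has already been done in the current repository. The function name shrink_after_filter_branch is made up for illustration, and a while-read loop stands in for the xargs call only so that an empty refs/original/ does not produce an error.

```shell
#!/bin/sh
# Sketch: clean up after `git filter-branch` so the old objects can be pruned.
# The function name is hypothetical; run it from inside the rewritten repository.
shrink_after_filter_branch() {
    # 1. Delete the backup refs that filter-branch leaves under refs/original/.
    #    (A loop instead of xargs, so nothing runs when there are no such refs.)
    git for-each-ref --format="%(refname)" refs/original/ |
        while read -r ref; do
            git update-ref -d "$ref"
        done

    # 2. Expire every reflog entry so the reflogs stop referencing old commits.
    git reflog expire --expire=now --all

    # 3. Repack everything reachable into one pack and drop redundant packs.
    git repack -ad --depth=250 --window=250

    # 4. Prune loose objects that are no longer reachable.
    git gc --prune=now
}
```

Calling shrink_after_filter_branch and then rerunning the size-listing script should show whether the large blobs are really gone. If they still appear, some other ref is probably still pointing at the old history.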

Upvotes: 1

Useless

Reputation: 67713

You only rewrote whichever branch you're currently on (HEAD). There could be other refs keeping the old commits alive: either other branches, or tags on the old commits.

And, of course, until you force-push your new copy of the branch, the remote-tracking ref (e.g. origin/master) will still be keeping those old commits alive.
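
As a rough sketch of that idea, the rewrite can be pointed at all refs rather than just HEAD. The helper name purge_path_from_all_refs and the remote name origin are assumptions for illustration, and the simple quoting here would break on a path that itself contains a single quote.

```shell
#!/bin/sh
# Sketch: remove a path from the history of every ref selected by --all,
# rather than only the current branch. The helper name is hypothetical.
purge_path_from_all_refs() {
    path_to_remove=$1
    # `-- --all` hands --all to rev-list, so all branches and tags are walked;
    # `--tag-name-filter cat` re-points existing tags at the rewritten commits.
    git filter-branch -f \
        --index-filter "git rm -r --cached --ignore-unmatch '$path_to_remove'" \
        --tag-name-filter cat \
        -- --all
}

# To see which refs could still be holding on to old commits:
#   git for-each-ref --format='%(refname)' refs/heads refs/tags refs/remotes refs/original
#
# Once the result has been checked, the rewritten history would be published
# with a forced push, for example (assuming the remote is called origin):
#   git push --force --all origin
#   git push --force --tags origin
```

For example, purge_path_from_all_refs build/app.ipa (a made-up path) would rewrite all local branches and tags. Everyone else working on the repository would then need to re-clone or rebase onto the new history.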

Upvotes: 1
