Adam Parkin

Reputation: 18680

Git permanent removal of file not resulting in smaller repo?

I have a repo that was initially 5.6G in size:

aparkin@mymachine ~/repo (master)
$ du -d 0 -h
5.6G    .

However, this repo contained a number of large binary files that were no longer needed. They originally lived in various locations in the directory structure, all named "tc.dat". As a "cleanup" step, I created a cruft directory and used git mv to move them all into it, renaming them tc.dat1, tc.dat2, etc.

I then had 5 of these files, tc.dat1 through tc.dat5.

I then followed this question and used filter-branch, along with its cleanup steps, to remove all instances of these files in the cruft directory. However, this still left the files under their original names (before the move into cruft) in the history. I then repeated the filter-branch step to remove them from their original locations across all commits, and again ran the cleanup steps:

rm -rf .git/refs/original/ && git reflog expire --all && git gc --aggressive --prune

After all this, if I do a

git log --all -- tc*.dat

I see no matches in my history, which suggests to me that they are completely removed. However, when I run du again the repo is still 5.6G in size. Given that these files total about 0.5GB, I'd expect that number to go down.
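For anyone trying to reproduce this, here is a minimal sketch (not something I ran at the time) for checking whether large blobs are still being held somewhere other than the visible history. The function name largest_blobs is made up for illustration. It assumes it is run from inside the repository, and it walks both the refs and the reflogs, so a file that shows up here but not in git log --all is probably being kept alive by a reflog entry.

```shell
# Sketch: list the N largest blobs reachable from any ref or reflog entry.
# Output is "<size in bytes> <path>", largest first.
# largest_blobs is a hypothetical helper name, not a Git command.
largest_blobs() {
  n="${1:-10}"
  git rev-list --objects --all --reflog |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    grep '^blob ' |
    cut -d' ' -f2- |
    sort -rn |
    head -n "$n"
}
```

Running largest_blobs 5 next to git count-objects -vH (which reports loose and packed object sizes) should give a rough idea of whether the object store has actually shrunk or only the history has changed.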

What am I missing?

Upvotes: 2

Views: 226

Answers (1)

Adam Parkin

Reputation: 18680

Ok, there were a few things I was missing.

Following the tips in "Git pull error: unable to create temporary sha1 filename", I tried some of the commands from there:

$ git prune
$ git prune-packed
$ du -h -d 0
5.2G    .

That's about 0.4G down, which is roughly the size of the files I wanted gone. After reading a few other questions and the man pages for git-reflog and git-gc, I also noticed that my usage of reflog expire and gc --aggressive --prune was incorrect. Both accept a cutoff for how old an entry or object must be before it is removed, and both fall back to a default grace period when none is given. Since I want everything unreachable removed immediately, the cutoff has to be given explicitly as now:

$ rm -rf .git/refs/original/
$ git reflog expire --all --expire=now
$ git gc --aggressive --prune=now
$ du -h -d 0
4.5G    .

That's a rather significant saving (1.1G) compared to where I started.
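The three cleanup steps above can be wrapped into one function. This is only a sketch: the name shrink_repo is made up, and it assumes you are inside the repository and really do want every reflog entry and every unreachable object gone, since there is no undo afterwards.

```shell
# Sketch: drop filter-branch's backup refs, expire every reflog entry
# immediately, then repack and prune unreachable objects with no grace
# period. Destructive: anything reachable only through the reflog or
# refs/original is gone afterwards.
# shrink_repo is a hypothetical helper name, not a Git command.
shrink_repo() {
  git_dir=$(git rev-parse --git-dir) || return 1
  # Same step as above; only removes backup refs stored as loose files.
  rm -rf "$git_dir/refs/original"
  git reflog expire --all --expire=now &&
    git gc --quiet --aggressive --prune=now
}
```

Running du -h -d 0 before and after calling shrink_repo shows the difference in the same way as the output above.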

Upvotes: 3
