Reputation: 19315
I tried looking for a good tutorial on reducing git repository sizes, but I found none.
How do I reduce my repository size?
It's about 10 MB, but Heroku only allows 50 MB
and I'm nowhere near finished developing my application.
I have already added the usual suspects (log, vendor, doc, etc.) to the .gitignore file, although I only added .gitignore recently.
What can I do?
Upvotes: 415
Views: 252944
Reputation: 19315
Here's what I did:
git gc
git gc --aggressive
git prune
That seems to have done the trick. I started with around 10.5 MB and now it's a little more than 980 KB.
You can combine all three commands into one, pruning everything up to now:
git gc --aggressive --prune=now
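To check whether the commands above actually helped, it can be useful to measure the object store before and after. What follows is only a minimal sketch: the helper names `repo_size_kb` and `gc_and_report` are hypothetical (they are not part of Git), and the repository path is assumed to be passed as the first argument.

```shell
# Hypothetical helper: print the on-disk size of a repository's objects in KiB
# (loose objects plus packfiles), as reported by `git count-objects -v`.
repo_size_kb() {
    git -C "$1" count-objects -v |
        awk '/^size:/ || /^size-pack:/ { total += $2 } END { print total + 0 }'
}

# Hypothetical wrapper: report the size before and after an aggressive gc.
gc_and_report() {
    before=$(repo_size_kb "$1")
    git -C "$1" gc --aggressive --prune=now --quiet
    after=$(repo_size_kb "$1")
    echo "before: ${before} KiB, after: ${after} KiB"
}
```

Calling `gc_and_report .` from the top of a working copy prints both figures on one line.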
Upvotes: 135
Reputation: 31760
In my case, I pushed several big (more than 100 MB) files and then proceeded to remove them. But they were still in the history of my repository, so I had to remove them from it as well.
This did the trick:
bfg -b 100M # Remove all blobs larger than 100 MB from history
git reflog expire --expire=now --all
git gc --prune=now --aggressive
Then force-push your branch:
git push origin <your_branch_name> --force
Note: bfg is a tool that can be installed on Linux and macOS using Homebrew (executable brew):
brew install bfg
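Before rewriting history, it may help to see which blobs would actually exceed a size cutoff. This is a minimal sketch using plain Git commands rather than bfg itself; the helper name `blobs_larger_than` is hypothetical, and the threshold is assumed to be given in bytes.

```shell
# Hypothetical helper: list blobs anywhere in history larger than a byte
# threshold, largest first, one per line as "<size-in-bytes> <path>".
blobs_larger_than() {
    git -C "$1" rev-list --objects --all |
        git -C "$1" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v min="$2" '$1 == "blob" && $2 > min { print substr($0, index($0, " ") + 1) }' |
        sort -k1,1 -n -r
}
```

For example, `blobs_larger_than . 104857600` lists blobs above 100 MiB, including files that were deleted in later commits but still live in history.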
Upvotes: 34
Reputation: 1328182
Update Feb. 2021, eleven years later: the new git maintenance command (man page) should supersede git gc, and can be scheduled.
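As a rough illustration of that command, the sketch below runs just the gc task once. It assumes a Git version recent enough to ship `git maintenance`; the helper name `run_gc_task` is hypothetical.

```shell
# Hypothetical helper: run only the gc maintenance task once
# for the repository at the path given in $1.
run_gc_task() {
    git -C "$1" maintenance run --task=gc
}

# To schedule background maintenance instead, run the following inside the
# repository. It changes your global Git config and the system scheduler,
# so it is left commented out here:
#   git maintenance start
```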
Original: git gc --aggressive is one way to force the prune process to take place (to be sure: git gc --aggressive --prune=now). You have other commands to clean the repo too. Don't forget, though, that git gc alone can sometimes increase the size of the repo!
It can also be used after a filter-branch, to mark some directories to be removed from the history (with a further gain of space); see here. But that assumes nobody is pulling from your public repo, because the history is rewritten. filter-branch can keep backup refs in .git/refs/original, so that directory can be cleaned too.
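One possible way to clean those backup refs is sketched below. The helper name `drop_filter_branch_backups` is hypothetical; it only deletes the refs, and the objects they pointed to are reclaimed later, once the reflog is expired and gc has run.

```shell
# Hypothetical helper: delete every backup ref that filter-branch left
# under refs/original/ in the repository at the path given in $1.
drop_filter_branch_backups() {
    git -C "$1" for-each-ref --format='delete %(refname)' refs/original/ |
        git -C "$1" update-ref --stdin
}
```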
Finally, as mentioned in this comment and this question, cleaning the reflog can help:
git reflog expire --all --expire=now
git gc --prune=now --aggressive
An even more complete, and possibly dangerous, solution is to remove unused objects from a Git repository.
Note that git filter-repo now (Git 2.24+, Q4 2019) replaces the obsolete git filter-branch or BFG: it is a Python-based tool, to be installed first.
# Find the largest files in .git:
git rev-list --objects --all | grep -f <(git verify-pack -v .git/objects/pack/*.idx| sort -k 3 -n | cut -f 1 -d " " | tail -10)
# Start filtering these large files:
git filter-repo --path-glob '../../src/../..' --invert-paths --force
# or
git filter-repo --path-glob '*.zip' --invert-paths --force
# or
git filter-repo --path-glob '*.a' --invert-paths --force
git remote add origin git@<host>:.../...git
git push --all --force
git push --tags --force
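After filtering, one way to confirm that a pattern is really gone is to ask whether any commit on any ref still touches a matching path. This is a minimal sketch with plain `git log`; the helper name `history_touches` is hypothetical, and the pattern is assumed to be an ordinary Git pathspec such as `'*.zip'`.

```shell
# Hypothetical helper: succeed if any commit reachable from any ref
# touches a path matching the pathspec in $2 (repository path in $1).
history_touches() {
    [ -n "$(git -C "$1" log --all --format=%H -- "$2")" ]
}
```

For example, `history_touches . '*.zip' || echo "no .zip files left in history"` prints the message only once the rewrite has removed every matching file.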
Upvotes: 486
Reputation: 4298
This won't affect everyone, but one semi-hidden cause of a large repository can be Git submodules.
You might have added one or more submodules and stopped using them at some point, leaving files behind in the .git/modules directory. To get rid of redundant submodule files, see this question.
However, just as with the main repository, another way is to navigate to the submodule directory in .git/modules and run, for example, git gc --aggressive --prune.
These steps should noticeably reduce the repository size, but as long as you use Git submodules, especially with large libraries, don't expect the size to change drastically.
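The per-submodule cleanup described above could be looped over every entry in .git/modules. This is a rough sketch only: the helper name `gc_submodule_stores` is hypothetical, and it handles only top-level entries, so submodules whose names contain slashes (nested directories) are skipped.

```shell
# Hypothetical helper: run gc in every top-level directory under
# .git/modules of the repository at $1, then print each one's size in KiB.
gc_submodule_stores() {
    for dir in "$1"/.git/modules/*/; do
        [ -f "${dir}HEAD" ] || continue    # skip anything that is not a Git directory
        git -C "$dir" gc --aggressive --prune --quiet
        du -sk "$dir"
    done
}
```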
Upvotes: 3