Tachytaenius

Reputation: 187

How to remove all but last n commits from git history to save space

I use git to backup and restore folders that can change on my system and they can get really large as useless history accumulates. I only need the last 4 or so commits, how can I squash or delete the whole history except for the last n commits?

Upvotes: 4

Views: 1676

Answers (1)

TTT

Reputation: 29129

Disclaimer: Your first sentence may have unfortunately tainted this question:

I use git to backup and restore folders that can change on my system and they can get really large as useless history accumulates.

If the history is "useless", then perhaps Git isn't the correct tool to use. As mentioned in the comments, Git is generally not the right tool for a backup system. That being said, if your question hadn't included the first sentence, we wouldn't have been able to pass judgement on why you wish to do this, and the rest of your question is certainly valid:

I only need the last 4 or so commits, how can I squash or delete the whole history except for the last n commits?

Here's one fairly straightforward way to accomplish this. Note that this method requires you to determine a few things before you begin:

  1. The branch name you wish to rewrite. (This answer will assume it is called main.)
  2. The root commit of your branch. Get it with: git log main --reverse (the root is the first commit listed). (This answer will assume it is called <old-repo-root-commit-id>.)
  3. The oldest commit ID you want to keep, e.g. your fourth commit from the top, which will become your new repo root commit. (This answer will assume it is called <new-repo-root-commit-id>.)
  4. Your git status should be clean before you start. If it isn't, consider committing (or undoing) your most recent changes.
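
If it helps, the values from the list above can also be looked up with a few plumbing commands. This is only a sketch: it builds a throwaway six-commit repo so it can run anywhere, and the demo identity, the file name, and n=4 are assumptions made up for illustration. In your own repo only the last four lines matter.

```shell
#!/bin/sh
set -eu

# Throwaway demo repo with six commits on main (everything here is illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
for i in 1 2 3 4 5 6; do
  echo "$i" > file.txt
  git add file.txt
  git commit -q -m "commit $i"
done

n=4                                              # number of commits to keep
old_root=$(git rev-list --max-parents=0 main)    # commit(s) with no parent
new_root=$(git rev-parse "main~$((n - 1))")      # n-th commit from the top
dirty=$(git status --porcelain)                  # empty when the tree is clean

echo "old root: $old_root"
echo "new root: $new_root"
```

git rev-list --max-parents=0 prints one line per root commit, so a history with several roots would need one picked by hand.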

Here is the set of commands to run:

git switch --detach <new-repo-root-commit-id>
git reset --soft <old-repo-root-commit-id>
git commit --amend --reuse-message=<new-repo-root-commit-id>
git rebase <new-repo-root-commit-id> main --onto @

I see your title mentions your goal is to save space. For this to also achieve that goal, no other refs that point to the old commits can be left in your repo. If you only have a single branch and no tags, the old commits will eventually get garbage collected, once the reflog entries that still reference them expire. If you want to clean it up right now, see this question.
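
For reference, an immediate cleanup usually comes down to expiring the reflogs and then pruning. Here is a minimal sketch in a throwaway repo (the demo identity and file name are made up), where an amend stands in for the history rewrite and leaves one unreachable commit behind:

```shell
#!/bin/sh
set -eu

# Throwaway demo repo (illustrative names only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo data > file.txt
git add file.txt
git commit -q -m "original"
old=$(git rev-parse HEAD)
git commit -q --amend -m "rewritten"   # $old is now referenced only by reflogs

# Drop every reflog entry, then delete unreachable objects right away.
git reflog expire --expire=now --all
git gc -q --prune=now

# Check whether the old commit object still exists in the object store.
if git cat-file -e "$old" 2>/dev/null; then pruned=no; else pruned=yes; fi
echo "old commit pruned: $pruned"
```

Be aware that this is irreversible: once the reflogs are expired, git reflog can no longer help you get the old history back.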

Detailed explanation of how the commands work:

  • The switch command simply checks out the specific commit you want to be your new root commit. This is identical to git checkout <new-repo-root-commit-id>, but the newer switch command requires you to specify --detach when you are checking out a commit ID rather than a named branch.
  • The reset command moves HEAD to the original root commit. The --soft option leaves your working files untouched and keeps every change between the original root and the new root staged, ready to be committed.
  • After the reset you are on the very first commit in your repo, and the amend rewrites that commit to include all changes between the original root and the new root commit. This essentially squashes all of the earlier commits into a single commit. The --reuse-message option takes the commit message (along with the author and timestamp) from the new repo root instead of the original repo root. It isn't strictly needed, but the more recent commit message is likely to be the better one. That of course depends on what your commit messages say; you can reuse the message from any other commit if you wish, or write a new one.
  • After the commit command you have exactly one commit, and now you rebase (replay) the remaining (e.g. 3) commits that were on main "onto" your new commit, leaving you with just 4 commits. At this point, if you diff main against the commit ID that main was on before this process, the result should be empty, because you didn't actually change your working files at all. (Note that you can use git reflog to see what commit ID main was on before you did this, as long as you haven't yet run a full garbage collection of reflogs and old commits.)
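
To make the explanation concrete, here is the whole sequence run against a throwaway repo with six commits, followed by the two checks described above. Treat it as a sketch under stated assumptions: the demo identity, the file name, and the choice of keeping 4 commits are mine, and the rebase is written with --onto first, which is the same command with the option moved to the front.

```shell
#!/bin/sh
set -eu

# Throwaway demo repo with six commits on main (illustrative names only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
for i in 1 2 3 4 5 6; do
  echo "line $i" >> file.txt
  git add file.txt
  git commit -q -m "commit $i"
done

old_tip=$(git rev-parse main)                  # where main was before the rewrite
old_root=$(git rev-list --max-parents=0 main)  # original root commit
new_root=$(git rev-parse main~3)               # oldest of the 4 commits to keep

git switch -q --detach "$new_root"                  # check out the future root
git reset -q --soft "$old_root"                     # move HEAD, keep changes staged
git commit -q --amend --reuse-message="$new_root"   # squash into a single root
git rebase -q --onto @ "$new_root" main             # replay the newer commits

count=$(git rev-list --count main)             # commits now reachable from main
if git diff --quiet "$old_tip" main; then same_content=yes; else same_content=no; fi
echo "commits on main: $count, content unchanged: $same_content"
```

Recording the old tip in a variable before starting avoids having to dig it out of git reflog afterwards.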

Upvotes: 6
