Jürg W. Spaak

Reputation: 2149

Remove a file from a repository permanently, git

I have a file, "npz, species_coex.npz", that was added to my git repository by mistake. After realizing this, I removed it with git rm. I have now found out that git still knows about it. That is usually fine, but I want git to forget about it completely, as if it had never been added in the first place.

I've read about the filter-branch command, but I would prefer not to use it because of all the warnings about it. If that is not possible, please tell me.

I've read this, they recommend:

$ git filter-branch --tree-filter 'rm -f "npz, species_coex.npz" ' HEAD

I get the error:

fatal: ambiguous argument 'npz, species_coex.npz': unknown revision or path not in the working tree

I'm not sure why this problem occurs. Is it because of the blank space (which I doubt, as I put the name in quotes), or because the file is not in the current HEAD? How can I tell git where this file is to be found?

And is there a way to do this without filter-branch? I only added the file once and then removed it, so its history is quite simple.

Upvotes: 0

Views: 733

Answers (1)

Vampire

Reputation: 38734

The "problem" with filter-branch is the same as with any command that modifies the history of already pushed commits. If someone else has already fetched such a commit and has a branch based on it, they will have to fix their history manually (and so will everyone else in that situation), as described in the help of git rebase under the heading RECOVERING FROM UPSTREAM REBASE.

If you want to purge the file from the history, e.g. because it contains confidential information like passwords, you have no choice but to modify the history, no matter which tool you use for it, be it git rebase -i, git filter-branch, or the BFG tool.

With filter-branch you should not use --tree-filter here, as it checks out a full worktree for each commit. That is only necessary if you want to add or change files. If you only want to delete files, use --index-filter instead and operate on the index alone, since no worktree is checked out in that mode. Your filter command would then be something like --index-filter 'git rm --cached --ignore-unmatch "npz, species_coex.npz"'.
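
To make the --index-filter variant concrete, here is a minimal sketch that replays the situation in a throwaway repository. The temporary directory, the demo identity, the commit messages, and keep.txt are assumptions made up for the demonstration. Only the file name and the filter itself come from the question and the paragraph above.

```shell
#!/bin/sh
set -eu

# Throwaway repository so that no real history is touched (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "keep" > keep.txt
git add keep.txt
git commit -q -m "initial commit"

# Reproduce the mistake: add the file in one commit, git rm it in the next.
echo "big binary" > "npz, species_coex.npz"
git add "npz, species_coex.npz"
git commit -q -m "add file by mistake"
git rm -q "npz, species_coex.npz"
git commit -q -m "remove file again"

# Rewrite every commit reachable from HEAD, dropping the file from the index.
# --ignore-unmatch keeps git rm from failing on commits that lack the file.
# The environment variable only silences the warning banner of newer Git versions.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
    --index-filter 'git rm --cached --ignore-unmatch "npz, species_coex.npz"' HEAD

# Count the commits in the rewritten history that still touch the file.
remaining=$(git log --oneline -- "npz, species_coex.npz" | wc -l | tr -d ' ')
echo "commits still touching the file: $remaining"
```

In a real repository you would run only the filter-branch line, from the top level of the repository. Note that filter-branch keeps a backup of the old refs under refs/original/, so the old objects remain in the repository until those refs and the reflog entries are gone.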

The error you got suggests that you used git rm ... rather than rm ... in your filter command, but without --ignore-unmatch. That option tells git not to fail when you try to delete a file that does not exist, similar to what -f does (among other things) for the normal rm utility.

If you added the file only a few commits back, an interactive rebase might be easier and faster. Run git rebase -i <the commit before the one that added the file>, change pick to edit in the editor for the commit that added the file, then save and quit the editor. When Git stops at that commit, delete the file from it with git rm 'npz, species_coex.npz' && git commit --amend -C HEAD and continue the rebase with git rebase --continue. Once Git has finished, you should have a new version of your history without the file.
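
The rebase steps can be sketched the same way in a throwaway repository. Everything about the setup is an assumption for illustration: the demo commits, analysis.txt, keep.txt, and the use of GIT_SEQUENCE_EDITOR with sed to stand in for editing the todo list by hand. In this sketch the commit that added the file also contains another change, because amending a commit down to no changes at all would be refused without --allow-empty.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical setup, not the asker's real history).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "keep" > keep.txt
git add keep.txt
git commit -q -m "initial commit"
before=$(git rev-parse HEAD)   # the commit before the one that added the file

# The commit that added the file by mistake, together with a wanted change.
echo "analysis" > analysis.txt
echo "big binary" > "npz, species_coex.npz"
git add analysis.txt "npz, species_coex.npz"
git commit -q -m "add analysis, plus file by mistake"

# A later commit that removes the file again and changes something else.
echo "more" >> keep.txt
git rm -q "npz, species_coex.npz"
git add keep.txt
git commit -q -m "remove file again, update keep.txt"

# Start the interactive rebase; sed plays the role of the human editing the
# todo list and turns the first "pick" (the offending commit) into "edit".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i "$before"

# Git has stopped at the offending commit: drop the file and amend in place.
git rm -q "npz, species_coex.npz"
git commit -q --amend -C HEAD

# Replay the remaining commits on top of the amended one.
GIT_EDITOR=true git rebase --continue

remaining=$(git log --oneline -- "npz, species_coex.npz" | wc -l | tr -d ' ')
total=$(git rev-list --count HEAD)
echo "commits still touching the file: $remaining (of $total commits)"
```

When doing this by hand you would skip the GIT_SEQUENCE_EDITOR part and change pick to edit yourself in the editor that Git opens.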

Upvotes: 3
