Reputation: 1585
I'm trying to use the git filter-branch feature to remove a file that was recently updated and committed. I tried running the following command:
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch myfile' --prune-empty --tag-name-filter cat -- 6f7fda9..HEAD
However, this only removes the file from the master branch, and I want it removed from all branches.
Starting with commit 6f7fda9 to HEAD I want the file removed. Is the command I'm running wrong?
Upvotes: 6
Views: 3630
Reputation: 45659
Your requirements as stated are contradictory. Specifically
I want it removed from all branches.
and
Starting with commit 6f7fda9 to HEAD I want the file removed.
need to be reconciled. I suspect this comes down to an inaccurate understanding of commit ranges, which are only sort of a thing in Git.
Consider this commit graph:
x -- 6f7fda9 -- A -- B -- C -- F <--(master)
                 \             ^(HEAD)
                  D -- E <--(branch)
So HEAD is at master, which is at F; and there's a branch which was (apparently) created from A (after 6f7fda9 but before HEAD).
Now the question is, given this graph, what does 6f7fda9..HEAD mean? Unfortunately, the answer isn't what a lot of people intuitively think.
6f7fda9..HEAD is short for HEAD ^6f7fda9, meaning "everything reachable from HEAD but not reachable from 6f7fda9". "Reachable" means "the commit itself, and any commits you find by following parent pointers". So in this case, it means A, B, C, and F; but not x or 6f7fda9 (because they're reachable from 6f7fda9) and also not D or E (because they aren't reachable from HEAD).
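To see the range semantics concretely, here is a minimal sketch that rebuilds the graph above in a throwaway repository and counts what each revision expression selects. The variable base stands in for 6f7fda9 (real hash IDs can't be chosen in advance), and the commit helper, the temporary directory, and the identity settings are assumptions made only for this demo.

```shell
#!/bin/sh
# Throwaway repo reproducing: x -- base -- A -- B -- C -- F (master),
# with D -- E (branch) forking off at A. "base" plays the role of 6f7fda9.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the initial branch name
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

commit x
commit base; base=$(git rev-parse HEAD)
commit A
git branch branch
commit B; commit C; commit F
git checkout -q branch
commit D; commit E
git checkout -q master

# base..HEAD: reachable from HEAD, minus anything reachable from base.
range_count=$(git rev-list --count "$base"..HEAD)
# Same exclusion, but starting from every branch tip instead of only HEAD.
branches_count=$(git rev-list --count --branches "^$base")

echo "base..HEAD selects $range_count commits"
echo "--branches ^base selects $branches_count commits"
```

The first count covers A, B, C, and F; the second also picks up D and E, which is why the original command never touched the side branch.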
There are several ways to get filter-branch to process all the branches. For example, you could run:
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch myfile' --prune-empty --tag-name-filter cat -- --all
But this will include all refs (not just all branches); if that's a problem, use --branches instead:
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch myfile' --prune-empty --tag-name-filter cat -- --branches
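As a rough end-to-end check, the sketch below runs the --branches variant in a scratch repository with two branches and then confirms that no branch history still touches myfile. The repository layout, file names other than myfile, and the identity settings are made up for the demo. FILTER_BRANCH_SQUELCH_WARNING is set only to silence the deprecation notice that newer Git versions print; older versions ignore it.

```shell
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo keep > keep.txt;   git add keep.txt;  git commit -q -m "x"
echo secret > myfile;   git add myfile;    git commit -q -m "add myfile"
git branch side                              # second branch that also contains myfile
echo more >> keep.txt;  git commit -q -am "master work"
git checkout -q side
echo other > other.txt; git add other.txt; git commit -q -m "side work"
git checkout -q master

git filter-branch --force \
    --index-filter 'git rm --cached --ignore-unmatch myfile' \
    --prune-empty --tag-name-filter cat -- --branches >/dev/null 2>&1

# Count commits on any branch that still touch myfile.
# --branches is used here rather than --all because filter-branch keeps
# backups of the old tips under refs/original/, and --all would include them.
remaining=$(git rev-list --count --branches -- myfile)
echo "commits on branches still touching myfile: $remaining"
```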
One other caveat: if you specifically don't want commits before 6f7fda9 rewritten, then you need to include one or more negative commit references. But assuming you do intend to include 6f7fda9 itself, you'd exclude its parent (not the commit itself):
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch myfile' --prune-empty --tag-name-filter cat -- ^6f7fda9^ --branches
If 6f7fda9 is a merge, you'd have to list negative commit references for each of its parents.
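One possible way to build those negative references without typing each parent by hand is sketched below. It relies on git rev-list --parents, which prints a commit followed by its parents; the scratch repository, the commit helper, and the variable names merge and negatives are illustrative assumptions. The sketch only verifies the selection with git rev-list; the same words could then be passed to filter-branch after the -- separator.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

commit x
git branch side
commit m1
git checkout -q side
commit s1
git checkout -q master
git merge -q --no-ff -m "M" side            # M has two parents: m1 and s1
merge=$(git rev-parse HEAD)
commit after

# Drop the first field (the merge itself) and prefix every parent with "^".
negatives=$(git rev-list --parents -n 1 "$merge" | cut -d' ' -f2- | sed 's/ / ^/g; s/^/^/')

# $negatives is deliberately unquoted so each "^hash" becomes its own argument.
selected=$(git rev-list --count HEAD $negatives)
echo "negative references: $negatives"
echo "commits selected: $selected"
```

With both parents excluded, only the merge and the commit after it remain in the selection.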
Upvotes: 4
Reputation: 488103
I want [the file] removed from all branches
It's important to realize that branches are almost (but not quite) irrelevant. What matters are the commits.
You literally cannot change any existing commit, and Git does not try. What git filter-branch does is that it copies commits. That is, for each commit to be filtered, Git extracts the original into a temporary work area, applies your filter(s), and then makes a new commit from the result.
If the new commit is bit-for-bit identical to the original commit, it re-uses the actual underlying object in the repository database. If not—and the purpose is to result in "not"—the original commit remains, while the new copy gets a new, different hash ID. If we use uppercase letters to stand in for commit hash IDs, and remember that each commit stores the hash ID of its parent commit, we can draw the originals this way:
... <-F <-G <-H <-I <-- master
A branch name like master remembers the hash ID of the last commit. That commit remembers the hash ID of its parent, which remembers another hash ID of another parent, and so on: master lets Git find commit I, which finds commit H, which finds commit G, and so on.
With git filter-branch we tell Git: extract commit F and maybe make some change to it and then re-commit. If nothing changes in F, we stick with the actual hash ID. Then we have Git extract commit G and make some change. This time, perhaps we remove a sensitive file. So we make a new commit that's like G but different: it gets a new, different hash ID, which we can call G'. Commit G' still has commit F as its parent:
...--F--G--H--I <-- master
      \
       G'
We then extract H and apply the filter. Even if nothing else changes, we need our new commit to point back to G', so filter-branch ensures that this happens, and therefore we get a commit H' that points back to G'. We repeat for I and the result is:
...--F--G--H--I <-- master
      \
       G'-H'-I'
The final step is for git filter-branch to rewrite each of the branch names. The name master must now point to commit I', with its new and different hash, not to shabby old icky I.
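The copy-and-repoint behavior can be observed directly. The following is a small sketch, using a scratch repository whose commits are labeled F, G, and H after the drawing above (the file names and identity settings are assumptions). It compares hash IDs before and after the rewrite and also looks at the backup ref that filter-branch leaves under refs/original/.

```shell
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1     # only silences the notice on newer Git
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo keep > keep.txt;  git add keep.txt; git commit -q -m "F"
f_before=$(git rev-parse HEAD)
echo secret > myfile;  git add myfile;   git commit -q -m "G"
echo more >> keep.txt; git commit -q -am "H"
h_before=$(git rev-parse HEAD)

# No --prune-empty here, so G' is kept (as an empty commit) and the
# history stays three commits long, matching the drawing.
git filter-branch --force \
    --index-filter 'git rm --cached --ignore-unmatch myfile' \
    -- master >/dev/null 2>&1

f_after=$(git rev-parse master~2)
h_after=$(git rev-parse master)
backup=$(git rev-parse refs/original/refs/heads/master)

echo "F before: $f_before  after: $f_after"
echo "H before: $h_before  after: $h_after"
echo "backup of old master tip: $backup"
```

F never contained the file, so its copy is identical and keeps its hash; H only changed because its parent did, yet it still gets a new hash.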
The names that git filter-branch rewrites at the end of its processing are all the names you identified positively on the command line. This part is a little tricky: git filter-branch takes, as one / some of its arguments, strings that are suitable for git rev-list. These can be positive references like master, or negative references like ^develop or ^6f7fda9.
A negative reference tells Git: don't bother with these commits. If you use ^6f7fda9 to skip commit 6f7fda9 and anything "before" (graph-wise) that commit, git filter-branch will not have to spend any computer-time working on that commit.
The expression 6f7fda9..HEAD is shorthand for ^6f7fda9 HEAD, and HEAD means the current branch name. So this is a positive reference to one branch name (such as master), and one negative reference by hash ID.
You can name all your branch names with --branches. You can name all your references (including things that are not branch names) with --all. Filter-branch will only rewrite the positive references, but it will rewrite all of them. Be a bit careful with this as this can rewrite refs/stash for instance.
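Before choosing between --branches and --all, it can help to list which names each one would roughly cover. The sketch below uses git for-each-ref as an approximation (--all additionally includes HEAD); the scratch repository, the develop branch, the v1 tag, and the stash entry are all made up for illustration.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo one > file.txt; git add file.txt; git commit -q -m "first"
git branch develop
git tag v1
echo wip >> file.txt; git stash -q          # creates refs/stash

# Roughly what --branches names: everything under refs/heads/.
branch_refs=$(git for-each-ref --format='%(refname)' refs/heads)
# Roughly what --all names: every ref, including tags and the stash.
all_refs=$(git for-each-ref --format='%(refname)')

echo "--branches would cover:"; echo "$branch_refs"
echo "--all would cover:";      echo "$all_refs"
```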
When you do rewrite any branch, tag, or other name that refers to some commit that does contain the file you don't want to have, you'll get things like:
                    tip2   [abandoned]
                   /
...--good--bad--...--tip   [abandoned]
         \
          copied--...--tip'  <-- branch1
                     \
                      tip2'   <-- branch2
If you don't rewrite every name that points to any of the commits from bad on down (rightward), the names you skipped will still point to the "bad" commits that have the file you want to be rid of. (Remember that in these particular graph drawings that I do on StackOverflow, earlier / parent commits are to the left, later / child commits are to the right.)
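One way to spot names that were left behind is to ask, for each branch or tag, whether its history still contains a commit touching the file. The sketch below deliberately rewrites only master in a scratch repository and then reports the names that still reach a "bad" commit; the repository contents, the side branch, and the stale variable are assumptions for the demo.

```shell
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1     # only silences the notice on newer Git
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo keep > keep.txt;   git add keep.txt;  git commit -q -m "good"
echo secret > myfile;   git add myfile;    git commit -q -m "bad"
git branch side
echo more >> keep.txt;  git commit -q -am "tip"
git checkout -q side
echo other > other.txt; git add other.txt; git commit -q -m "tip2"
git checkout -q master

# Rewrite only master, leaving side pointing at the old commits.
git filter-branch --force \
    --index-filter 'git rm --cached --ignore-unmatch myfile' \
    --prune-empty -- master >/dev/null 2>&1

# A ref is stale if any commit reachable from it still touches myfile.
stale=$(git for-each-ref --format='%(refname)' refs/heads refs/tags |
    while read -r ref; do
        if [ -n "$(git rev-list -n 1 "$ref" -- myfile)" ]; then
            echo "$ref"
        fi
    done)

echo "names still reaching commits with myfile: $stale"
```

The backups under refs/original/ are skipped on purpose here, since they are expected to keep pointing at the old commits until you delete them.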
Upvotes: 1
Reputation: 94453
git filter-branch -- --all runs the filter on all refs, which includes every branch. So:
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch myfile' --prune-empty --tag-name-filter cat -- --all
Upvotes: 1