Reputation: 6788
I've committed a bunch of sensitive data to my local repo that has not been published yet.
The sensitive data is scattered across the project in different folders and I want to remove all these completely from git history.
All of the folders in question have the same name and sit at the same depth, each inside a different parent folder. Here is a sample of my folder structure:
root
  folder1
    ./sensitiveData
  folder2
    ./sensitiveData
  folder3
    ./sensitiveData
Using the following command, I am able to delete the folders containing sensitive data one at a time:
git filter-branch -f --index-filter 'git rm -r --cached --ignore-unmatch javascript/folder1/.sensitiveData' --prune-empty HEAD
But I want to delete all the folders containing sensitive data in one go, because there are too many of them, and I would also like to learn how this works.
However, with the following command nothing is rewritten, and I am warned that 'refs/heads/master' is unchanged:
git filter-branch -f --index-filter 'git rm -r --cached --ignore-unmatch javascript/*/.sensitiveData' --prune-empty HEAD
As I see it, there are two strategies:
Option one seems more sensible if possible.
Upvotes: 3
Views: 1151
Reputation: 6788
In the end, what solved my problem was a small bash script using a for ... in loop, which runs git filter-branch once per matching folder:
for name in javascript/*/.sensitiveData
do git filter-branch -f --index-filter "git rm -r --cached --ignore-unmatch $name" --prune-empty HEAD
done
Upvotes: -1
Reputation: 487735
Your command, when you run it, is first evaluated by your shell. So with:
'git rm -r --cached --ignore-unmatch javascript/*/.sensitiveData'
the single quotes protect the entire thing from the shell, which passes it to git filter-branch as the --index-filter to be used later. The single quotes are gone at this point.
Here's the problem: filters given to git filter-branch get evaluated at filtering time by another shell (technically, the shell that's running git filter-branch itself). This other shell evals the command:
eval $filter
So now this second shell re-interprets:
git rm -r --cached --ignore-unmatch javascript/*/.sensitiveData
It breaks up the arguments at spaces, expands the asterisk based on the current working directory, and invokes git rm -r --cached --ignore-unmatch on the result of the expansion.
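To make the second evaluation concrete, here is a minimal sketch that imitates it outside of Git. The scratch directory and the use of printf in place of git rm are assumptions made purely for illustration; the real filter runs inside git filter-branch.

```shell
# Stand-in for the filter string as the second shell receives it
# (the outer single quotes are already gone). printf replaces git rm
# so that the resulting arguments are visible.
filter='printf "%s\n" javascript/*/.sensitiveData'

# Hypothetical scratch directory in which the glob does match something.
demo=$(mktemp -d)
mkdir -p "$demo/javascript/folder1/.sensitiveData" \
         "$demo/javascript/folder2/.sensitiveData"

# eval re-parses the string: word splitting and glob expansion happen now,
# relative to whatever the current directory is at that moment.
expanded=$(cd "$demo" && eval "$filter")
printf '%s\n' "$expanded"

rm -rf "$demo"
```

Here the shell, not Git, expands the asterisk, so the command receives one argument per matching directory.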
If the expansion succeeds, one thing happens; if not, something else happens. Precisely what happens depends on the shell (bash can be configured to behave in several different ways; POSIX sh is more predictable).
The actual current working directory for an --index-filter is generally empty, so the expansion will probably fail. This should, in most cases, pass the asterisk on unchanged to Git. Since the argument to git rm is (mostly / essentially) a pathspec, Git will now do its own expansion. This should have worked, so either the path itself is wrong, or the directory is not empty, or there's something odd about your shell so that the failed expansion didn't pass the literal text javascript/*/.sensitiveData to git rm.
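The "failed expansion passes the pattern through" behaviour can be checked in isolation. This is a small sketch assuming a POSIX-style sh with default settings; the empty temporary directory stands in for the filter's working directory, and printf stands in for git rm.

```shell
# An empty directory: nothing can match the pattern here.
empty=$(mktemp -d)

# With default settings, a pattern that matches nothing is left as a
# literal word, so the command sees the asterisk itself.
arg=$(cd "$empty" && sh -c 'printf "%s\n" javascript/*/.sensitiveData')
printf '%s\n' "$arg"

rmdir "$empty"
```

In bash, options such as nullglob or failglob change this outcome (the word is dropped, or the command is not run at all), which is one way the literal pattern might never reach git rm.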
You can take some variables out of this equation by using:
'git rm -r --cached --ignore-unmatch javascript/\*/.sensitiveData'
so that the second shell sees:
git rm -r --cached --ignore-unmatch javascript/\*/.sensitiveData
which will force the second shell to pass:
javascript/*/.sensitiveData
directly to git rm. Given that this probably should have worked anyway, though, it's worth checking whether javascript/*/.sensitiveData would match the right files in the specific commit(s), which you can do, somewhat clumsily and manually, using git ls-tree -r on those commits.
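One possible way to do that check is sketched below. The helper name list_sensitive is hypothetical, and the regular expression is only an approximation of Git's own pathspec matching, so treat the output as a sanity check rather than an exact preview of what git rm would remove.

```shell
# Hypothetical helper: list the tracked files in a given commit that live
# under javascript/<one folder>/.sensitiveData/.
# $1 is any commit-ish (HEAD, a branch name, a hash).
list_sensitive() {
    git ls-tree -r --name-only "$1" |
        grep -E '^javascript/[^/]+/\.sensitiveData/'
}

# Example usage, run from inside the repository:
#   list_sensitive HEAD
```

If this prints nothing for a commit you expected to contain sensitive files, the path in the filter is probably not what is actually stored in that commit.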
Upvotes: 2