Karol Selak

Reputation: 4774

How to replace a string in whole Git history?

I have one of my passwords committed in probably a few files in my Git repo. Is there a way to automatically replace this password with some other string throughout the whole history, so that no trace of it remains? Ideally, I could write a simple Bash script that takes the string to find and the string to replace it with, and does all the work itself, something like:

./replaceStringInWholeGitHistory.sh "my_password" "xxxxxxxx"

Edit: this question is not a duplicate of that one, because I am asking about replacing strings without removing whole files.

Upvotes: 32

Views: 13437

Answers (4)

Alesh17

Reputation: 386

I also faced this problem, so I wrote a little gist for it: https://gist.github.com/Alesh17/ad73a8c7600139fc0804a0853b2f16c5

The most complex part is pushing your changes to the remote repo, so I wrote a script for that as well (force-pushing all branches and tags).

Happy coding!

Upvotes: 0

git filter-repo --replace-text

The Git 2.25 man page for git-filter-branch already clearly recommends using git filter-repo instead of git filter-branch, so here we go.

Install https://superuser.com/questions/1563034/how-do-you-install-git-filter-repo/1589985#1589985

python3 -m pip install --user git-filter-repo

and then use:

echo 'my_password==>xxxxxxxx' > replace.txt
git filter-repo --replace-text replace.txt

or equivalent with Bash magic:

git filter-repo --replace-text <(echo 'my_password==>xxxxxxxx')
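
For the command-line shape asked for in the question, the two steps above can be wrapped in a small script. This is only a minimal sketch, not part of git-filter-repo: the function names write_replace_rules and replace_in_history are made up for illustration, and it assumes the strings contain neither newlines nor "==>".

```shell
#!/bin/bash
# Hypothetical wrapper around `git filter-repo --replace-text`.
# Assumes git-filter-repo is installed and that the strings contain
# neither newlines nor "==>".

# Write "old==>new" rules, one per line, from pairs of arguments.
write_replace_rules() {
  local rules_file=$1
  shift
  : > "$rules_file"
  while [ "$#" -ge 2 ]; do
    printf '%s==>%s\n' "$1" "$2" >> "$rules_file"
    shift 2
  done
}

# Rewrite the history of the repository in the current directory.
replace_in_history() {
  local rules_file rc
  rules_file=$(mktemp) || return 1
  write_replace_rules "$rules_file" "$@"
  git filter-repo --replace-text "$rules_file"
  rc=$?
  rm -f "$rules_file"
  return "$rc"
}

# Example (run inside a fresh clone):
#   replace_in_history "my_password" "xxxxxxxx"
```

Several pairs can be passed at once, e.g. replace_in_history "d1" "asdf" "d2" "qwer", which produces the same two rules as the test strings below.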

Tested with this simple test repository: https://github.com/cirosantilli/test-git-filter-repository and replacement strings:

d1==>asdf
d2==>qwer

The above acts on all branches by default (so invasive!!!). To act only on selected branches, use --refs (see: git filter-repo: can it be used on a specific branch?), e.g.:

--refs HEAD
--refs refs/heads/master

To act only on a specified commit range, pass the range to --refs (see: How to modify only a range of commits with git filter-repo instead of the entire branch history?):

--refs HEAD~2..master
--refs HEAD~2..HEAD

The --replace-text option is documented at: https://github.com/newren/git-filter-repo/blob/7b3e714b94a6e5b9f478cb981c7f560ef3f36506/Documentation/git-filter-repo.txt#L155

--replace-text <expressions_file>::

A file with expressions that, if found, will be replaced. By default, each expression is treated as literal text, but regex: and glob: prefixes are supported. You can end the line with ==> and some replacement text to choose a replacement choice other than the default of ***REMOVED***.

How to replace in a single file: git-filter-repo replace text by expression in a single file

Of course, once you've pushed a password publicly, it is always too late, and you will have to change the password, so I wouldn't even bother with the replace in this case: Remove sensitive files and their commits from Git history

Related: How to substitute text from files in git history?

Tested on git-filter-repo ac039ecc095d.

Upvotes: 37

Karol Selak

Reputation: 4774

First, I'd like to thank ElpieKay, who posted the core commands of my solution; I have only automated them.

So, I finally have the script I wanted. I divided it into pieces that build on each other and can also serve as independent scripts. It looks like this:

censorStringsInWholeGitHistory.sh:

#!/bin/bash
# arguments are the strings to censor

for string in "$@"
do
  echo ""
  echo "================ Censoring string \"$string\": ================"
  ~/replaceStringInWholeGitHistory.sh "$string" "********"
done

usage:

~/censorStringsInWholeGitHistory.sh "my_password1" "my_password2" "some_f_word"

replaceStringInWholeGitHistory.sh:

#!/bin/bash
# $1 - string to find
# $2 - string to replace with

for branch in $(git branch | cut -c 3-); do
  echo ""
  echo ">>> Replacing strings in branch $branch:"
  echo ""
  ~/replaceStringInBranch.sh "$branch" "$1" "$2"
done

usage:

~/replaceStringInWholeGitHistory.sh "my_password" "********"

replaceStringInBranch.sh:

#!/bin/bash
# $1 - branch
# $2 - string to find
# $3 - string to replace with

git checkout "$1"
for file in $(~/findFilesContainingStringInBranch.sh "$2"); do
  echo "          Filtering file $file:"
  ~/changeStringsInFileInCurrentBranch.sh "$file" "$2" "$3"
done

usage:

~/replaceStringInBranch.sh master "my_password" "********"

findFilesContainingStringInBranch.sh:

#!/bin/bash

# $1 - string to find
# $2 - branch name or nothing (current branch in that case)

git log -S "$1" $2 --name-only --pretty=format: -- | sort -u

usage:

~/findFilesContainingStringInBranch.sh "my_password" master

changeStringsInFileInCurrentBranch.sh:

#!/bin/bash

# $1 - file name
# $2 - string to find
# $3 - string to replace

git filter-branch -f --tree-filter "if [ -f '$1' ]; then sed -i 's/$2/$3/g' '$1'; fi"

usage:

~/changeStringsInFileInCurrentBranch.sh "abc.txt" "my_password" "********"
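
Note that sed treats the search string as a regular expression and uses / as the delimiter, so a password containing characters such as /, ., * or & would not be replaced literally. One possible way around this is to escape both strings first. The helpers below (escape_sed_pattern, escape_sed_replacement, replace_literal) are hypothetical names for this sketch, and it assumes a non-empty search string without newlines:

```shell
#!/bin/bash
# Hypothetical helpers: escape arbitrary strings for use in sed's
# s/pattern/replacement/ command.

# Escape characters that are special in a basic regular expression,
# plus the / delimiter.
escape_sed_pattern() {
  printf '%s' "$1" | sed 's#[][\\/.^$*]#\\&#g'
}

# Escape characters that are special in the replacement part:
# backslash, & and the / delimiter.
escape_sed_replacement() {
  printf '%s' "$1" | sed 's#[\\/&]#\\&#g'
}

# Filter stdin to stdout, replacing every literal occurrence of $1 with $2.
replace_literal() {
  sed "s/$(escape_sed_pattern "$1")/$(escape_sed_replacement "$2")/g"
}

# Prints: password=********
printf 'password=a/b.c*d\n' | replace_literal 'a/b.c*d' '********'
```

The same escaped strings could be substituted into the sed -i call inside the tree filter.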

All of these scripts are located in my home folder, which this version requires in order to work. I'm not sure that's the best option, but for now I cannot find a better one. Of course, every script has to be executable, which we can achieve with chmod +x ~/myscript.sh.

My script is probably not optimal, and for big repos it will take a very long time, but it works :)

And, at the very end, we can push our censored repo to any remote with:

git push <remote> -f --all

Edit: important hint from ElpieKay:

Don't forget to delete and recreate tags that you have pushed. They are still pointing to the old commits that may contain your password.

Maybe I'll improve my script in the future to do this automatically.
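
To see which branches and tags still reach commits that touched the string, something along these lines may help. It is only a sketch: refs_still_containing is a hypothetical helper name, it uses the same -S search as the scripts above, and it assumes a non-empty search string and tags that point at commits.

```shell
#!/bin/bash
# Hypothetical helper: print every local branch and tag whose history
# still contains a commit that added or removed the given string.
refs_still_containing() {
  local needle=$1 ref
  git for-each-ref --format='%(refname)' refs/heads refs/tags |
  while IFS= read -r ref; do
    if [ -n "$(git log -S"$needle" --format=%H "$ref" --)" ]; then
      printf '%s\n' "$ref"
    fi
  done
}

# Example:
#   refs_still_containing "my_password"
```

Any tag printed here still points to the old commits and would need to be deleted and recreated, as described in the hint above.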

Upvotes: 5

ElpieKay

Reputation: 30868

First, find all the files that could contain the password. Suppose the password is abc123 and the branch is master. You may need to exclude those files which have abc123 only as a normal string.

git log -S "abc123" master --name-only --pretty=format: | sort -u

Then replace "abc123" with "******". Suppose one of the files is foo/bar.txt.

git filter-branch --tree-filter "if [ -f foo/bar.txt ]; then sed -i 's/abc123/******/g' foo/bar.txt; fi"

Finally, force push master to the remote repository if it exists.

git push origin -f master:master

I made a simple test and it worked, but I'm not sure whether it fits your case. You need to deal with all the files from all branches. As for the tags, you may have to delete all the old ones and create new ones.
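
As a possible shortcut for handling all branches and tags, git filter-branch can rewrite every ref in one run when given -- --all, and --tag-name-filter cat makes it move the existing tags onto the rewritten commits. The function below is a sketch under assumptions, not a tested recipe for every repository: replace_in_all_refs is a hypothetical name, it relies on GNU sed (-i without a suffix), and it expects strings without quotes or sed metacharacters.

```shell
#!/bin/bash
# Sketch: apply the same sed replacement to every branch and tag in one run.
# Assumes GNU sed and strings free of quotes and sed metacharacters.
replace_in_all_refs() {
  local file=$1 old=$2 new=$3
  # FILTER_BRANCH_SQUELCH_WARNING skips the startup warning in newer Git versions.
  FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
    --tree-filter "if [ -f '$file' ]; then sed -i 's/$old/$new/g' '$file'; fi" \
    --tag-name-filter cat \
    -- --all
}

# Example:
#   replace_in_all_refs foo/bar.txt abc123 REDACTED
```

The rewritten branches and tags still have to be force-pushed afterwards, and filter-branch keeps backups of the old refs under refs/original in the local repository.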

Upvotes: 17
