Reputation: 21950
We have a project with around 500,000 lines of code, managed with git, much of it several years old. We're about to make a series of modifications to bring the older code into conformance with the developer community's current standards and best practices with regard to naming conventions, exception handling, indentation, and so forth.
You can think of it as something between pretty-printing and low-level, mechanical refactoring.
This process is likely to touch almost every line of code in the code base (~85%), and some lines will be subject to as many as five modifications. All of the changes are intended to be semantically neutral.
Upvotes: 56
Views: 11147
Reputation: 4867
I don't know how best to deal with some of the more invasive changes you're describing, but...
Use these options to git blame and git diff to filter out the noise:
The -w option causes git to ignore changes in whitespace, so you can more easily see the real differences.
The -M and -C options make it follow renames and copies; in the case of git blame, they also follow the moving and copying of fragments of code across files.
See: explainshell.com - git diff -w -M -C
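To make the effect of -w concrete, here is a minimal sketch that builds a throwaway repository, commits a file, re-indents it in a second commit, and then compares git blame with and without -w. The author names alice and bob, the file main.c, and the temporary directory are all made up for the illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository in a temporary directory (hypothetical demo setup).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Original commit by a hypothetical author "alice".
printf 'int main() {\nreturn 0;\n}\n' > main.c
git add main.c
git -c user.name=alice -c user.email=alice@example.com commit -q -m "initial"

# Whitespace-only re-indent by a hypothetical author "bob".
printf 'int main() {\n    return 0;\n}\n' > main.c
git -c user.name=bob -c user.email=bob@example.com commit -q -am "reindent"

# Count the lines attributed to bob, first plainly, then ignoring whitespace.
plain=$(git blame --line-porcelain main.c | grep -c '^author bob' || true)
ignoring=$(git blame -w --line-porcelain main.c | grep -c '^author bob' || true)

echo "lines blamed on bob: plain=$plain, with -w=$ignoring"
```

With -w, the re-indented line should fall through to the commit that last changed its non-whitespace content, so the reformatting commit no longer hides the earlier history.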
Upvotes: 29
Reputation: 747
This question has a good solution for it. In brief, use git filter-branch.
I used this command myself:
git filter-branch --tree-filter "git diff-tree --name-only --diff-filter=AM -r --no-commit-id \$GIT_COMMIT | grep -E '\.(cpp|h)\$' | xargs ./emacs-script" HEAD
Here ./emacs-script is a script I wrote that uses Emacs to change the code style; it simply calls indent-region on each file.
This works fine as long as no files have been deleted or removed from the repository. In that situation, --ignore-unmatch may be helpful, but I'm not sure.
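To see which files the diff-tree pipeline in that command would hand to the formatting script for a given commit, here is a small sketch run against a throwaway repository. The file names and the demo identity are made up, and the grep pattern is anchored on the file extension, which is an assumption about what the filter is meant to select.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical demo setup).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Helper so the demo does not depend on any global git identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"
}

# First commit: one source file and one header.
printf 'int a;\n' > a.cpp
printf 'int b;\n' > b.h
git add .
commit "first"

# Second commit: modify a.cpp, delete b.h, add a non-C++ file.
printf 'int a2;\n' >> a.cpp
git rm -q b.h
printf 'x = 1\n' > hash.py
git add .
commit "second"

# List files added or modified (AM) in HEAD, keeping only .cpp and .h files.
selected=$(git diff-tree --name-only --diff-filter=AM -r --no-commit-id HEAD \
    | grep -E '\.(cpp|h)$')

echo "$selected"
```

The deleted header is already excluded by --diff-filter=AM, and hash.py is dropped by the extension match, so only a.cpp is left for the formatter in this commit.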
Upvotes: 0
Reputation: 77171
You will also need a mergetool that allows aggressive ignoring of whitespace. p4merge does this, and is freely downloadable.
Upvotes: 10
Reputation: 1324278
I would recommend making those evolutions one step at a time, in a central Git repo (central as in "public reference for all other repositories to follow"), with each kind of change in its own commit,
but not as "indentation-reordering-renaming-...-one giant commit".
That way, you give Git a reasonable chance to follow the changes across the refactoring modifications.
Plus, I would not accept any new merge (pulled from another repo) that has not had the same refactoring applied before its code was pushed.
If applying the format process brings any changes to the fetched code, you could reject it and ask the remote repo to conform to the new standards first (at least by pulling from your repo before pushing anything more).
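The "reject it if the formatter changes anything" check could be sketched roughly like this. The formatter here is only a stand-in (it strips trailing whitespace from tracked .c files); the function names format_tree and conforms, the demo repository, and the file a.c are all hypothetical, and a real setup would call the project's actual formatting tool instead.

```shell
#!/bin/sh
set -eu

# Stand-in formatter (an assumption): strip trailing whitespace from tracked .c files.
format_tree() {
    git ls-files '*.c' | while IFS= read -r f; do
        sed 's/[[:space:]]*$//' "$f" > "$f.tmp"
        if cmp -s "$f" "$f.tmp"; then
            rm "$f.tmp"          # already formatted, leave the file untouched
        else
            mv "$f.tmp" "$f"     # formatter changed something
        fi
    done
}

# Succeeds only if running the formatter leaves the work tree unchanged.
conforms() {
    format_tree
    if git diff --quiet; then
        return 0
    fi
    git checkout -q -- .         # undo the formatter's edits
    return 1
}

# Demo in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'int x;\n' > a.c
git add a.c
git -c user.name=demo -c user.email=demo@example.com commit -q -m "clean"
if conforms; then first=accepted; else first=rejected; fi

# A commit that violates the (stand-in) standard by adding trailing whitespace.
printf 'int y;   \n' >> a.c
git -c user.name=demo -c user.email=demo@example.com commit -q -am "trailing whitespace"
if conforms; then second=accepted; else second=rejected; fi

echo "clean commit: $first, unformatted commit: $second"
```

The same check could run before merging a fetched branch, so that only code which is already a fixed point of the formatter gets in.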
Upvotes: 13