Reputation: 4845
I have a large file with a complex history (many commits from many authors).
Refactoring it means splitting it into multiple smaller files, but I need to keep the history.
To make this concrete, let's say I have a main file containing all my code:
function a() {}
function b() {}
function c() {}
function main() {
a();
b();
c();
}
and I need to move the a and b functions to the a and b files respectively, while keeping my main function in the main file, and while keeping the history in all three files.
I found some kind of solution there, but nothing that actually works or is practical in a production environment.
Upvotes: 7
Views: 2821
Reputation: 13
The only way I have found so far to keep history (and so, presumably, to avoid extra git blame arguments) is to split the file into renamed copies that each keep the original's changes, and then to merge while keeping the changes from both sides.
git mv source source-aux # Rename the base file ("delete")
cp source-aux target1-aux # Repeat this for each target file
git add target1-aux # target2-aux target3-aux ...
# (all aux files, one per target, separated by spaces)
git commit -m 'rename original into one of the aux copies'
# Use your preferred editor (nano, vim, vscode, ...)
# to remove the unwanted sections from each aux file:
nano source-aux # Edit source/base file
nano target1-aux # Edit each target aux file
git add source-aux target1-aux # Stage the edits (plus any other aux files)
git commit -m 'clean copies'
git mv source-aux source # Rename the original/source file
# as it was initially ("restore")
git commit -m 'revert name of original file'
git checkout -b rename-targets
git mv target1 target1-ren # Do this for each target file
git commit -m 'rename original targets'
git checkout -
git mv target1-aux target1-ren # Do this for each target file
git commit -m 'rename aux targets'
git merge -m 'combine with renamed' rename-targets
As you might have noticed, merge conflicts are expected here. That is fine, because both branches have files with the same names but completely different contents.
# For the sake of this example, let's assume that each
# conflict gets resolved by just concatenating the
# changes of both branches, as follows, for each target file:
cat "target1-ren~HEAD" "target1-ren~rename-targets" > target1-ren
# Add all target files to mark their conflicts as resolved:
git add target1-ren # target2-ren target3-ren ...
# (all renamed aux files, one per
# target, separated by spaces)
git merge --continue
git mv target1-ren target1 # Do this for each target file
git commit -m 'restore original target filenames'
Upvotes: 1
Reputation: 164639
Move the code as normal. Git can help you read the history.
Use git blame -w -n -M -C -C -C. I like to alias this as archeology.
-w ignores trivial whitespace changes.
-n shows the line number in the original commit.
-M detects moved or copied lines within a file.
-C -C -C detects lines moved or copied from other files in any commit.
Similarly, use git log -w -M -C -C -C.
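The answer does not say how the archeology alias is defined, so the following is only one possible way to set it up. It uses a scratch repository (the `repo` variable is a name made up for this sketch) so that nothing in your real configuration is touched.

```shell
# Sketch: define the alias in a throwaway repository's local config.
# Replace `git -C "$repo" config` with `git config --global` to make
# the alias available in every repository.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config alias.archeology 'blame -w -n -M -C -C -C'

# Show what the alias expands to.
git -C "$repo" config --get alias.archeology
```

Once the alias exists, `git archeology path/to/file` runs the full blame command on that file.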
You can also make the archeology easier by copying the code in one commit, and changing it in the next. Then when you're reading back through the blame history you'll hit a commit that says "split up file X".
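A minimal sketch of that two-commit approach, run in a scratch repository. The file names, function bodies, and commit messages are invented for illustration, and the bodies are deliberately a few lines long because git blame only follows moved text above a minimum size.

```shell
# Scratch repository with a throwaway identity (assumed values).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# Commit 1: everything lives in main.js.
cat > main.js <<'EOF'
function a() {
  console.log("first step of the pipeline: load the input data");
  console.log("second step of the pipeline: validate the input data");
}
function main() {
  a();
}
EOF
git add main.js
git commit -q -m "add main.js containing a() and main()"
first=$(git rev-parse HEAD)

# Commit 2: move a() into a.js without changing a single line.
head -n 4 main.js > a.js
tail -n 3 main.js > main.tmp && mv main.tmp main.js
git add a.js main.js
git commit -q -m "split up main.js: move a() into a.js"

# Commit 3: only now start changing the moved code.
echo 'module.exports = a;' >> a.js
git commit -q -am "export a() from a.js"

# Count the lines of a.js that blame traces back to commit 1.
blame=$(git blame -w -M -C -C -C --line-porcelain a.js)
moved=$(printf '%s\n' "$blame" | grep -c "^$first ")
echo "lines of a.js traced to the original commit: $moved"
```

Running plain `git blame a.js` in the same repository is a useful comparison: without the -C flags, the moved lines are credited to the "split up main.js" commit instead.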
Ultimately, you spend orders of magnitude more time changing the code than doing code archeology. It doesn't make sense to optimize your development process for code archeology. Instead, change the code as needed and use Git more effectively. And if, in the end, the archeology is a little more difficult, that's fine; it's better than making development more difficult.
Sooner than you'd think, especially if you embrace change as a normal part of development, nobody will care where the original lines came from.
Upvotes: 7