Reputation: 5862
I have a git repo with several directories, and a single file, MyFile.ext.
/
LargeDir1/
LargeDir2/
LargeDir3/
.
.
.
MyFile.ext
I'd like to start a new repo with just MyFile.ext in it, and keep all the history pertaining to it, but ignore everything else (all the LargeDirs). How can I do this?
For directories, I've successfully used this answer, but I tried that on a single file, and it doesn't work.
I've also tried this answer, which does delete everything except my file, but it also seems to leave all the history around.
Upvotes: 16
Views: 2380
Reputation: 11916
Git now recommends using git filter-repo instead (you get a message about it when using filter-branch). Another answer on one of the questions you linked has a long explanation, but here's a short example.
To remove everything except src/README.md and move it to the root:
pip install git-filter-repo
# Must use a fresh clone to avoid losing local history.
git clone --no-local project extracted
cd extracted/
git filter-repo --path src/README.md
git filter-repo --subdirectory-filter src/
--path selects the single file, and --subdirectory-filter moves the contents of that directory to the root. I can't find a way to do this in a single pass, but the second pass is much faster since the first eliminates most of the history.
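As a rough sanity check after filtering, you can list every path that still appears anywhere in the rewritten history. This is only a sketch using plain git; the function name paths_in_history and the throwaway demo repo are made up for illustration and are not part of git-filter-repo.

```shell
#!/bin/sh
set -eu

# Print every path touched by any commit reachable from any ref.
# -m makes merge commits report their changed paths too.
paths_in_history() {
    git -C "$1" log --all -m --name-only --format= | sed '/^$/d' | sort -u
}

# Demo on a throwaway repository (all names here are hypothetical).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
demo=$(mktemp -d)
git init -q "$demo"
mkdir "$demo/src"
echo hello > "$demo/src/README.md"
echo junk > "$demo/src/other.txt"
git -C "$demo" add .
git -C "$demo" commit -q -m 'initial'

# Before any filtering, both files show up in the history.
remaining=$(paths_in_history "$demo")
echo "$remaining"
```

Run against the filtered clone instead of the demo repo, this should print only the file you kept.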
Upvotes: 0
Reputation: 43495
Use git fast-export.
First, export the history of the file to a fast-import stream. Make sure you do this on the master branch.
cd oldrepo
git fast-export HEAD -- MyFile.ext >../myfile.fi
Then you create a new repo and import.
cd ..
mkdir newrepo
cd newrepo
git init
git fast-import <../myfile.fi
git checkout
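To see the effect end to end, here is a small sketch that builds a disposable stand-in for the old repo and runs the same export and import on it. The directory and file contents are invented for the demo, and it inspects the result through --all so it doesn't depend on the default branch name.

```shell
#!/bin/sh
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the old repo: three commits, only two of which touch MyFile.ext.
git init -q oldrepo
cd oldrepo
mkdir LargeDir1
echo v1 > MyFile.ext
echo big > LargeDir1/data.bin
git add .
git commit -q -m 'add everything'
echo bigger > LargeDir1/data.bin
git commit -q -am 'touch only LargeDir1'
echo v2 > MyFile.ext
git commit -q -am 'update MyFile.ext'

# Export only the history of MyFile.ext, then import it into a fresh repo.
git fast-export HEAD -- MyFile.ext > ../myfile.fi
cd ..
git init -q newrepo
cd newrepo
git fast-import --quiet < ../myfile.fi

# Inspect what arrived: number of commits and every path they touch.
commits=$(git rev-list --all --count)
paths=$(git log --all -m --name-only --format= | sed '/^$/d' | sort -u)
echo "commits kept: $commits"
echo "paths in history: $paths"
```

The working tree of the new repo is still empty at this point, which is what the final git checkout step above takes care of.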
Upvotes: 24
Reputation: 164739
Cloning can be done normally with git clone. That works fine on a local directory, as in git clone /path/to/the/repo. Then change into the clone and remove the remote pointing back to the original.
git clone /path/to/the/repo
cd repo
git remote rm origin
Then use git filter-branch to filter out everything but that one file. This is easiest to accomplish with an index filter that deletes all files and then restores just the one.
git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- YOURFILENAME
An index filter runs against the index of each individual commit, without checking the files out to the working tree. filter-branch runs this command on every commit and then recommits the result. The command first removes everything from the index, then restores that one file to its state in that commit. $GIT_COMMIT is the commit being rewritten. YOURFILENAME is the file you want to keep.
If you're doing all branches and tags with --all, add a tag filter which ensures the tags are rewritten. That's as simple as --tag-name-filter cat. It will not change the content of the tags, but it will ensure they're moved to the rewritten commits.
Finally, you'll want --prune-empty to remove any now-empty commits that didn't involve that file. There will be a lot of them.
Here it is all together.
git filter-branch \
--index-filter 'git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- YOURFILENAME' \
--tag-name-filter cat \
--prune-empty \
-- --all
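If you want to try this before touching a real clone, the following sketch runs the same command on a disposable repo. The repo layout, the v1.0 tag, and the file contents are invented for the demo, and every commit in it contains MyFile.ext, so it says nothing about histories where the file is missing from some commits.

```shell
#!/bin/sh
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Skip the filter-branch deprecation notice and the pause that comes with it.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

# Three commits; the middle one never touches MyFile.ext.
mkdir LargeDir1
echo v1 > MyFile.ext
echo big > LargeDir1/data.bin
git add .
git commit -q -m 'add everything'
echo bigger > LargeDir1/data.bin
git commit -q -am 'touch only LargeDir1'
echo v2 > MyFile.ext
git commit -q -am 'update MyFile.ext'
git tag v1.0

# Same filter as above, with MyFile.ext as the file to keep.
git filter-branch \
    --index-filter 'git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- MyFile.ext' \
    --tag-name-filter cat \
    --prune-empty \
    -- --all

commits=$(git rev-list --count HEAD)
files=$(git ls-tree -r --name-only HEAD)
backups=$(git for-each-ref --format='%(refname)' refs/original/)
echo "commits kept: $commits"
echo "files at HEAD: $files"
echo "backup refs: $backups"
```

Note the last line of output: filter-branch keeps the pre-rewrite refs under refs/original/, so the old commits stay reachable in that repository until those refs are removed.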
Upvotes: 0
Reputation: 1094
I had this same issue and I finally figured it out. I had an old old directory of scripts - so old, they had originally been under RCS control. Years ago, I made it into a git repo (without really knowing what I was doing) and I converted the RCS log and updated the git log. But I picked up development of one of the scripts and decided it needed its own repo. The various solutions out there (subtree and filter-branch) depend on the part you're splitting out being a directory. You can put the file in a directory and split it out that way, but you don't get the revision history with it. So here's how I figured out how to extract the revision history of a single file and create a new repo with it:
Create a brand new repo [I did it at the same level as the source repo]:
git init <new-repo>
Now go into your source repo and create a file that we're going to use later to cherry-pick the file's commits:
cd <source-repo>
git log --reverse <target-file.ext> | \
grep ^commit | cut -d ' ' -f 2 | cut -c 1-7 | \
perl -ne 'print("pick $_")' > ../commits-to-keep.txt
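The same pick list can probably be produced without grep, cut, and perl by letting git log format each line itself. This is a sketch of that idea, not what I originally ran; the function name build_pick_list and the demo repo are made up for illustration. Note that %h prints git's default abbreviated hash, which may be longer than the 7 characters cut above.

```shell
#!/bin/sh
set -eu

# Print one "pick <abbreviated-hash>" line per commit touching the given file,
# oldest commit first.
build_pick_list() {
    git log --reverse --format='pick %h' -- "$1"
}

# Demo on a throwaway repository (all names here are hypothetical).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"
git init -q source-repo
cd source-repo
echo v1 > target-file.ext
echo other > unrelated.txt
git add .
git commit -q -m 'add both files'
echo changed > unrelated.txt
git commit -q -am 'touch only the unrelated file'
echo v2 > target-file.ext
git commit -q -am 'update target file'

list="$work/commits-to-keep.txt"
build_pick_list target-file.ext > "$list"
cat "$list"
```

In the demo, the commit that only touches unrelated.txt is left out of the list, since git log is limited to the target file.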
Create a temporary branch and push it to your new repo (then delete it)
git checkout -b tmpbranch
git push ../<new-repo> tmpbranch
git checkout master
git branch -d tmpbranch
Now go to your new repo and create an empty commit off of which we will rebase:
cd ../<new-repo>
git commit --allow-empty -m 'root commit'
git rebase --onto master --root tmpbranch -i
[The only manual step] In the editor that comes up from the last command above, remove all the contents and paste in the contents of the file you created earlier: ../commits-to-keep.txt
Now you can switch back to the master branch, merge, and then clean up the temporary branch:
git checkout master
git merge tmpbranch
git branch -d tmpbranch
The only drawback here is that you end up with the extra empty root commit. I found that there are ways to remove it, but for my purposes, this was good enough.
Upvotes: 0