Kris Harper

Reputation: 5862

How can I split a single file from a git repo into a new repo?

I have a git repo with several directories, and a single file, MyFile.ext.

/
  LargeDir1/
  LargeDir2/
  LargeDir3/
      .
      .
      .
  MyFile.ext

I'd like to start a new repo with just MyFile.ext in it, and keep all the history pertaining to it, but ignore everything else (all the LargeDirs). How can I do this?

For directories, I've successfully used this answer, but I tried that on a single file, and it doesn't work.

I've also tried this answer, which does delete everything except my file, but it also seems to leave all the history around.

Upvotes: 16

Views: 2380

Answers (4)

idbrii

Reputation: 11916

Git now recommends using git filter-repo instead (you get a message about it when using filter-branch). Another answer on one of the questions you linked has a long explanation, but here's a short example.

To remove everything except src/README.md and move it to the root:

pip install git-filter-repo
# Must use a fresh clone to avoid losing local history.
git clone --no-local project extracted
cd extracted/
git filter-repo --path src/README.md
git filter-repo --subdirectory-filter src/

--path selects the single file, and --subdirectory-filter moves the contents of that directory to the root. I can't find a way to do this in a single pass, but the second pass is much faster since the first eliminates most of the history.
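For what it's worth, a single pass may be possible by combining --path with filter-repo's --path-rename option. The sketch below is an untested-in-your-setup illustration on a throwaway repository: the names project, extracted, and docs/notes.txt are made up for the demo, and the whole thing is skipped when git-filter-repo is not installed.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Build a small throwaway repo with the file we want plus unrelated content.
git init -q project
cd project
git config user.name "Example User"
git config user.email "user@example.com"
mkdir src docs
echo "hello" > src/README.md
echo "notes" > docs/notes.txt
git add .
git commit -q -m "Add README and notes"
echo "more" >> src/README.md
git commit -q -am "Update README"
cd ..

kept_files=""
if git filter-repo --version >/dev/null 2>&1; then
    # filter-repo wants a fresh clone, as in the commands above.
    git clone -q --no-local project extracted
    cd extracted
    # Keep only src/README.md and rename it to the root in the same pass.
    git filter-repo --quiet --path src/README.md \
        --path-rename src/README.md:README.md
    # List every path that still appears anywhere in the rewritten history.
    kept_files=$(git log --name-only --format= | sed '/^$/d' | sort -u)
    echo "$kept_files"
else
    echo "git-filter-repo is not installed; skipping the demo" >&2
fi
```

If this behaves as expected, the only path listed at the end is README.md.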

Upvotes: 0

Roland Smith

Reputation: 43495

Use git fast-export.

First you export the history of the file to a fast-import stream. Make sure you do this on the master branch.

cd oldrepo
git fast-export HEAD -- MyFile.ext >../myfile.fi

Then you create a new repo and import.

cd ..
mkdir newrepo
cd newrepo
git init
git fast-import <../myfile.fi
git checkout
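As a rough end-to-end sketch of the export and import steps above, the following script builds a throwaway repository and checks what survives. The names LargeDir1/data.bin and the commit messages are made up for the demo, and it assumes both repositories are created with the same default branch name (which is the case when they are initialised on the same machine).

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Build a throwaway "old" repo: one file to keep, one directory to drop.
git init -q oldrepo
cd oldrepo
git config user.name "Example User"
git config user.email "user@example.com"
mkdir LargeDir1
echo "one" > MyFile.ext
echo "big" > LargeDir1/data.bin
git add .
git commit -q -m "Add MyFile.ext and LargeDir1"
echo "bigger" >> LargeDir1/data.bin
git commit -q -am "Touch only LargeDir1"
echo "two" >> MyFile.ext
git commit -q -am "Update MyFile.ext"

# Export only the history that touches MyFile.ext.
git fast-export HEAD -- MyFile.ext > ../myfile.fi

# Import that stream into a fresh repo and populate the working tree.
cd ..
git init -q newrepo
cd newrepo
git fast-import --quiet < ../myfile.fi
git checkout -q

# Inspect the result: number of commits and every path in the history.
commit_count=$(git rev-list --count HEAD)
kept_files=$(git log --name-only --format= | sed '/^$/d' | sort -u)
echo "commits: $commit_count"
echo "files: $kept_files"
```

In this demo the commit that only touched LargeDir1 is dropped, so the new repo should end up with two commits that mention nothing but MyFile.ext.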

Upvotes: 24

Schwern

Reputation: 164739

  1. Clone the repo.
  2. Filter out everything but that one file.

Cloning can be done normally with git clone. That works fine on a local directory, like git clone /path/to/the/repo. Then remove the remote that points back to the original repo.

git clone /path/to/the/repo
# The clone lands in a directory named after the last path component.
cd repo
git remote rm origin

Then use git filter-branch to filter out everything but that one file. This is easiest to accomplish with an index filter that deletes all files and then restores just the one.

git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- YOURFILENAME

An index filter runs your command against the index of each commit being rewritten, without checking out the working tree, and then recommits the result. The command first removes every file from the index, then restores that one file to its state in that commit. $GIT_COMMIT is the commit being rewritten. YOURFILENAME is the file you want to keep.

If you're doing all branches and tags with --all, add a tag filter which ensures the tags are rewritten. That's as simple as --tag-name-filter cat. It will not change the content of the tags, but it will ensure they're moved to the rewritten commits.

Finally, you'll want --prune-empty to remove any now empty commits that didn't involve that file. There will be a lot of them.

Here it is all together.

git filter-branch \
    --index-filter 'git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- YOURFILENAME' \
    --tag-name-filter cat \
    --prune-empty \
    -- --all
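To see the combined command in action, here is a small sketch that runs it on a throwaway repository. The file names, the v1 tag, and the commit messages are invented for the demo, and FILTER_BRANCH_SQUELCH_WARNING is set only to skip the deprecation notice that newer git versions print.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
work=$(mktemp -d)
cd "$work"

# Build a throwaway repo: one file to keep, one directory to drop, one tag.
git init -q repo
cd repo
git config user.name "Example User"
git config user.email "user@example.com"
mkdir LargeDir1
echo "one" > MyFile.ext
echo "big" > LargeDir1/data.bin
git add .
git commit -q -m "Add MyFile.ext and LargeDir1"
git tag v1
echo "bigger" >> LargeDir1/data.bin
git commit -q -am "Touch only LargeDir1"
echo "two" >> MyFile.ext
git commit -q -am "Update MyFile.ext"

# Rewrite every branch and tag so that only MyFile.ext remains.
git filter-branch \
    --index-filter 'git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- MyFile.ext' \
    --tag-name-filter cat \
    --prune-empty \
    -- --all >/dev/null

# Inspect the rewritten branch. HEAD is used rather than --all because
# filter-branch keeps backups of the old refs under refs/original/.
commit_count=$(git rev-list --count HEAD)
kept_files=$(git log --name-only --format= HEAD | sed '/^$/d' | sort -u)
tag_commit=$(git rev-parse v1)
first_commit=$(git rev-parse HEAD~1)
echo "commits: $commit_count"
echo "files: $kept_files"
```

Note that the old history stays reachable through refs/original/ until those backup refs are deleted, which may be why a filtered repo can look like it still has all the history around.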

Upvotes: 0

hepcat72

Reputation: 1094

I had this same issue and I finally figured it out. I had a very old directory of scripts, so old that they had originally been under RCS control. Years ago, I made it into a git repo (without really knowing what I was doing), converted the RCS log, and updated the git log. But I picked up development of one of the scripts and decided it needed its own repo. The various solutions out there (subtree and filter-branch) depend on the part you're splitting out being a directory. You can put the file in a directory and split it out that way, but you don't get the revision history with it. So here's how I extracted the revision history of a single file and created a new repo with it:

  1. Create a brand new repo [I did it at the same level as the source-repo]

    git init <new-repo>
    
  2. Now go into your source repo and create a file that we're going to use later to cherry-pick the file's commits:

    cd <source-repo>
    # One "pick <abbreviated-hash>" line per commit that touched the file, oldest first.
    git log --reverse --format='pick %h' -- <target-file.ext> \
        > ../commits-to-keep.txt
    
  3. Create a temporary branch and push it to your new repo (then delete it)

    git checkout -b tmpbranch
    git push ../<new-repo> tmpbranch
    git checkout master
    git branch -d tmpbranch
    
  4. Now go to your new repo and create an empty commit off of which we will rebase:

     cd ../<new-repo>
     git commit --allow-empty -m 'root commit'
     git rebase --onto master --root tmpbranch -i
    
  5. [The only manual step] In the editor that comes up from the last command above, remove all the contents and paste in the contents of the file you created earlier: ../commits-to-keep.txt

  6. Now you can switch back to the master branch, merge, and then clean up the temporary branch:

     git checkout master
     git merge tmpbranch
     git branch -d tmpbranch
    

The only drawback here is that you end up with the extra empty root commit. I found that there are ways to remove it, but for my purposes, this was good enough.
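The manual paste in step 5 can probably be scripted by pointing GIT_SEQUENCE_EDITOR at a command that overwrites the rebase todo list. The sketch below walks through all six steps on a throwaway repository. The names script.sh and other.txt are invented for the demo, and it only covers the simple case where the commits touching the kept file do not depend on changes to other files.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway source repo: script.sh is the file to keep.
git init -q source-repo
cd source-repo
git config user.name "Example User"
git config user.email "user@example.com"
echo "echo v1" > script.sh
git add script.sh
git commit -q -m "Add script.sh"
echo "unrelated" > other.txt
git add other.txt
git commit -q -m "Add other.txt"
echo "echo v2" > script.sh
git commit -q -am "Update script.sh"

# Step 2: one "pick <hash>" line per commit that touched script.sh.
git log --reverse --format='pick %h' -- script.sh > "$work/commits-to-keep.txt"

# Steps 1 and 3: create the new repo and push the full history to a temp branch.
git init -q "$work/new-repo"
git push -q "$work/new-repo" HEAD:refs/heads/tmpbranch

# Steps 4 and 5: empty root commit, then rebase with the prepared todo list.
cd "$work/new-repo"
git config user.name "Example User"
git config user.email "user@example.com"
main_branch=$(git symbolic-ref --short HEAD)
git commit -q --allow-empty -m 'root commit'
# The "editor" simply copies our list over the todo file git passes to it.
GIT_SEQUENCE_EDITOR="cp $work/commits-to-keep.txt" \
    git rebase -q -i --onto "$main_branch" --root tmpbranch

# Step 6: fast-forward the main branch and drop the temporary one.
git checkout -q "$main_branch"
git merge -q tmpbranch
git branch -q -d tmpbranch

commit_count=$(git rev-list --count HEAD)
kept_files=$(git log --name-only --format= | sed '/^$/d' | sort -u)
echo "commits: $commit_count"
echo "files: $kept_files"
```

Here the result should be three commits (the empty root plus the two that touched script.sh), with script.sh as the only path in the history.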

Upvotes: 0
