AxD

Reputation: 3158

How can I convert SVN to Git while splitting one huge repository into separate repositories?

I'd like to migrate our SVN repository to git.

Our current repository is a huge singleton pile comprising a number of Visual Studio solutions, all residing in separate sub directories of the repository.

When transforming it to git I'd like to split the SVN repository into separate git repositories for each solution while maintaining each solution's history at the same time.

I don't want the history of the whole SVN repository in all of our future git repositories. All I want in these future git repositories is the history of a particular sub directory.

Is this possible?


Current SVN repository file structure:

svn_base
   |-- Solution1
   |   |-- 1.cs
   |   |-- 1.csproj
   |   |-- 1.sln
   |-- Solution2
   |   |-- 2.cs
   |   |-- 2.csproj
   |   |-- 2.sln
   |-- Solution3
   |   |-- 3.cs
   |   |-- 3.csproj
   |   |-- 3.sln

Desired git repository file structure:

Solution1
   |-- .git
   |-- 1.cs
   |-- 1.csproj
   |-- 1.sln

Solution2
   |-- .git
   |-- 2.cs
   |-- 2.csproj
   |-- 2.sln


Solution3
   |-- .git
   |-- 3.cs
   |-- 3.csproj
   |-- 3.sln

Upvotes: 2

Views: 1441

Answers (2)

Elias Holzmann

Reputation: 3639

While the answer @acran gave does solve the problem, it is also possible and sometimes advantageous to first convert the SVN repository to Git and to then split the big monorepo into multiple smaller repositories.

1. Converting SVN to Git

If your SVN repository has a standard layout (subdirectories branches, tags and trunk) and you don't need any other bells and whistles, this is quite easy:

$ git svn clone --stdlayout <url_to_subversion_repo>

This command has two gotchas:

  1. git svn uses the SVN login as the Git author name and, by default, generates a mail address of the form <svn_user>@<repository UUID>. If this is not what you want, you can use an authors file. Add a file user_mapping.txt mapping SVN users to Git users:
    svn_user_1 = Git User 1 <[email protected]>
    svn_user_2 = Git User 2 <[email protected]>
    
    And then call git svn clone with this file:
    $ git svn clone --stdlayout --authors-file=user_mapping.txt <url_to_subversion_repo>
    
  2. As SVN tags can change, git svn imports them as Git branches. If you want, you can convert them.
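
For the authors file from gotcha 1, a starting point can be generated from the SVN log instead of being typed by hand. This is only a sketch: the function name svn_log_to_authors and the @example.com addresses are placeholders I made up, and the real names and addresses still have to be filled in manually. It is worth listing every user, since git svn stops when it meets an SVN author that is missing from the file.

```shell
# Read `svn log --quiet` output on stdin and print one authors-file line
# per distinct SVN user. Name and mail address are placeholders to edit.
svn_log_to_authors() {
    # Revision lines look like:
    #   r42 | svn_user_1 | 2020-01-01 12:00:00 +0000 (Wed, 01 Jan 2020)
    awk -F ' [|] ' '/^r[0-9]+ [|] / { print $2 " = " $2 " <" $2 "@example.com>" }' |
        sort -u
}

# Hypothetical usage (needs the svn client):
#   svn log --quiet <url_to_subversion_repo> | svn_log_to_authors > user_mapping.txt
```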

git svn clone fetches every revision of your SVN repo in order from the SVN server, so a big repository will take a while (in my experience, multiple hours for ~50,000 revisions, though this was years ago and I don't remember exactly). If possible, you may want to run this command on the SVN server, especially if you have a slow connection to it. Either way, go grab a cup of coffee (or five).
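
Once the clone has finished, the tag branches from gotcha 2 can be turned into real Git tags. A minimal sketch, assuming the tags ended up under refs/remotes/origin/tags/ (what a recent git svn does for a standard layout, as far as I know; check with git branch -r first). The function name is made up, and it creates lightweight tags only.

```shell
# Create a lightweight Git tag for every SVN tag branch imported by git svn.
# Run inside the converted repository.
svn_tags_to_git_tags() {
    git for-each-ref --format='%(refname:short)' refs/remotes/origin/tags |
    while read -r ref; do
        # ref looks like origin/tags/v1.0; strip the prefix for the tag name.
        git tag "${ref#origin/tags/}" "refs/remotes/$ref"
    done
}

# Hypothetical usage:
#   cd <converted_repo> && svn_tags_to_git_tags
```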

2. Splitting the Git repository

There are multiple tools to split up Git repositories into sub repositories. See for example this question. When I did this a few years ago, I used git filter-branch, but this tool is now deprecated. You may still use it, or you may use git filter-repo, though I don't have any experience with that tool.

The most upvoted answer to the question I linked uses git subtree split. I suggest not following that answer, as git subtree split only converts one branch, in effect removing all other branches from your sub repositories.
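
For what it's worth, with git filter-repo the split could look roughly like the sketch below. I have not run this on a converted SVN repository, so treat it as a starting point: split_solution is a made-up helper name, git filter-repo has to be installed separately, and a plain git clone only copies local branches, so any SVN branches you want to keep must exist as local branches in the monorepo first.

```shell
# split_solution MONOREPO NAME
# Clone MONOREPO into ./NAME and keep only the history of the subdirectory
# NAME, which becomes the root of the new repository.
split_solution() {
    monorepo=$1
    name=$2
    # filter-repo wants a fresh clone; --no-local forces a real copy.
    git clone --no-local "$monorepo" "$name" &&
        git -C "$name" filter-repo --subdirectory-filter "$name"
}

# Hypothetical usage, once per solution:
#   split_solution /path/to/monorepo Solution1
```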

Advantages

What are the advantages of this answer over converting every sub repository via git svn clone?

  • You only need to clone the SVN repository once. This is probably faster than cloning the sub folders for each project (though I have not tested this, it is only an educated guess).
  • Cloning a SVN repository with the standard layout is better tested than cloning a SVN repository with a non-standard layout. In my experience, git svn does not always do what you want it to, so a more standard usage is probably more likely to result in what you want.
  • If you want to rewrite the history of your new Git repositories (for example, to remove big binary files), you can rewrite the history of the monorepo between step one and step two. It would be a bigger effort to do this for every new sub repository.
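
As an illustration of the last point, removing big files from the monorepo before splitting might be done with git filter-repo, assuming it is installed. The helper name and the 10M threshold are my own examples, not something from the question, and filter-repo expects to work on a fresh clone, so run it on a copy of the converted repository.

```shell
# strip_large_blobs REPO MAX_SIZE
# Rewrite the history of REPO, dropping every blob larger than MAX_SIZE
# (for example 10M). This changes all affected commit hashes.
strip_large_blobs() {
    repo=$1
    max_size=$2
    git -C "$repo" filter-repo --strip-blobs-bigger-than "$max_size"
}

# Hypothetical usage:
#   strip_large_blobs ./monorepo-copy 10M
```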

Upvotes: 6

acran

Reputation: 8933

If your projects are neatly separated into their own subdirectories, this should be quite straightforward using the --trunk parameter of git svn init/git svn clone:

git svn clone --trunk=Solution1 $SVN_URI ./Solution1

This will clone only the history of the subfolder Solution1 into a new git repository in the directory ./Solution1. It will only include commits that touch files in that subfolder, and it will adjust the relative paths so that the subfolder is the root directory of the new git repository.
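
To repeat this for every solution, the command can be wrapped in a small loop. This is just a sketch: clone_solutions is a name I picked for illustration, and the solution names are the ones from the question.

```shell
# clone_solutions SVN_URI NAME...
# Run one `git svn clone --trunk=NAME` per solution directory, creating a
# separate git repository ./NAME for each. Stops at the first failure.
clone_solutions() {
    svn_uri=$1
    shift
    for name in "$@"; do
        git svn clone --trunk="$name" "$svn_uri" "./$name" || return 1
    done
}

# Hypothetical usage:
#   clone_solutions "$SVN_URI" Solution1 Solution2 Solution3
```

Each call walks the SVN history again, so expect the total run time to grow with the number of solutions.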

Upvotes: 2
