If you want to get to the actual question, scroll to the bottom of the question. I just felt it necessary to explain the circumstances.
In our company we have, for historical reasons, several version control systems. Currently we are trying to move to any git-fast-import-compatible distributed version control system, really, but our pick is Mercurial at the moment. I say at the moment, because once you have taken that step, it's easier to migrate from one DVCS to another in most cases.
We have essentially three code bases that we want to join plus a part that has been committed into one SVN repository, which we want to separate out.
So we have:
The huge repo (2.) contains snapshots of the state of the CVS repo (1.) at different points in time. Obviously none have been tagged in the CVS repo, because that'd be potentially useful. In addition, the snapshots have patches applied on top of that snapshot state.
This is to say that a subfolder hierarchy in 2. corresponds roughly to 1. However, there is no need to worry about it, as the idea is to retire either one of those folders after initially splicing them under distinct path names. So no naming clashes are to be expected here.
reposurgeon as my tool of choice. This is a very powerful tool allowing, indeed, surgical operations on git-fast-import streams. I warmly recommend it to anybody tasked with similar migrations.
git-fast-import stream, btw)
cvs-fast-export, now maintained by Eric S. Raymond, also the author of reposurgeon. I have also contemplated conversion to SVN, just to find that the toolset (cvs2svn) used to do that has been extended to export to Mercurial as well.
While the SVN conversions took a long time to get to the point where we can call it done, the CVS conversion is still in progress.
Since CVS doesn't have a repository-wide revision history all tools have to attempt to parse the RCS files and make sense of their contents to piece together the puzzle.
Some of the really bad scars I was able to remove manually by literally editing the locked RCS file in an editor (after taking backups). This way some invalid revisions (RCS and CVS have a different idea of what is a valid revision number) as well as symbols that appeared as tags in some files and as branches in others have been weeded out.
I am also able to preprocess the (CVS) repository to remove a lot of the branches and tags which we do not need, prior to the branches we are interested in (rcsfile.py from rcsgrep helped). Basically, prior to that certain point, we only want the contents of MAIN/trunk/default/master, whatever you want to call it.
However, some of the tools outright fail (e.g. cvs-fast-export crashes) and others give results that are somewhat mangled.
Not too bad, one can demangle a lot by means of reposurgeon. However, half a dozen branches never even make it to the converted repository.
The reason appears to be in all cases that all tools get confused by a particular peculiarity you wouldn't find in SVN, for example.
If branch tags get "moved" forcibly (cvs tag -B), then the originally allocated branch number in the RCS file gets orphaned and another new branch number will take its place. However, the old revisions remain in the file.
Now the new branch started perhaps hours, days or months after the original branching took place. This appears to be what upsets all those tools.
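To make the orphaning concrete, here is a minimal Python sketch of how such leftovers could be spotted in a single RCS file. It is an illustration under assumptions, not what any of the tools mentioned do: the function name and the SAMPLE text are made up, and the regex-based parsing is a simplification that ignores @-quoted strings. It relies only on the fact that CVS stores a branch symbol as a "magic" number such as 1.2.0.4, while the revisions on that branch are numbered 1.2.4.1, 1.2.4.2 and so on.

```python
import re

def find_orphaned_branches(rcs_text):
    """Return branch numbers that carry revisions but have no symbol."""
    named = set()
    # Admin header: "symbols NAME:REV NAME:REV ...;"
    header = re.search(r"^symbols\b([^;]*);", rcs_text, re.M)
    if header:
        for pair in header.group(1).split():
            _name, _, rev = pair.partition(":")
            parts = rev.split(".")
            if len(parts) >= 4 and len(parts) % 2 == 0 and parts[-2] == "0":
                # CVS "magic" branch number, e.g. 1.2.0.4 -> branch 1.2.4
                named.add(".".join(parts[:-2] + parts[-1:]))
            elif len(parts) % 2 == 1:
                # Plain RCS branch number, e.g. a vendor branch 1.1.1
                named.add(rev)
    used = set()
    # Delta nodes: a revision number on its own line, followed by "date ..."
    for rev in re.findall(r"^(\d+(?:\.\d+)+)\s*\ndate\s", rcs_text, re.M):
        parts = rev.split(".")
        if len(parts) >= 4:
            # 1.2.4.1 lives on branch 1.2.4
            used.add(".".join(parts[:-1]))
    return sorted(used - named)

# Made-up RCS fragment: REL_BRANCH now points at 1.2.4, while the
# revisions of the earlier branch 1.2.2 are still present in the file.
SAMPLE = (
    "head\t1.2;\naccess;\nsymbols\n\tREL_BRANCH:1.2.0.4\n\tV1_0:1.1;\n"
    "locks; strict;\n\n"
    "1.2\ndate\t2004.01.02.00.00.00;\tauthor a;\tstate Exp;\n"
    "branches\n\t1.2.2.1\n\t1.2.4.1;\nnext\t1.1;\n\n"
    "1.1\ndate\t2004.01.01.00.00.00;\tauthor a;\tstate Exp;\n"
    "branches;\nnext\t;\n\n"
    "1.2.2.1\ndate\t2004.01.03.00.00.00;\tauthor a;\tstate Exp;\n"
    "branches;\nnext\t;\n\n"
    "1.2.4.1\ndate\t2004.03.01.00.00.00;\tauthor a;\tstate Exp;\n"
    "branches;\nnext\t;\n"
)

print(find_orphaned_branches(SAMPLE))
```

Running something like this over every ,v file would at least give a list of the files and branch numbers affected, which could then be cleaned up by hand or left alone.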
While it would be cool to also include the orphaned branches and mend those "wounds", it's not a priority. Most of the files treated with cvs tag -B are not source files, but files like GNUmakefile or other project files.
However, the problem remains that the CVS conversion isn't finished and will take some more time.
And managers grow impatient ...
Is it possible to start out with the two SVN repositories spliced into a single Hg repository and later (when the CVS conversion is finished) splice in those changes without having to initialize yet another unrelated Hg repo?
The (CVS repo) splicing would not cause conflicting paths, I have to say up front. The other repository is meant to be spliced in via its own subdirectory, so no name clashes.
I know that pushes and pulls can introduce commits from two years ago into someone's repository today. However, does this mean that a hg transplant would be likely to succeed as well? I.e. can I expect to be able to transplant those commits from a decade ago into the joint Hg repository?
This way I could split the migration into stages.
Is this technically feasible by means of hg transplant (or any other hg extensions for that matter)?
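For concreteness, the later stage I have in mind would look roughly like the Python sketch below. It is only a sketch under assumptions: the repository paths and the changeset ID are hypothetical placeholders, the helper names are mine, and whether the transplant applies cleanly is exactly what I am asking about. It uses the transplant extension's --source and --log options and enables the extension on the command line.

```python
import subprocess

def transplant_command(source_repo, revs, log=True):
    """Build the argv for transplanting `revs` out of `source_repo`."""
    cmd = [
        "hg",
        "--config", "extensions.transplant=",  # enable the bundled extension
        "transplant",
        "--source", source_repo,
    ]
    if log:
        # Append the original changeset ID to each transplanted log message
        cmd.append("--log")
    cmd.extend(revs)
    return cmd

def run_in(target_repo, cmd):
    """Run `cmd` inside `target_repo`, raising if hg reports a failure."""
    return subprocess.run(cmd, cwd=target_repo, check=True)

# Hypothetical source path and changeset ID, for illustration only
command = transplant_command("../cvs-converted", ["1a2b3c4d5e6f"])
print(" ".join(command))
# Stage two would then be: run_in("../joint-hg-repo", command)
```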
If it is, I'll appreciate any advice about potential caveats as well.