Reputation: 21319
I am trying to salvage code from a CVS repo. I am using `reposurgeon` for the purpose, and I have tried the following tools to get myself a `git-fast-import` stream:
- `cvs-fast-export`, which errors out (alleged cyclic branch, but doesn't provide details)
- `cvs2git` followed by `git-fast-export`, which mashes up things beyond comprehension
- `git-cvsimport` followed by `git-fast-export`, which creates the best results so far, but also ends up throwing stuff on branches that they don't belong on.
This CVS repo has been run on a variety of CVS versions, and tags and branches have been forcibly moved. I know that this means I cannot salvage those branches and tags anymore. But so be it.
Nevertheless I have half a dozen branches (out of many, many more), plus `MAIN`, which I am interested in retaining during conversion into a `git-fast-import` stream. My target VCS is not Git, but the point is that `reposurgeon` handles its input this way and outputs this way, too.
In order to make sense of the artifacts and clean out as much of the old stuff (including orphaned revisions) as possible in a pre-processing stage by means of `rcs -o<rev>` (of course on a copy of my repo ;)), I need to understand how the innards of the `rcsfile` format work.
Parsing is a piece of cake after modifying the `rcsfile.py` module from `rcsgrep`. But that doesn't yet provide me with any information about what the revision numbers mean, especially those without a corresponding delta+log.
According to the RCS files man page, there shouldn't be a case where the third segment of a revision ID is 0. Yet I see exactly that condition.
Here is what I did (as an experiment).
- `MAIN`: commit a file (`1.1`)
- `MAIN`: branch to `BranchX` (`1.1`)
- `BranchX`: change the file (`1.1.2.1`)
- `BranchX`: change the file again (`1.1.2.2`)
- `MAIN`: change the file (`1.2`)
- `MAIN`: tag the file `foobar` (`1.2`)
- `MAIN`: branch to `BranchX`, moving the branch tag (`1.2`), effectively orphaning the previous branch at `1.1.2.x`
- `BranchX`: delete the file (`1.2.2.1`)
- `MAIN`: change the file (`1.3`)
- `MAIN`: forcibly tag the file `foobar` (`1.3`)
- `MAIN`: change the file (`1.4`)
- `MAIN`: tag the file `foobarbaz` (`1.4`)
As you can see in the list above and also in the fully reproduced file below, there is no revision `1.2.0.2` in the form of a delta with log.
If I branch off revision `x.y` freshly (no file changes!), the resulting revision ID is `x.y.0.2`. That is similar to the mysterious revision ID I am seeing and asking about.
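- Does a third segment of `0` indicate that the file doesn't have deltas, such that I have to go back to its ancestor for the actual contents?
- Does the `0` simply indicate the "root" of that branch, with the fourth segment being the latest revision on that branch?
Can anyone shed light on these questions or point to more comprehensive material than the above linked man page?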
Below is the full RCS file:
head 1.4;
access;
symbols
foobarbaz:1.3
foobar:1.4
BranchX:1.2.0.2;
locks; strict;
comment @# @;
1.4
date 2014.12.11.13.46.46; author username; state Exp;
branches;
next 1.3;
1.3
date 2014.12.11.13.44.49; author username; state Exp;
branches;
next 1.2;
1.2
date 2014.12.11.13.39.31; author username; state Exp;
branches
1.2.2.1;
next 1.1;
1.1
date 2014.12.11.13.31.41; author username; state Exp;
branches
1.1.2.1;
next ;
1.1.2.1
date 2014.12.11.13.34.36; author username; state Exp;
branches;
next 1.1.2.2;
1.1.2.2
date 2014.12.11.13.35.08; author username; state Exp;
branches;
next ;
1.2.2.1
date 2014.12.11.13.42.32; author username; state dead;
branches;
next ;
desc
@@
1.4
log
@Change on MAIN
@
text
@NOTE: this file will be removed!
Another change on MAIN@
1.3
log
@Change on MAIN
@
text
@d3 1
a3 1
ANother change on MAIN@
1.2
log
@Change on MAIN
@
text
@d3 1
a3 1
File on MAIN will be forcibly tagged X again ... how does this affect the rev ID?@
1.2.2.1
log
@Removing the two files from X
@
text
@@
1.1
log
@Adding the experiment file
@
text
@d3 1
a3 1
Introducing file on MAIN@
1.1.2.1
log
@Changing the file on the X branch
@
text
@d3 1
a3 1
Changing on X branch@
1.1.2.2
log
@Another change on the X branch
@
text
@d3 1
a3 1
Another change on the X branch@
Upvotes: 1
Views: 666
Reputation: 21319
Okay, turns out the answer to this is buried deep down in the CVS source code.
For starters, here are the important files if you are looking at the CVS source tree:
- `src/rcs.c`
- `src/rcs.h`
- `doc/RCSFILES`
In addition to that you have the `rcsfile(5)` man page. And don't forget to use `grep` to the utmost extent (unless you have something more sophisticated at your disposal, that is).
- Branch numbers have the form `x.y.z`, e.g. `1.1.2`, which is a branch off of revision `1.1`.
- Virtual revision numbers have the form `x.y.0.z`, e.g. `1.1.0.2`, where `0` is a magic value defined as `RCS_MAGIC_BRANCH` in the CVS code. Note that no delta will ever have the third segment set to `0`, as these are "virtual revision numbers".
- `z` (third segment of a branch number, fourth of a virtual revision number) will only ever be an even number equal to or greater than two:
  - `assert((z >= 2) && (z % 2 == 0))`
  - `1` is also reserved for vendor branches as per the comment in `rcs.h` (see below).
- To find virtual revision numbers, look at the `symbols` list in the admin section of the RCS file (e.g. via `rlog -h <file>`, if you don't want to parse it) for revisions which have the second-to-last segment set to `0`. That is, you have a revision that would match the (PCRE) regular expression `(?:\d+\.\d+\.)+0\.\d+` (hope I got that right).
`rcs.h`
CVS reserves all even-numbered branches for its own use. "magic" branches (see `rcs.c`) are contained as virtual revision numbers (within symbolic tags only) off the `RCS_MAGIC_BRANCH`, which is `0`. CVS also reserves the ".1" branch for vendor revisions. So, if you do your own branching, you should limit your use to odd branch numbers starting at `3`.
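The numbering rules quoted above can be turned into a few small checks. This is only a sketch based on the shapes described in this answer, not a port of the CVS code; the names `classify`, `magic_to_branch` and `is_cvs_branch_number` are made up for illustration.

```python
MAGIC_BRANCH = "0"  # the value this answer gives for RCS_MAGIC_BRANCH


def classify(number: str) -> str:
    """Classify a dotted RCS/CVS number by its shape alone."""
    parts = number.split(".")
    if not all(p.isdigit() for p in parts):
        raise ValueError(f"not a dotted number: {number!r}")
    even_length = len(parts) % 2 == 0
    if even_length and len(parts) >= 4 and parts[-2] == MAGIC_BRANCH:
        return "magic-branch"  # x.y.0.z, appears only in the symbols list
    if even_length:
        return "revision"      # x.y or x.y.z.w, the shape a delta can have
    return "branch"            # x.y.z, a branch number


def magic_to_branch(number: str) -> str:
    """Turn x.y.0.z into the branch number x.y.z that it reserves."""
    if classify(number) != "magic-branch":
        raise ValueError(f"not a magic branch number: {number!r}")
    parts = number.split(".")
    return ".".join(parts[:-2] + parts[-1:])


def is_cvs_branch_number(branch: str) -> bool:
    """Check the last segment against assert((z >= 2) && (z % 2 == 0))."""
    z = int(branch.split(".")[-1])
    return z >= 2 and z % 2 == 0
```

For the sample file in the question, `magic_to_branch("1.2.0.2")` gives `1.2.2`, which is the branch that the dead revision `1.2.2.1` lives on.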
Interesting functions using `RCS_MAGIC_BRANCH` are `RCS_tag2rev()` and `RCS_gettag()`.
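Outside of CVS, a rough way to spot such tags is to read the `symbols` list straight from the admin section and apply the regular expression mentioned above. The following is a minimal sketch, assuming the raw text of the RCS file is already in memory; `parse_symbols` and `magic_branch_tags` are hypothetical helper names, and this is not how `RCS_tag2rev()` is implemented.

```python
import re

# The regular expression from the text, applied to the whole number.
MAGIC_RE = re.compile(r"(?:\d+\.\d+\.)+0\.\d+")


def parse_symbols(admin_text: str) -> dict[str, str]:
    """Extract the symbols list (tag name -> number) from RCS admin text."""
    match = re.search(r"\bsymbols\b(.*?);", admin_text, re.DOTALL)
    if match is None:
        return {}
    # Entries look like "name:number" and are separated by whitespace.
    return dict(entry.split(":", 1) for entry in match.group(1).split())


def magic_branch_tags(symbols: dict[str, str]) -> dict[str, str]:
    """Map each branch tag to the real branch number it reserves."""
    result = {}
    for name, number in symbols.items():
        if MAGIC_RE.fullmatch(number):
            parts = number.split(".")
            result[name] = ".".join(parts[:-2] + parts[-1:])  # drop the 0
    return result


# Admin section of the sample file from the question.
ADMIN = """head 1.4;
access;
symbols
\tfoobarbaz:1.3
\tfoobar:1.4
\tBranchX:1.2.0.2;
locks; strict;
"""

symbols = parse_symbols(ADMIN)
print(magic_branch_tags(symbols))
```

Plain tags such as `foobar` do not match the expression, so only `BranchX` is reported as a branch tag.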
`rcs.c`
Comment on `RCS_magicrev()`:
Return a "magic" revision as a virtual branch off of `REV` for the RCS file. A "magic" revision is one which is unique in the RCS file. By unique, I mean we return a revision which:
- has a branch of `0` (see `rcs.h` `RCS_MAGIC_BRANCH`)
- has a revision component which is not an existing branch off `REV`
- has a revision component which is not an existing magic revision
- is an even-numbered revision, to avoid conflicts with vendor branches
The first point is what makes it "magic".
As an example, if we pass in `1.37` as `REV`, we will look for an existing branch called `1.37.2`. If it did not exist, we would look for an existing symbolic tag with a numeric part equal to `1.37.0.2`. If that didn't exist, then we know that the `1.37.2` branch can be reserved by creating a symbolic tag with `1.37.0.2` as the numeric part.
[...]
Note: We assume that REV is an RCS revision and not a branch number.
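The lookup described in that comment can be sketched in a few lines. This follows the wording of the comment only and is not taken from the CVS implementation; `magic_rev`, `branches` (the set of branch numbers that already exist in the file) and `symbols` (tag name to number) are names assumed for illustration.

```python
def magic_rev(rev: str, branches: set[str], symbols: dict[str, str]) -> str:
    """Return the first free magic number off `rev`, e.g. 1.37 -> 1.37.0.2.

    `rev` must be a revision such as 1.37, not a branch number.
    """
    taken = set(symbols.values())
    z = 2  # even numbers only, to stay clear of vendor branches
    while True:
        branch = f"{rev}.{z}"    # real branch number, e.g. 1.37.2
        magic = f"{rev}.0.{z}"   # its reservation, e.g. 1.37.0.2
        if branch not in branches and magic not in taken:
            return magic
        z += 2


# Nothing exists yet, so the first even number is free.
print(magic_rev("1.37", set(), {}))
# Branch 1.2.2 exists and BranchX already reserves 1.2.0.2.
print(magic_rev("1.2", {"1.2.2"}, {"BranchX": "1.2.0.2"}))
```

The returned number only becomes visible in the file once a symbolic tag is created with it as the numeric part, which is why it shows up in `symbols` without a matching delta.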
Does a third segment of `0` indicate that the file doesn't have deltas, such that I have to go back to its ancestor for the actual contents?
When that segment is `0`, the revision number is a virtual revision number used to make a "reservation" for a branch number.
Does the `0` simply indicate the "root" of that branch, with the fourth segment being the latest revision on that branch?
Upvotes: 1