kynan

Reputation: 13643

How to migrate large bzr project with many branches to git and filter history

After several years of using bazaar (and launchpad), we're planning to migrate the FEniCS project over to Git. We have a few requirements for this migration, which make it rather complex:

  1. We want to preserve history, but filter it and strip out a list of (now obsolete) files from history to get the repository size down.
  2. There are many feature branches from many independent contributors (currently 76 branches from 25 different people). We'd like to give them an easy migration path (doable by a git novice) for getting their branches into the converted and filtered repository.

There's a solution for 1.

I'm taking DOLFIN as an example:

Import the bzr trunk:

git init dolfin && cd dolfin
(cd path/to/bzr/trunk && bzr fast-export --plain) | git fast-import

Filter history:

git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch ${files_to_strip}" --prune-empty

There's also a solution for 2.

It requires exporting marks files for both the bzr fast-export (dolfin.marks.bzr) and the git fast-import (dolfin.marks.git) steps above. We could make those available to contributors so that they can import their feature branches like this:

(cd path/to/bzr/branch && bzr fast-export --marks=path/to/dolfin.marks.bzr --git-branch="$(bzr nick)") | \
git fast-import --import-marks=path/to/dolfin.marks.git --export-marks=path/to/dolfin.marks.git

However, this recipe breaks down once we filter the branch: filtering changes the SHA1 hashes of all the trunk commits, so the marks files no longer point at commits that exist in the filtered repository.
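
To make the failure concrete, here is a minimal sketch that uses git on both sides as a stand-in for the bzr export (so it runs without bzr). The repository names `src` and `dst`, the file names, and the marks file `dst.marks` are all made up for illustration. It only shows that the commit ID recorded in the marks file disappears from the branch after filtering.

```shell
#!/bin/sh
# Sketch: a marks file records commit IDs, so rewriting history leaves it stale.
# git stands in for bzr on the export side; every name here is hypothetical.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # silence the advisory newer git prints
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# A tiny source history containing one file we later want to strip.
git init -q src
(
  cd src
  echo code > keep.txt
  echo blob > obsolete.bin
  git add .
  git commit -q -m 'initial'
)

# Import into a fresh repository and record mark -> SHA1 pairs on the git side.
git init -q dst
git -C src fast-export --all |
  git -C dst fast-import --quiet --export-marks="$work/dst.marks"
# Check out the imported files so the work tree is clean before filtering.
git -C dst reset -q --hard

before=$(git -C dst rev-parse HEAD)
if grep -q " $before\$" dst.marks; then
  echo "before filtering: HEAD is listed in the marks file"
fi

# Strip the obsolete file from history, as in the filter step above.
(
  cd dst
  git filter-branch --index-filter \
    'git rm -rf --cached --ignore-unmatch obsolete.bin' --prune-empty >/dev/null
)

after=$(git -C dst rev-parse HEAD)
if grep -q " $after\$" dst.marks; then
  echo "after filtering: HEAD is still listed in the marks file"
else
  echo "after filtering: HEAD is no longer listed in the marks file"
fi
```

A later incremental import that passes this marks file to `git fast-import --import-marks` would therefore attach new commits to the unfiltered IDs rather than to the rewritten trunk.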

But there is no solution for 1. and 2. combined.

So the question is: Is there a recipe that reliably satisfies both requirements 1. and 2.?

Note that this should ideally also work for complex cases like feature branches that have had the trunk merged back in (even several times): the parent of these merges coming from the trunk should be correctly identified (as they are in 2.).

Upvotes: 3

Views: 408

Answers (1)

jelmer

Reputation: 2450

Unfortunately, there is currently no way to satisfy both 1 and 2 using marks files.

If you skip the marks files (just don't generate them), have every user run the full conversion themselves, and make sure the filtering is applied identically for all users, then you should end up with the same SHA1s and thus the same common history everywhere.
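
A rough way to convince yourself of the filtering half of that claim, sketched with git only so it runs without bzr: two "contributors" (the hypothetical `alice` and `bob`) each take their own copy of the same history and run the same filter command, and the resulting commit IDs are compared. The repository and file names are invented for the demo, and it assumes the import step before it already produced identical commits for everyone.

```shell
#!/bin/sh
# Sketch: the same filter run on two independent copies of the same history
# should give identical commit IDs. All names below are hypothetical.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # silence the advisory newer git prints

# Fixed identities and dates keep the starting history reproducible.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
export GIT_AUTHOR_DATE='2012-01-01 00:00:00 +0000'
export GIT_COMMITTER_DATE='2012-01-01 00:00:00 +0000'

work=$(mktemp -d)
cd "$work"

# Stand-in for the imported, unfiltered trunk.
git init -q trunk
(
  cd trunk
  echo code > keep.txt
  echo blob > obsolete.bin
  git add .
  git commit -q -m 'initial'
  echo more >> keep.txt
  git commit -q -am 'second'
)

# Every contributor must use exactly the same list of files to strip.
files_to_strip='obsolete.bin'

for user in alice bob; do
  git clone -q trunk "$user"
  (
    cd "$user"
    git filter-branch --index-filter \
      "git rm -rf --cached --ignore-unmatch ${files_to_strip}" \
      --prune-empty >/dev/null
  )
done

sha_alice=$(git -C alice rev-parse HEAD)
sha_bob=$(git -C bob rev-parse HEAD)
echo "alice: $sha_alice"
echo "bob:   $sha_bob"
```

In practice this suggests shipping the exact conversion and filter commands, including the file list, as one script, so that nobody filters with a slightly different list.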

Upvotes: 1
