Paul Nathan

Reputation: 40319

Backing up a filesystem containing hg repos

Is it possible to back up a filesystem containing many Mercurial repositories (e.g., by running rsync over the filesystem) and end up with a backup in an inconsistent state?

The repositories are served over ssh and handle only this set of requests: {push, pull, in, out, clone}. 'hg commit' is never run on them directly (it has a known race condition).

Upvotes: 1

Views: 1027

Answers (3)

Martin Geisler

Reputation: 73788

Mark Drago is correct that Mercurial writes its own files in a careful order to maintain integrity. However, this is only integrity with regard to other Mercurial clients. The locking design in Mercurial allows one Mercurial process to create a new commit by writing files in this order:

  1. filelogs (holds compressed deltas for all revisions of a given file)
  2. manifest (has pointers back to the filelogs associated with a given changeset)
  3. changelog (has metadata and a pointer back to the manifest for the changeset)

while other Mercurial processes will read the files in this order

  1. changelog
  2. manifest
  3. filelogs

The reader will thus not see a reference to the new filelog data since the changelog is updated last in an atomic operation (a rename, which POSIX requires to be atomic).

A backup program will not know the correct order to read the Mercurial files and so it might read a filelog before it was updated by Mercurial and then read a manifest after it was updated:

  1. rsync reads .hg/store/data/foo.i
  2. hg writes .hg/store/data/foo.i
  3. hg writes .hg/store/00manifest.i
  4. hg writes .hg/store/00changelog.i
  5. rsync reads .hg/store/00manifest.i
  6. rsync reads .hg/store/00changelog.i

The result is a backup with a changelog that points to a manifest that points to a filelog revision that does not exist --- a corrupt repository. Running hg verify on such a repository will detect this situation:

checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
 foo@1: f57bae649f6e in manifests not found
1 files, 2 changesets, 1 total revisions
1 integrity errors encountered!
(first damaged changeset appears to be 1)

This tells you that the manifest of revision 1 refers to revision f57bae649f6e of the file foo, which cannot be found. It is possible to repair this situation by making a clone that excludes the bad revision 1:

$ hg clone -r 0 . ../repo-fixed
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
updating to branch default
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ cd ../repo-fixed
$ hg verify
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
1 files, 1 changesets, 1 total revisions

So, all in all, it is not that bad if you use a general backup program to back up your Mercurial repositories. Just be aware that you might have to repair a broken repository after you restore it from backup. The changeset you lose will most likely still be on the developer's machine, and they can push it again after you repair the restored repository. The Mercurial wiki has more information on repairing repository corruption.
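After a restore, it helps to know which repositories need that repair. The following is a minimal sketch, assuming the restored tree lives under a single root directory and that hg is on the PATH. The function names and the HG override variable are made up for illustration; only "hg verify" and the global -R option come from Mercurial itself.

```shell
#!/bin/sh
# Sketch: run "hg verify" on every repository under a restored backup tree.
# HG is a hypothetical override for the Mercurial binary; defaults to "hg".
HG=${HG:-hg}

# Print each directory under $1 that contains a .hg directory.
find_hg_repos() {
    find "$1" -type d -name .hg -prune -print | sed 's|/\.hg$||' | sort
}

# Verify every repository under $1 and report the result per repository.
# Returns 0 if all repositories pass, 1 if at least one fails.
verify_restored_repos() {
    failed=0
    list=$(find_hg_repos "$1")
    while IFS= read -r repo; do
        [ -n "$repo" ] || continue
        # hg verify exits non-zero when it finds integrity errors.
        if "$HG" verify -R "$repo" </dev/null >/dev/null 2>&1; then
            echo "ok      $repo"
        else
            echo "DAMAGED $repo"
            failed=1
        fi
    done <<EOF
$list
EOF
    return $failed
}
```

Calling verify_restored_repos /path/to/restored/tree lists every repository found; the ones marked DAMAGED are candidates for the clone-based repair shown above.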

The completely safe way to back up a repository is of course to use hg clone, but it might not be practical to integrate this with a general backup strategy.
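If a clone-based backup does fit, it could look roughly like the sketch below: clone each repository on the first run and pull into the existing backup on later runs. This assumes all repositories sit under one source root and that the backup mirrors the same relative paths. The function names and the HG variable are hypothetical; "hg clone --noupdate", "hg pull", -q and -R are standard Mercurial options.

```shell
#!/bin/sh
# Sketch: back up repositories through Mercurial itself instead of copying files.
# HG is a hypothetical override for the Mercurial binary; defaults to "hg".
HG=${HG:-hg}

# Back up one repository: clone on the first run, pull on later runs.
backup_repo() {
    src=$1
    dest=$2
    if [ -d "$dest/.hg" ]; then
        "$HG" pull -q -R "$dest" "$src"
    else
        mkdir -p "$(dirname "$dest")"
        # --noupdate skips the working copy; the backup only needs history.
        "$HG" clone -q --noupdate "$src" "$dest"
    fi
}

# Back up every repository under SRC_ROOT ($1) to the same relative
# path under DEST_ROOT ($2). Stops at the first failure.
backup_all_repos() {
    src_root=${1%/}
    dest_root=${2%/}
    list=$(find "$src_root" -type d -name .hg -prune -print | sed 's|/\.hg$||')
    while IFS= read -r repo; do
        [ -n "$repo" ] || continue
        rel=${repo#"$src_root"}
        backup_repo "$repo" "$dest_root$rel" </dev/null || return 1
    done <<EOF
$list
EOF
}
```

Because Mercurial reads the source in its own safe order, each backup is consistent even while pushes are in flight. Note that a clone carries history only: per-repository configuration such as .hg/hgrc is not copied, so it would need to be backed up separately.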

Upvotes: 3

Mark Drago

Reputation: 2066

The short answer is: You can copy (cp, rsync, etc.) a Mercurial repository without problems.

The longer answer is: https://www.mercurial-scm.org/wiki/Presentations?action=AttachFile&do=get&target=ols-mercurial-paper.pdf (in particular section 5, sub-heading "Committing Changes").

Mercurial writes out changes in an order that makes it safe for any other process to read a Mercurial repository at any time. If you copy a repository to some other location while a change is being made to it, you'll get some of the new data, but Mercurial is smart enough to ignore partially written commits. When you use the copy as a Mercurial repository, you will either see the new commit or not; there will not be any corruption.

Upvotes: 0

zerkms

Reputation: 255005

Why not just "back it up" with hg clone? ;-)

Upvotes: 0
