Clay Nichols

Reputation: 12139

Delete older .svn files from repository

My .svn repository is getting quite large (5 GB) and we really don't need to go back quite so far with it (6 months or a year is fine).

I also have an 8 GB .svn folder at the root of the directory that is checked out of the repository.

I would even settle for "starting over" and keeping a copy of the old SVN around for 6 months or a year and then eventually deleting it, per "How to backup and restore all the source code in svn?"

Upvotes: 1

Views: 292

Answers (4)

Warren Young

Reputation: 42333

If you just want to start over, I would go about it this way:

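  1. Check out tip-of-trunk without any .svn files (if the repository uses the standard trunk/branches/tags layout, append /trunk to the URL, otherwise the export also pulls in every branch and tag):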

    $ svn export file:///path/to/current/repository old-trunk
    
  2. Weed anything out of that checkout that you don't want to be in the new repository. As others have commented, you probably have a lot of large binary files in the repo currently that really don't belong there.

    You might find my pigs script helpful in that hunt:

     #!/bin/sh
     du -skL "$@" -- * | sort -n
    
  3. Create a new repo from that clean tip checkout:

    $ svnadmin create /path/to/new/clean/repository
    $ svn import old-trunk file:///path/to/new/clean/repository \
      -m "Tip of old repo trunk as of 2015.04.14, r12345"
    
  4. Move your old checkouts aside temporarily, then make fresh checkouts from the new clean repository. Keep the old checkouts until you are certain you have what you need. Even if you also keep the old repository, it is good to have at least one known-working checkout of it.
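
For step 2, a recursive variant of the same hunt can help when the large files are buried several directories deep. This is only a minimal sketch, not part of the original pigs script: the function name big_files and the 10 MB default threshold are assumptions for illustration, and the k size suffix for find is a GNU/BSD extension rather than strict POSIX.

```shell
#!/bin/sh
# big_files DIR [MIN_KB]
# List regular files under DIR larger than MIN_KB kilobytes (default 10240,
# roughly 10 MB), biggest first, each prefixed with its disk usage in KB.
# Hypothetical helper: the name and the default threshold are illustrative.
big_files() {
    dir=$1
    min_kb=${2:-10240}
    # find selects by file size; du -k reports the space each match occupies.
    find "$dir" -type f -size +"${min_kb}"k -exec du -k {} + | sort -rn
}

# Example: big_files old-trunk 5120
```

Run against the exported tree, this shows at a glance which files are worth deleting before the import in step 3.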

Upvotes: 0

bahrep

Reputation: 30662

It looks like you are confusing your local working copy with a repository, so it's unclear what exactly you are asking about.

If you use a Subversion 1.7 or newer working copy, it should contain only one .svn directory, at its root. .svn is an administrative directory and you should never touch it manually. In fact, it does not contain the full revision history, as you seem to expect. Quoting SVNBook:

The files in the administrative directory help Subversion recognize which of your versioned files contain unpublished changes, and which files are out of date with respect to others' work.

I guess the fact that the .svn directory takes 8 GB means that you checked out the whole repository. Did you? And do you really need a working copy of the whole repository? Usually you should check out only trunk or a branch of a project stored in a repository; such a working copy will be much smaller. @David provides a great summary of this in his answer.
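
One way to answer the "Did you?" question is to compare the URL the working copy was checked out from with the repository root, both of which svn info reports. The following is a rough sketch under that assumption; the function name wc_scope is made up for illustration.

```shell
#!/bin/sh
# wc_scope WC_DIR
# Print the URL a working copy was checked out from and the repository root,
# and say so when the two are identical (the whole repository is checked out).
# Hypothetical helper: the name is illustrative only.
wc_scope() {
    # LC_ALL=C keeps the field labels in English so sed can match them.
    info=$(LC_ALL=C svn info "$1") || return 1
    url=$(printf '%s\n' "$info" | sed -n 's/^URL: //p')
    root=$(printf '%s\n' "$info" | sed -n 's/^Repository Root: //p')
    echo "Checked out from: $url"
    echo "Repository root:  $root"
    if [ "$url" = "$root" ]; then
        echo "This working copy covers the entire repository."
    fi
}

# Example: wc_scope /path/to/working/copy
```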

Upvotes: 0

Bryan Shaw

Reputation: 106

One option would be to use the svnadmin tool's dump command (as demonstrated in your link), but give it a start revision of the point at which you are willing to cut off data. This will cause that start revision to be dumped as if it were the addition of a new tree (i.e. all files in full as of that revision). This gives you a record of the most recent X months of committed revisions. You could use the --deltas option to reduce the size of the dump file. See http://svnbook.red-bean.com/en/1.7/svn.ref.svnadmin.c.dump.html.

You could then create a new repository and feed this dump file into it via the load command to have a new repository with just the most recent data in it that you want.
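
As a rough sketch of that dump-and-load sequence (the function name trim_repo, the dump file location, and the example revision number are assumptions, not anything from the question):

```shell
#!/bin/sh
# trim_repo OLD_REPO NEW_REPO START_REV
# Dump revisions START_REV..HEAD of OLD_REPO and load them into a freshly
# created NEW_REPO. The dump file is kept next to the new repository.
# Hypothetical helper: the name and the dump file path are illustrative.
trim_repo() {
    old=$1
    new=$2
    start=$3
    dumpfile=$new.dump
    # Without --incremental, START_REV is written out as a full tree.
    svnadmin dump "$old" -q -r "$start":HEAD --deltas > "$dumpfile" || return 1
    svnadmin create "$new" || return 1
    svnadmin load -q "$new" < "$dumpfile"
}

# Example: trim_repo /path/to/current/repository /path/to/new/repository 12000
```

Note that the loaded revisions are renumbered from 1 in the new repository, so old revision numbers in commit messages or bug trackers will no longer line up.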

Personally I don't recommend this, as you never know when that older data may come in handy, but I don't know your exact situation and this is one way to accomplish what I think you are asking for.

Upvotes: 1

David W.

Reputation: 107040

What do you mean by your .svn repository?

That .svn folder is mainly used to manage the checked out version and has absolutely nothing to do with the history of your repository server.

The .svn directory contains information like what files on the client changed, who did the checkout, and the URL. It also keeps a pristine copy of every checked-out file (in a .svn folder per directory before Subversion 1.7, in a single .svn at the working copy root since then). This way, you can do a diff to see the changes you made without talking to the server. That means if you check out 100 MB of files, your .svn directory will be about 100 MB too.

If you're talking about the client, you only need to check out the part of the URL that you need to work on. For example, let's say you have the standard Subversion repository setup like this:

  • http://%REPO_URL%/trunk
  • http://%REPO_URL%/tags
  • http://%REPO_URL%/branches

Under trunk, you have all of your projects:

  • http://%REPO_URL%/trunk/project_foo
  • http://%REPO_URL%/trunk/project_bar
  • http://%REPO_URL%/trunk/project_fubar

I don't have to check out http://%REPO_URL%/trunk if I'm only working in project_foo. I certainly don't want to check out http://%REPO_URL%, which will give me my entire repository including all branches and tags fully checked out. (And I've seen people who have done this.)
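
A minimal sketch of such a narrow checkout, assuming the trunk/project layout described above (the function name checkout_project is hypothetical):

```shell
#!/bin/sh
# checkout_project REPO_URL PROJECT [DEST]
# Check out a single project from under trunk instead of the whole repository.
# DEST defaults to a directory named after the project.
# Hypothetical helper: the name is illustrative only.
checkout_project() {
    repo_url=$1
    project=$2
    dest=${3:-$project}
    svn checkout -q "$repo_url/trunk/$project" "$dest"
}

# Example: checkout_project http://%REPO_URL% project_foo
```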

A Subversion client doesn't check out the entire repository, but just a single version of the project. If you check out only what you need, you could have a repository that's hundreds of terabytes in size, while your working copy probably isn't over a gigabyte.

One issue I've seen is people checking in binary code -- either third party libraries or compiled code. This code should not be part of your repository. If you use Java, use Maven, Gradle, or Ant with Ivy to manage these third party libraries and your own built objects that your project might use. If you use .NET, use NuGet to do the same.

Subversion stores files in a diff format. If one version differs from another by a single line, only that line change is stored in Subversion. Although a source change might be a single line, it could have major repercussions in the built file. It isn't unusual for binary files to take up over 90% of the space in a Subversion repository. That is, a repository that should be about 500 megabytes in size would swell to over 50 gigabytes because of the binary files.

Even worse, binary files quickly become obsolete, and Subversion has no easy way of removing the obsolete version. Besides, there are no tools in Subversion that can help you analyze your binaries. Diffing between two binary versions is meaningless. The author has no relevance except that's who built and checked in the version -- not necessarily the person who should be contacted about any questions (which is a nice way of saying the blame).

I hope this answers your question. Check out only what you need, and your .svn directory will be that much smaller. Don't store binary files in Subversion, and your .svn directory won't have to reference them. If these don't help, look into sparse checkouts, which can eliminate tracking files you don't need.
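
For the sparse checkout route, a rough sketch could look like this. It assumes the trunk/project layout from earlier in this answer and Subversion 1.5 or newer (where --depth and --set-depth are available); the function name sparse_checkout is made up for illustration.

```shell
#!/bin/sh
# sparse_checkout REPO_URL PROJECT DEST
# Check out trunk shallowly, then fully populate only one project.
# Hypothetical helper: the name is illustrative only.
sparse_checkout() {
    repo_url=$1
    project=$2
    dest=$3
    # Fetch trunk with its immediate children only; subdirectories arrive empty.
    svn checkout -q --depth immediates "$repo_url/trunk" "$dest" || return 1
    # Deepen just the one project to its full contents.
    svn update -q --set-depth infinity "$dest/$project"
}

# Example: sparse_checkout http://%REPO_URL% project_foo trunk-wc
```

The other projects still show up as empty directories, so they can be deepened later with another svn update --set-depth call if they are needed.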

Upvotes: 2
