Little Bobby Tables

Reputation: 5351

svnadmin dump of text only, without binary files

Is it possible to filter the SVN dump generated by svnadmin dump so that it does not include encoded binary data, just the text deltas and data?

I want a dump of an existing large SVN repository, but only of the code. I have no interest in the stored binaries, and they make the dump file unnecessarily large. How can I generate the dump and exclude binary content?

Already tried, without success:

  1. It is not practical to process the svn log diffs. It is a large and old repository, and getting diffs even for a short time period takes a lot of time and often gets stuck.
  2. The binary files are scattered all over, and not stored under a single known path, so I cannot use svndumpfilter to exclude them, unless there is some way to use this filter with wildcard patterns, e.g. *.jar.

Upvotes: 4

Views: 1046

Answers (2)

trent

Reputation: 341

I don't know of a stock tool to do this, but it shouldn't be hard if you start with this Perl module: SVN::Dumpfilter

One of the example scripts in there (svndump_delpathfilter) is probably pretty close to what you want. My experience with this module is that you'll probably have to tinker with it a bit to get it to do what you want.

Now, I don't think there is any way to reliably tell a binary from a text file, since Subversion (at the lowest levels) doesn't really care. A quick scan of my repository shows that the svn:mime-type property isn't always set, and I see no other indicative fields. So you'll have to check via name or (somehow) try looking at the contents of the file (but I have never done the latter).
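To illustrate the two checks mentioned above (by name, or by peeking at the contents), here is a minimal Python sketch. It is not part of SVN::Dumpfilter or any Subversion tool. The pattern list, the function names, and the NUL-byte heuristic are all assumptions for illustration, and the content check will misclassify some files (UTF-16 text, for example).

```python
import fnmatch

# Hypothetical list of file-name patterns to treat as binary;
# adjust it to whatever is actually stored in the repository.
BINARY_PATTERNS = ("*.jar", "*.zip", "*.png", "*.exe")


def looks_binary_by_name(path, patterns=BINARY_PATTERNS):
    """Return True if the last path component matches a binary pattern."""
    name = path.rsplit("/", 1)[-1].lower()
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


def looks_binary_by_content(data, sample_size=8000):
    """Heuristic only: treat content as binary if its first bytes contain a NUL.

    UTF-16 encoded text also contains NUL bytes, so this can give
    false positives.
    """
    return b"\0" in data[:sample_size]
```

The name check is cheap and predictable. The content check is only a fallback for files whose extension says nothing useful.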

Upvotes: 1

Lazy Badger

Reputation: 97282

svndumpfilter is part of any Subversion installation

svndumpfilter exclude — Filter out nodes with given prefixes from the dump stream.

Beginning in Subversion 1.7, svndumpfilter can optionally treat the PATH_PREFIXs not merely as explicit substrings, but as file patterns instead.

$ svndumpfilter exclude --pattern "*.OLD" < dumpfile > filtered-dumpfile
Excluding prefix patterns:
   '/*.OLD'
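For a large repository it may be preferable to avoid writing the unfiltered dump to disk at all. Below is a rough Python sketch that pipes svnadmin dump straight into svndumpfilter exclude --pattern. The function names, the repository path, the output file name, and the patterns are placeholders I made up, and it assumes Subversion 1.7 or later is on the PATH. Whether a pattern such as *.jar catches files at every depth is worth checking on a small dump first.

```python
import subprocess


def build_filter_command(patterns):
    """Build the svndumpfilter argument list for the given glob patterns."""
    return ["svndumpfilter", "exclude", "--pattern", *patterns]


def dump_without(repo_path, out_path, patterns):
    """Stream `svnadmin dump` through svndumpfilter into out_path.

    repo_path, out_path and patterns are caller-supplied placeholders.
    """
    with open(out_path, "wb") as out:
        dump = subprocess.Popen(
            ["svnadmin", "dump", "--quiet", repo_path],
            stdout=subprocess.PIPE,
        )
        filt = subprocess.Popen(
            build_filter_command(patterns),
            stdin=dump.stdout,
            stdout=out,
        )
        # Close our copy of the pipe so svnadmin sees a broken pipe
        # if svndumpfilter exits early.
        dump.stdout.close()
        filter_rc = filt.wait()
        dump_rc = dump.wait()
    if dump_rc != 0 or filter_rc != 0:
        raise RuntimeError(
            f"svnadmin exited with {dump_rc}, svndumpfilter with {filter_rc}"
        )


if __name__ == "__main__":
    # Hypothetical paths and patterns, for illustration only.
    dump_without("/path/to/repo", "filtered-dumpfile", ["*.jar", "*.zip"])
```

The same thing can be done with a plain shell pipe. The wrapper only adds exit-code checking for both processes.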

Upvotes: 4
