Reputation: 5351
Is it possible to filter the SVN dump generated by svnadmin dump so that it does not include encoded binary data, only the text deltas and data?
I want a dump of an existing large SVN repository, but only of the code. I have no interest in the stored binaries, and they make the dump file unnecessarily large. How can I generate the dump and exclude binary content?
Tried and failed, already:
svn log diffs. It is a large and old repository, and getting diffs even for a short time period takes a lot of time and often gets stuck.
svndumpfilter to exclude them, unless there is some way to use this filter with regular expressions, e.g. *.jar.
Upvotes: 4
Views: 1046
Reputation: 341
I don't know of a stock tool to do this, but it shouldn't be hard if you start with this Perl module: SVN::Dumpfilter
One of the example scripts in there (svndump_delpathfilter) is probably pretty close to what you want. My experience with this module is that you'll probably have to tinker with it a bit to get it to do what you want.
Now, I don't think there is any way to reliably tell a binary from a text file, since Subversion (at the lowest levels) doesn't really care. A quick scan of my repository shows that the svn:mime-type property isn't always set, and I see no other indicative fields. So you'll have to check via name or (somehow) try looking at the contents of the file (but I have never done the latter).
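To make the "check via name or look at the contents" idea concrete, here is a minimal Python sketch of both heuristics. It is not part of SVN::Dumpfilter or any Subversion tool; the extension list, the function names, and the 8000-byte sample size are all assumptions chosen for illustration. The content check treats a NUL byte near the start of the file as a sign of binary data, which is a common rule of thumb rather than a reliable test.

```python
import os

# Hypothetical extension list; adjust it to what the repository actually holds.
BINARY_EXTENSIONS = {".jar", ".zip", ".png", ".exe", ".dll", ".pdf"}


def looks_binary_by_name(path):
    """Guess from the file extension alone (case-insensitive)."""
    return os.path.splitext(path)[1].lower() in BINARY_EXTENSIONS


def looks_binary_by_content(data, sample_size=8000):
    """Guess from content: a NUL byte in the first bytes suggests binary data."""
    return b"\x00" in data[:sample_size]


def looks_binary(path, data=b""):
    """Combine both heuristics; either one is enough to flag the node."""
    return looks_binary_by_name(path) or looks_binary_by_content(data)
```

The content check will misclassify some text, for example UTF-16 encoded files, which contain NUL bytes, so the name check is probably the safer of the two for deciding what to drop.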
Upvotes: 1
Reputation: 97282
svndumpfilter is part of any Subversion installation.
svndumpfilter exclude: Filter out nodes with given prefixes from the dump stream.
Beginning in Subversion 1.7, svndumpfilter can optionally treat the PATH_PREFIXs not merely as explicit substrings, but as file patterns instead.
$ svndumpfilter exclude --pattern "*.OLD" < dumpfile > filtered-dumpfile
Excluding prefix patterns:
'/*.OLD'
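For a large repository it may be preferable to avoid writing the unfiltered dump to disk at all. The following is a rough Python sketch that pipes svnadmin dump straight into svndumpfilter exclude --pattern. The function names and the paths in the usage note are hypothetical, and it assumes Subversion 1.7 or later with both tools on the PATH.

```python
import subprocess


def build_filter_command(patterns):
    """Build the svndumpfilter invocation that excludes the given patterns."""
    return ["svndumpfilter", "exclude", "--pattern", *patterns]


def dump_without(repo_path, patterns, out_path):
    """Stream `svnadmin dump` through svndumpfilter into out_path."""
    with open(out_path, "wb") as out:
        dump = subprocess.Popen(
            ["svnadmin", "dump", repo_path], stdout=subprocess.PIPE
        )
        filt = subprocess.Popen(
            build_filter_command(patterns), stdin=dump.stdout, stdout=out
        )
        # Close our copy of the pipe so svnadmin is signalled if the filter exits early.
        dump.stdout.close()
        filter_status = filt.wait()
        dump_status = dump.wait()
    if dump_status != 0 or filter_status != 0:
        raise RuntimeError(
            f"svnadmin exited with {dump_status}, svndumpfilter with {filter_status}"
        )
```

A call such as dump_without("/path/to/repo", ["*.jar"], "code-only.dump") would then produce the filtered dump in one pass; the repository path and output name here are placeholders.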
Upvotes: 4