Jamison

Reputation: 2348

Git or no Git - Large Binary File Versioning, but no merge needed

There are many prior conversations here on Stack Overflow about versioning and SCM for binary files in a code base, but I haven't found anything that covers the particular use case I'm researching:

I have a single parent binary file that is very large, several gigabytes in size. From that file I have hundreds of "children" that are the same size, but each child differs from the parent by a few very small changes.

I'll never need to merge the children back into the parent, so I'm looking for advice and insight into how to save just the differences between 1 parent and n children:

  1. Save just the binary differences for each child.
  2. When that child is needed (for download, to implement, etc.), reconstruct it from the parent file + differences.
  3. NO MERGE needed - I'm just interested in saving differences to reduce file size for each child.
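
To make the three points above concrete, here is a minimal Python sketch of what I mean. It is only an illustration under assumptions of my own: the function names `make_delta` and `apply_delta` are hypothetical, both files are held in memory as `bytes`, and the child is exactly the same size as the parent. A real solution for multi-gigabyte files would need to stream or memory-map the data.

```python
def make_delta(parent: bytes, child: bytes) -> list[tuple[int, bytes]]:
    """Return (offset, replacement_bytes) runs where child differs from parent.

    Assumes parent and child are the same size (hypothetical helper).
    """
    if len(parent) != len(child):
        raise ValueError("parent and child must be the same size")
    delta = []
    i, n = 0, len(parent)
    while i < n:
        if parent[i] == child[i]:
            i += 1
            continue
        # Start of a differing run: extend it until the bytes match again.
        start = i
        while i < n and parent[i] != child[i]:
            i += 1
        delta.append((start, child[start:i]))
    return delta


def apply_delta(parent: bytes, delta: list[tuple[int, bytes]]) -> bytes:
    """Rebuild a child by overwriting the recorded runs onto a copy of parent."""
    out = bytearray(parent)
    for offset, data in delta:
        out[offset:offset + len(data)] = data
    return bytes(out)


# Tiny stand-ins for the multi-gigabyte files.
parent = b"AAAAAAAAAA"
child = b"AAxxAAAAyA"
delta = make_delta(parent, child)      # only the differing runs are stored
rebuilt = apply_delta(parent, delta)   # parent + differences gives the child
```

Only `delta` would be stored per child; no merge logic is involved anywhere.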

I've worked a lot with Git and I've seen some great posts here on Stack Overflow about Git's ability to handle binary files for versioning, like this one here.

But my needs are simpler: I want a solid C or C++ backbone for saving binary file differences and reconstructing the original files from those differences plus a parent file. That's it. Is there a fast solution like Git but without the extra features?

Many thanks - I'm trying to avoid re-inventing the wheel here.

Upvotes: 4

Views: 1525

Answers (1)

Robie Basak

Reputation: 6750

It sounds like you want data deduplication rather than version control. If this is the case, try ddar. You can use it to store related binary files and it'll take care of keeping the storage efficient.
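
To illustrate what deduplication means here, below is a rough Python sketch. It is not ddar's implementation or interface; the `ChunkStore` class, its method names, and the fixed `CHUNK_SIZE` are all assumptions made up for this example. It splits each file into fixed-size chunks and stores each distinct chunk once, keyed by its SHA-256 digest, which suits same-size files that differ by small in-place changes.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny for the demo; a real store would use far larger chunks


class ChunkStore:
    """Hypothetical in-memory store that keeps each distinct chunk only once."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes
        self.files = {}   # file name -> ordered list of chunk digests

    def add(self, name: str, data: bytes) -> None:
        digests = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk already seen (e.g. from the parent) is not stored again.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name: str) -> bytes:
        # Reassemble the file from its chunk list.
        return b"".join(self.chunks[d] for d in self.files[name])


store = ChunkStore()
parent = b"aaaabbbbccccdddd"
child = b"aaaabbXbccccdddd"  # differs from parent in one chunk only
store.add("parent", parent)
store.add("child", child)
```

After both files are added, the store holds five chunks rather than eight: the parent's four plus the single chunk that the child changed. Each child costs only the chunks it does not share with the parent, and no merge step exists.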

Upvotes: 4
