Reputation:
Can anyone suggest a database solution for storing large documents which will have multiple branched revisions? Partial edits of content should be possible without having to update the entire document.
I was looking at XML databases and wondering about the suitability of them, or maybe even using a DVCS (like Mercurial).
It should preferably have Python bindings.
Upvotes: 0
Views: 321
Reputation: 2054
This depends on your storage behavior and use case. If you plan to store a massive number of "document revisions" and keep historical versions, and can comply with a write-once-read-many pattern, you should look into something like Hadoop HDFS. This requires a lot of (cheap) infrastructure to run your cluster, but you will be able to keep adding revisions/data over time and will be able to quickly look it up using a MapReduce algorithm.
Upvotes: 0
Reputation: 41200
Try Fossil -- it has a good delta encoding algorithm, and keeps all versions. It's backed by a single SQLite database, and has both a web based and a command line UI.
Upvotes: 1