user193476


Database for storing large documents

Can anyone suggest a database solution for storing large documents which will have multiple branched revisions? Partial edits of content should be possible without having to update the entire document.

I was looking at XML databases and wondering whether they would be suitable, or whether a DVCS (like Mercurial) might work instead.

It should preferably have Python bindings.

Upvotes: 0

Views: 321

Answers (2)

Moe Matar

Reputation: 2054

This depends on your storage behavior and use case. If you plan to store a massive number of document revisions, keep every historical version, and can live with a write-once-read-many pattern, you should look into something like Hadoop HDFS. It needs a cluster of (cheap) machines to run on, but you can keep appending revisions over time and query them with MapReduce jobs. Note that files in HDFS are not edited in place, so each partial edit would have to be stored as a new revision or delta.

Upvotes: 0

Doug Currie

Reputation: 41200

Try Fossil. It has a good delta encoding algorithm and keeps all versions. It's backed by a single SQLite database, and has both a web-based and a command-line UI.
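To make the idea concrete, here is a minimal Python sketch of delta-encoded, branched revisions kept in a single SQLite file. It is not Fossil's actual schema or delta format: the `revision` table, the `RevisionStore` class, and the copy/insert delta encoding built on `difflib` are all assumptions made up for illustration. Each revision stores only its difference from its parent, and branches are simply revisions that share a parent.

```python
import difflib
import json
import sqlite3


def make_delta(old_lines, new_lines):
    """Encode new_lines as copy/insert operations against old_lines."""
    ops = []
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(["copy", i1, i2])  # reuse parent lines i1..i2
        elif tag in ("replace", "insert"):
            ops.append(["insert", new_lines[j1:j2]])  # store only the new text
        # "delete" needs no operation: those parent lines are just not copied
    return ops


def apply_delta(old_lines, ops):
    """Rebuild the child revision from its parent's lines and a delta."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(old_lines[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class RevisionStore:
    """Hypothetical store: one row per revision, each a delta against its parent."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS revision ("
            " id INTEGER PRIMARY KEY,"
            " parent_id INTEGER REFERENCES revision(id),"
            " delta TEXT NOT NULL)"
        )

    def commit(self, text, parent_id=None):
        """Store text as a child of parent_id (None creates a root revision)."""
        parent_lines = []
        if parent_id is not None:
            parent_lines = self.checkout(parent_id).splitlines(keepends=True)
        ops = make_delta(parent_lines, text.splitlines(keepends=True))
        cur = self.db.execute(
            "INSERT INTO revision (parent_id, delta) VALUES (?, ?)",
            (parent_id, json.dumps(ops)),
        )
        self.db.commit()
        return cur.lastrowid

    def checkout(self, rev_id):
        """Rebuild a revision by replaying deltas from the root downwards."""
        chain = []
        while rev_id is not None:
            row = self.db.execute(
                "SELECT parent_id, delta FROM revision WHERE id = ?", (rev_id,)
            ).fetchone()
            if row is None:
                raise KeyError(rev_id)
            rev_id = row[0]
            chain.append(json.loads(row[1]))
        lines = []
        for ops in reversed(chain):
            lines = apply_delta(lines, ops)
        return "".join(lines)


store = RevisionStore()
root = store.commit("intro\nbody\nend\n")
branch_a = store.commit("intro\nBODY v2\nend\n", parent_id=root)
branch_b = store.commit("intro\nbody\nend\nappendix\n", parent_id=root)
```

Calling `store.checkout(branch_a)` or `store.checkout(branch_b)` rebuilds either branch independently of the other. A real tool such as Fossil handles this far more efficiently (binary deltas, integrity checks, merging), so treat the sketch only as an illustration of why a partial edit does not require rewriting the whole document.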

Upvotes: 1
