Reputation: 61
I need to store hundreds of thousands (right now, potentially many millions) of documents that start out empty and are appended to frequently, but never updated otherwise or deleted. These documents are not interrelated in any way, and just need to be accessed by some unique ID.
Read accesses are some subset of the document, which almost always starts midway through at some indexed location (e.g. "document #4324319, save #53 to the end").
These documents start very small, at several KB. They typically reach a final size around 500KB, but many reach 10MB or more.
I'm currently using MySQL (InnoDB) to store these documents. Each of the incremental saves is just dumped into one big table with the document ID it belongs to, so reading part of a document looks like "select * from saves where document_id=14 and save_id > 53 order by save_id", then manually concatenating it all together in code.
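For reference, a simplified sketch of that read path (assuming a saves(document_id, save_id, content) table and the mysql-connector-python driver; connection details are placeholders):

```python
# Simplified sketch of the current read path: fetch every save after a
# given save_id and concatenate the pieces in order.
# Assumes content is stored as a BLOB; connection details are placeholders.
import mysql.connector

def read_document_tail(conn, document_id, from_save_id):
    cursor = conn.cursor()
    cursor.execute(
        "SELECT content FROM saves"
        " WHERE document_id = %s AND save_id > %s"
        " ORDER BY save_id",
        (document_id, from_save_id),
    )
    parts = [row[0] for row in cursor]
    cursor.close()
    return b"".join(parts)

conn = mysql.connector.connect(user="app", password="...", database="docs")
tail = read_document_tail(conn, 14, 53)  # document 14, everything after save #53
```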
Ideally, I'd like the storage solution to be easily horizontally scalable, with redundancy across servers (e.g. each document stored on at least 3 nodes) with easy recovery of crashed servers.
I've looked at CouchDB and MongoDB as possible replacements for MySQL, but I'm not sure that either of them make a whole lot of sense for this particular application, though I'm open to being convinced.
Any input on a good storage solution?
Upvotes: 6
Views: 260
Reputation: 46060
Check out our SolFS virtual file system. It will work well under your conditions.
Upvotes: 0
Reputation: 45287
OK, first a caveat: MongoDB does have a limit on document size. However, the newest version will cover your 10MB documents.
So, some useful points for MongoDB.
Ideally, I'd like the storage solution to be easily horizontally scalable, with redundancy across servers (e.g. each document stored on at least 3 nodes) with easy recovery of crashed servers.
For replication, MongoDB supports replica sets. Replica sets are single-master replicas. If the master goes down, the system automatically elects a new master (easy recovery). Adding a new node is as simple as starting up a new server and pointing it at the existing set.
For horizontal scalability, MongoDB supports sharding. Sharding is a little more complex, but works like you would expect it to, splitting writes across multiple machines (or multiple replica sets).
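As a rough sketch, connecting from an application looks much the same whether you're talking to a single replica set or (later) to a sharded cluster; the hostnames, set name, and database name below are placeholders:

```python
# Minimal connection sketch (hostnames and the set name "rs0" are placeholders).
# The driver discovers the members, sends writes to the elected primary,
# and fails over automatically if the primary goes down.
from pymongo import MongoClient

client = MongoClient(
    "mongodb://node1.example.com,node2.example.com,node3.example.com"
    "/?replicaSet=rs0"
)
db = client.docstore  # database name is a placeholder
```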
I need to store hundreds of thousands (right now, potentially many millions) of documents that start out empty and are appended to frequently
Several companies have Mongo running billions of documents in production.
Mongo provides a series of update modifiers that are very useful for the "appended to" case. In particular, check out the $push operator, which adds to the end of an array. That should be exactly what you need.
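For example (a sketch assuming one Mongo document per stored document, with its incremental saves kept in a "saves" array; collection and field names are placeholders):

```python
# Sketch: append save #54 to document 4324319 using $push.
# Assumes one Mongo document per stored document with a "saves" array;
# collection and field names are placeholders.
from pymongo import MongoClient

db = MongoClient().docstore
db.documents.update_one(
    {"_id": 4324319},
    {"$push": {"saves": {"save_id": 54, "content": "...appended data..."}}},
    upsert=True,  # create the document on its first save
)
```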
Read accesses are some subset of the document, which almost always starts midway through at some indexed location (e.g. "document #4324319, save #53 to the end").
MongoDB allows you to return only selected fields (as expected). Depending on your layout, you can use dot notation to retrieve only certain sub-documents. If your saves are stored as an array, you can also use the $slice projection operator, which is well suited to the query you list above.
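For example, "save #53 to the end" could be read with a projection like this (same assumed layout as above; $slice takes [skip, limit], so a large limit stands in for "to the end"):

```python
# Sketch: fetch only the tail of the "saves" array with a $slice projection.
# Assumes save N sits at array index N; the large limit stands in for "to the end".
from pymongo import MongoClient

db = MongoClient().docstore
doc = db.documents.find_one(
    {"_id": 4324319},
    {"saves": {"$slice": [53, 1_000_000]}},
)
tail = "".join(save["content"] for save in doc["saves"]) if doc else ""
```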
So I think MongoDB meets all of your basic needs here: easy to append, easy to query those appends, and replication is built in. You get horizontal scaling via sharding (try starting with a single replica set first).
Upvotes: 0
Reputation: 3444
My immediate thought is: why store these in a database at all? Does storing them in a database give better seek performance than a filesystem when dealing with this many files?
I'd think storing these on a filesystem in a hashed directory structure would be better. You can use the database to store only metadata (root directories, document ID, save ID, location relative to root).
The root directories (nodes) would live in a separate table and would be used when writing (enumerate and write to all locations), then round-robin (or some other load-balancing algorithm) for reading.
If a node is unreachable or a file doesn't exist, the load balancing could "fail over" to the next node in line. Root directories could also be marked offline for planned outages, provided the read/write code respects that flag. The same table could also drive partitioning, where, as a simple example, some root directories serve odd IDs and others serve even IDs.
Ensuring the nodes stay in sync could be handled in code using the metadata as well.
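A rough sketch of what I mean (root paths are placeholders; in practice the per-save offsets would come from the metadata table):

```python
# Sketch of the hashed-directory idea: the same relative path under every
# root (node), write to all of them, read from the first one that answers.
# Root paths are placeholders; per-save offsets would live in the metadata table.
import hashlib
from pathlib import Path

ROOTS = [Path("/mnt/node1"), Path("/mnt/node2"), Path("/mnt/node3")]

def relative_path(document_id: int) -> Path:
    digest = hashlib.sha1(str(document_id).encode()).hexdigest()
    # Two levels of fan-out keep any single directory from getting huge.
    return Path(digest[:2]) / digest[2:4] / f"{document_id}.doc"

def append_save(document_id: int, data: bytes) -> None:
    for root in ROOTS:  # enumerate and write to all locations
        path = root / relative_path(document_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("ab") as f:
            f.write(data)

def read_document(document_id: int) -> bytes:
    for root in ROOTS:  # "fail over" to the next root if one is unreachable
        try:
            return (root / relative_path(document_id)).read_bytes()
        except OSError:
            continue
    raise FileNotFoundError(document_id)
```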
Just my 2 cents as I've never dealt with that volume of files before.
Upvotes: 0
Reputation: 37232
Is there any reason you need a database at all?
You describe "a system to store documents with unique names," so I started thinking "file system." Maybe something like an enterprise-class file server, or several (I estimated a maximum of about 200 TiB of data), where the unique ID maps to a directory and file name on the network.
Upvotes: 0
Reputation: 9055
Sounds like an ideal problem to be solved by HBase (on top of HDFS).
The downside, among others, is the somewhat steep learning curve.
Upvotes: 1