Reputation: 111
What kind of architecture is needed to store 100 TB of data and run aggregation queries on it? How many nodes? How much disk per node? What are the best practices?
About 240 GB will be written every day, but the total size will stay the same because the same amount of data will be deleted.
Any other thoughts on how to store the data and run fast group-by queries are also welcome.
Upvotes: 10
Views: 10104
Reputation: 9571
I highly recommend HBase.
Facebook uses it for its Messages service, which as of Nov 2010 was handling about 15 billion person-to-person messages a month.
We tested MongoDB for a large data set but ended up going with HBase and have been happily using it for months now.
Upvotes: 4
Reputation: 79032
Please see the related question.
Quoting from the top answer:
The "production deployments" page on MongoDB's site may be of interest to you. Lots of presentations listed with infrastructure information. For example:
http://blog.wordnik.com/12-months-with-mongodb says they're storing 3 TB per node.
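To relate that figure to the numbers in the question, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 TB = 1000 GB), treats Wordnik's 3 TB per node as a reference point rather than a recommendation, and uses a hypothetical `replication` parameter because the question does not say how many copies are kept. The function names are illustrative only.

```python
import math

TOTAL_TB = 100        # from the question
DAILY_WRITE_GB = 240  # from the question
TB_PER_NODE = 3       # the Wordnik figure quoted above, not a recommendation


def nodes_needed(total_tb, tb_per_node, replication=1):
    """Nodes required to hold total_tb, with `replication` copies of the data.

    `replication` is an assumption; the question does not specify it.
    """
    return math.ceil(total_tb * replication / tb_per_node)


def retention_days(total_tb, daily_write_gb):
    """Days of data held at steady state (writes equal deletes)."""
    return total_tb * 1000 / daily_write_gb


def avg_write_mb_per_s(daily_write_gb):
    """Average sustained write rate if writes are spread evenly over a day."""
    return daily_write_gb * 1000 / 86_400


if __name__ == "__main__":
    print("nodes, single copy:", nodes_needed(TOTAL_TB, TB_PER_NODE))
    print("nodes, three copies:", nodes_needed(TOTAL_TB, TB_PER_NODE, 3))
    print("retention (days):", round(retention_days(TOTAL_TB, DAILY_WRITE_GB), 1))
    print("avg write (MB/s):", round(avg_write_mb_per_s(DAILY_WRITE_GB), 2))
```

At 3 TB per node that works out to about 34 nodes for a single copy of the data, and the 240 GB daily write load averages under 3 MB/s, so the aggregation queries are likely to drive the sizing more than the ingest rate does.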
Upvotes: 4