Arti
Arti

Reputation: 407

How to store many data in DB effectively?

I don't have any experience dealing with many records. Previously, I used MySql to store articles, comments and other simple data. Now I need to write an application in Ruby On Rails which would push to the database dozens of records every minute.

This application is going to watch some websites and measure a couple of their parameters. The websites will be watched 24x7. If I want to observe 100 websites, I will have in my database table about ~144000 (100*60*24) newly added records every day.

I don't want just to store them in the database, but also work with them, combining results, charts etc.

I guess that it is not very wise to save those records like comments in a blog system. Should I store data without any "fancy" solutions?

Upvotes: 0

Views: 144

Answers (1)

AaronO
AaronO

Reputation: 495

I would suggest you use MongoDB, I currently have a few instances of it running storing millions of documents. Contrarily to your usual relational schema, you could use MongoDB's support of subdocuments.

MongoDB is a NoSQL database, but it's very easy to use and to get started with if you are in any way familiar with databases and your day to day OO.

MongoDB has drivers for most popular languages, including Ruby so you can install the driver for Ruby (assuming that's what you want to use) by running the following command using gem :

gem install mongo

And if running Debian/Ubuntu simply isntall mongodb with the following :

apt-get install mongodb

That should get you started.

Depending on the size of each record, you can adopt various different designs, MongoDB currently has a limit of 16MB per document so that's something you need to be aware of.

I would have a collection named websites, for storing the basic info about the sites you are monitoring (name, url, reference to client, ...)

Then have another collection that would be used to store your stats, that could be called "stats", I would store each of your subdocuments (stats every minute) inside another document that would hold the data for one day.

So each document in stats would be something like :

{
  _id : X,
  website_id : STORE_THE_ID_HERE,
  timestamp_start : PUT_THE_TIMESTAMP_WHEN_YOU_START_MONITORING_HERE,
  timestamp_end : SAME_AS_ABOVE_REALLY,
  stats_count : A_DENORMALIZED_COUNT_OF_STATS
  stats : [ { views : X, clicks : X2, ... },
       ...
  ]
}

Hope that gets you started, I think MongoDB fits perfectly your requirements if you are willing to give it a try.

Upvotes: 2

Related Questions