Lior S
Lior S

Reputation: 41

Should I move to NoSQL? (big data)

I'm currently researching a very large table (~100 million rows, 35 columns), it's currently stored in SQL db, but the queries I'm running (and they're various) run very, very slow.. so I get it I should probably move to NoSQL db. question is:

  1. How can I tell which (NoSQL) db is best for me?

  2. How can I move my current SQL table to the new NoSQL scheme?

OR should I stay in SQL and just fine tune it?

A few more details: rows will not be added/removed, this is historical data and all of the analysis will be done on that table. plan to run various queries on it. data is numerical.

Upvotes: 4

Views: 1117

Answers (3)

Andreas
Andreas

Reputation: 737

If you are running a query on columns that are not indexed you will be very slow. You can add more indexes to speed them up. If your DB is static this should work.

One major speed up is the usage of map-reduce queries, where aggregations are carried out by multiple processes or computers. NoSQL databases like MongoDB can be used in such ways. But even MySQL has Cluster capabilities nowadays: http://www.mysql.de/products/cluster/scalability.html. SQL Server can be clustered as well.

So I guess the best first shot would be to optimize your indexes in the table to the query. Each argument column to the query (compare, count ...) etc. should be indexed.

If this is not doing any better you probably count and calculate a lot and you should use map-reduce jobs and a DB which can handle this like MongoDB: http://docs.mongodb.org/manual/aggregation/

I hope this helps

Upvotes: 0

Addys
Addys

Reputation: 2501

Some food for thought:

  • 100M rows is well within SQL's "sweet spot". You can grow by x10 and still be assured that SQL will be able to support you with fairly trivial effort.

  • NoSQL is not a silver bullet for solving performance problems at scale. It offers a set of tradeoffs which, with careful planning, can provide better results. But if sounds like you don't fully understand your performance issues in SQL, and without that your chances of making the correct design decisions in a NoSQL environment are slim.

  • One of the common tradeoffs in NoSQL systems is that they typically provide less flexibilty in querying, in return for greater flexibility in schema management. You mentioned your queries are "various"- if they are truly varied, or more importantly- frequently changing - then moving to a NoSQL system can put you in a world of pain. Especially if you are not familiar with the technology yet.

Bottom line- You aren't doing anything which is clearly "beyond" the capabilities of SQL, and your problems are probably caused more by inefficient implementation than by any inherent platform limitations. Moving to a NoSQL system won't magically solve any of your problems, and will probably introduce new ones.

Upvotes: 7

Randy Minder
Randy Minder

Reputation: 48432

I routinely work with a SQL Server 2012 table that has 900 million rows. This table has rows being added to it about every 2 minutes with a total of about 200K per day. I can query this table and get rows back in a couple seconds (using the clustered index / PK). I can also query on one of the other indexes and get results back in seconds or less.

So, it's all a matter of making sure your indexes are set up correctly, AND BEING USED!! Check your queries against the query plan being generated and make sure seeks are being done.

There could be good reasons for moving to NoSQL, or something similar. But moving to NoSQL because you think you can't get good performance in SQL Server, before making sure you've done everything you can do to improve performance first, is not a good reason.

Upvotes: 7

Related Questions