Reputation: 67
I am researching on the possibilities of migrating data from SQL 2012 to mongoDB. And my manager specifically asked me to see the time it takes to process billions of rows in SQL and MongoDB to make a decision to migrate or not. Any recommendations or suggestions or places I should visit to research more? So far I have done
Questions I have now 3. how to import the database in SQL to Mongo (say migrate Adventure Works)
Thanks In Advance!
Upvotes: 4
Views: 14560
Reputation: 10040
I've created a Node.js script that replicates an SQL database to MongoDB.
You can find it here.
To use...
Clone the repo:
git clone https://github.com/ashleydavis/sql-to-mongodb
Install dependencies:
cd sql-to-mongodb
npm install
Setup your config:
Edit config.js. Add your SQL connection string and details for your MongoDB database.
Run the script:
node index.js
This can take a while to complete! But when it does you will have a copy of your SQL database in MongoDB. Please let me know if there are any issues.
Upvotes: 5
Reputation: 20703
Some best practices I learned the hard way.
When planning a MongoDB cluster, you need to have an idea how big the average document size is. In order to do that, import some 10k records of your data. This gives you an idea on how long the actual import will take in orders of magnitude:
where t is the time it took to import n documents of c.
Repeat this for all target collections. After that, issue a
db.stats()
in the mongo shell. You will be presented with some size statistics. You now have approximations to two key factors: the time it takes to import (by summing up the results of above calculation) and the storage space you will need.
Create the indices you are going to need. As for time calculations, the same as above applies. But there is a different thing: indices should reside in RAM, so you need to extrapolate the actual RAM you need when all records are migrated.
Chances are that it isn't cost efficient to store all data on one machine, since RAM is getting costly after a certain point (calculation are necessary here). If that is the case, you need to shard.
It can not be overemphasized how important it is to have a proper shard key right from the start: Shard keys can not be changed. Invest some time with the developers to find a proper shard key.
The last thing you want during data migration is to have it being delayed by the balancer trying to balance out the chunks. So you should pre-split your chunks and distribute them among your shards.
Upvotes: 4