Reputation: 79
I am loading a MySQL table from a MongoDB source through Kettle.
The MongoDB collection has more than 4 million records, and when I run the Kettle job the first full load takes 17 hours.
Even an incremental load takes more than an hour. I tried increasing the commit size and giving the job more memory, but performance is still not improving. I think the JSON Input step takes a very long time to parse the data, which is why it is so slow.
These are the steps in my transformation: a MongoDB Input step, then a JSON Input step to parse the documents, then the output to MySQL.
Extracting the same 4 million records from PostgreSQL was much faster than from MongoDB. Is there a way I can improve the performance? Please help me.
Thanks, Deepthi
Upvotes: 1
Views: 1423
Reputation: 5164
Run multiple copies of the step. It sounds like you have a MongoDB Input step followed by a JSON Input step to parse the results, right? So use 4 or 8 copies of the JSON Input step (or more, depending on how many CPUs you have) and it will speed up.
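The reason this helps is that JSON parsing is CPU-bound, so throughput scales roughly with the number of cores you throw at it. A minimal sketch of the same idea in plain Java, assuming a batch of documents already fetched from MongoDB (the documents and the parse method here are stand-ins, not Kettle APIs):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelParse {
        public static void main(String[] args) throws Exception {
            // Hypothetical batch of JSON documents pulled from MongoDB.
            List<String> docs = List.of(
                "{\"id\": 1, \"name\": \"a\"}",
                "{\"id\": 2, \"name\": \"b\"}");

            // One worker per core, mirroring "run N copies" of the JSON Input step.
            int workers = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(workers);

            List<Future<String>> results = new ArrayList<>();
            for (String doc : docs) {
                results.add(pool.submit(() -> parse(doc)));
            }
            for (Future<String> f : results) {
                System.out.println(f.get());
            }
            pool.shutdown();
        }

        // Stand-in for the per-document work the JSON Input step does.
        static String parse(String json) {
            return json.trim();
        }
    }

In Kettle itself you get the same effect with no code: right-click the JSON Input step and set "Change number of copies to start".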
Alternatively, do you really need to parse the full JSON? Maybe you can extract the data with a regex or something similar, as sketched below.
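If the documents are flat and the fields you need appear in a predictable form, a regex can pull a value out without a full JSON parse. A rough illustration in Java (the field name "name" and the sample document are made up; regex on JSON breaks down for nested or escaped values, so treat this as a last resort):

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexExtract {
        // Matches "name": "<value>" for simple, unescaped string values only.
        private static final Pattern NAME =
            Pattern.compile("\"name\"\\s*:\\s*\"([^\"]*)\"");

        public static void main(String[] args) {
            String doc = "{\"id\": 42, \"name\": \"deepthi\"}";
            Matcher m = NAME.matcher(doc);
            if (m.find()) {
                System.out.println(m.group(1)); // prints: deepthi
            }
        }
    }

Inside Kettle, the equivalent would be a Regex Evaluation step (with capture groups) in place of the JSON Input step.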
Upvotes: 0