Reputation: 545
I am working on a Mule application that fetches hundreds of thousands of records from a database, maps them to a different structure using DataWeave, and inserts the data into Salesforce. The steps I follow are:
How can I optimize this process? Will holding so many records in variables impact the performance of the application? Is there a better approach?
Upvotes: 2
Views: 2491
Reputation: 4473
It looks like there is no need to keep the records in memory. You just have to process them all, right?
One way is to use watermarks: mark what you have already processed in the database and process the rest later. Mule has built-in capabilities for working with watermarks: https://docs.mulesoft.com/connectors/object-store/object-store-to-watermark
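A minimal Mule 4 sketch of that pattern, assuming an Object Store named watermarkStore, a source_table with a modified_date column, and a Database_Config global element (all of these names are illustrative, not from the question):

    <os:object-store name="watermarkStore" persistent="true"/>

    <flow name="watermarkSyncFlow">
        <scheduler>
            <scheduling-strategy>
                <fixed-frequency frequency="60000"/>
            </scheduling-strategy>
        </scheduler>

        <!-- Read the last watermark; fall back to a default on the very first run -->
        <os:retrieve key="lastModifiedDate" objectStore="watermarkStore" target="watermark">
            <os:default-value>#["1970-01-01T00:00:00"]</os:default-value>
        </os:retrieve>

        <!-- Fetch only the records newer than the watermark -->
        <db:select config-ref="Database_Config">
            <db:sql>SELECT * FROM source_table WHERE modified_date &gt; :watermark</db:sql>
            <db:input-parameters>#[{watermark: vars.watermark}]</db:input-parameters>
        </db:select>

        <!-- Remember the newest date in this batch before the payload changes downstream -->
        <set-variable variableName="newWatermark" value="#[max(payload map $.modified_date)]"/>

        <!-- ... DataWeave transform and Salesforce insert go here ... -->

        <!-- Persist the watermark so the next run continues where this one stopped -->
        <os:store key="lastModifiedDate" objectStore="watermarkStore">
            <os:value>#[vars.newWatermark]</os:value>
        </os:store>
    </flow>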
An even simpler way (still processing records in steps) is to pick an ordering (time, for example), work on a subset in that order (say, one year at a time), and start the next step from the data that has already been transferred to the destination. This is actually the better option, because if the process fails you can resume it later based on what was already transferred. Such a pagination process can also be spread over time, across servers, and across multiple threads.
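For example, a keyset-style page query, where vars.lastId is the highest id already confirmed in the destination (table and column names are hypothetical):

    <!-- vars.lastId: highest id already present in Salesforce -->
    <db:select config-ref="Database_Config">
        <db:sql>
            SELECT * FROM source_table
            WHERE id &gt; :lastId
            ORDER BY id
            LIMIT 1000
        </db:sql>
        <db:input-parameters>#[{lastId: vars.lastId}]</db:input-parameters>
    </db:select>

Each page starts exactly after the last record that made it to Salesforce, so re-running the flow after a failure is safe.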
The best way to save memory is not to store data in variables but to keep it in the payload. By default Mule's payload is a stream, so it does not really matter how big the data is: it flows through the needle's eye of stream processing automatically. Try to avoid copying even small pieces of the stream into variables/memory; eventually that storage will overflow.
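A rough sketch of that shape, letting the streamed select flow straight into chunked Salesforce creates with no set-variable in between (the field mapping, the Account object, the batch size and the exact Salesforce connector attributes are assumptions and may differ in your connector version):

    <db:select config-ref="Database_Config">
        <db:sql>SELECT * FROM source_table</db:sql>
    </db:select>

    <!-- Work on the stream in chunks; never copy the whole result into a variable -->
    <foreach batchSize="200">
        <ee:transform>
            <ee:message>
                <ee:set-payload><![CDATA[%dw 2.0
output application/java
---
payload map (row) -> {
    // hypothetical field mapping
    Name: row.name
}]]></ee:set-payload>
            </ee:message>
        </ee:transform>
        <!-- create operation shown for illustration; check your connector's operation name/attributes -->
        <salesforce:create type="Account" config-ref="Salesforce_Config"/>
    </foreach>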
Upvotes: 2