Reputation: 1
Every night I need to get data from external http service and save it to Azure Data Lake. Actually, I need to get all the orders for all the customers. The problem is that there is now way to get this data via a single call. Id of a customer should be provided per each separate call.
The format of url is something like /api/ordersByCutomer/{cutomerId}
I need to get data for 100 000 different customers. It will result in 100 000 calls to the external service.
I tried to use Azure Data Factory with Foreach activity in parallel mode, but it takes 4 sec per each call there (3 seconds are spent in queue). The overall speed result was not satisfying.
What is the best (I mean the fastest) azure-based approach for this (except Azure Data Factory)?
Thanks
Upvotes: 0
Views: 97
Reputation: 31
You could write some asynchronous code to hit the API/http service parallelly and execute this code using custom activity in ADF which works by using batch account to get this job done. Use custom activities in an Azure Data Factory
Also, before doing any of this it would nice to contact the owner/stakeholder of the external http service and finding out if there is rate limiting on that service and even if the service can handle such loads.
Upvotes: 2