Reputation: 1496
I have a rather conceptual question about working with APIs that handle huge amounts of data.
I have two web applications running (System A and System B); both use a RESTful API to communicate with each other.
Now I need to send a huge amount of data (about 1 million rows) from A to B.
My idea is to select all the data from the database, i.e. build one main array with 1 million sub-arrays, then iterate over the main array and send each row as a POST request to System B.
Code Sample:
// Fetch every row at once ($sqlToGetAllData selects the whole table)
$data = $pdo->query($sqlToGetAllData);
$mainArray = $data->fetchAll(PDO::FETCH_ASSOC);

foreach ($mainArray as $row) {
    // create POST request and send $row to System B
}
Next I would need to check whether the row already exists in System B. If it does, I send an update request instead of a POST request. And if there is a row that exists in System B but no longer exists in System A, it should be removed from System B.
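For illustration, the per-row logic I have in mind looks roughly like this (the endpoint URL and the remoteRowExists() helper are just placeholders, not real code from my project):

foreach ($mainArray as $row) {
    // Hypothetical helper: ask System B (e.g. via GET) whether this row already exists
    $exists = remoteRowExists($row['id']);

    // POST to create, PUT to update -- the URL is a placeholder
    $url = 'https://system-b.example/api/rows' . ($exists ? '/' . $row['id'] : '');
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $exists ? 'PUT' : 'POST');
    curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($row));
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_exec($ch);
    curl_close($ch);
}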
My question is:
Is this the right way to go about it? Should I be thinking about concepts like parallel processing / multithreading?
Upvotes: 0
Views: 242
Reputation: 1280
You should use OFFSET and LIMIT in your DB query; storing the entire dataset in a PHP array is a really bad idea.
You can send the subsets in batches and have System B return ID lists (added/updated/removed) in the response.
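A rough sketch of that idea, assuming PDO on System A and a hypothetical /batch-sync endpoint on System B that answers with the IDs it added/updated/removed (table name, URL and response format are assumptions for illustration):

$batchSize = 1000;
$offset = 0;

do {
    // Fetch one page instead of the whole table
    $stmt = $pdo->prepare('SELECT * FROM my_table ORDER BY id LIMIT :limit OFFSET :offset');
    $stmt->bindValue(':limit', $batchSize, PDO::PARAM_INT);
    $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
    $stmt->execute();
    $batch = $stmt->fetchAll(PDO::FETCH_ASSOC);

    if ($batch) {
        // Send the whole batch in one request; System B replies with ID lists
        $ch = curl_init('https://system-b.example/api/batch-sync');
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($batch));
        curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // e.g. ['added' => [...], 'updated' => [...], 'removed' => [...]]
        $response = json_decode(curl_exec($ch), true);
        curl_close($ch);
    }

    $offset += $batchSize;
} while (count($batch) === $batchSize);

This keeps memory usage bounded on System A and reduces 1 million HTTP requests to roughly 1,000, which will matter far more than multithreading at this point.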
Upvotes: 1