Reputation: 4077
I have to export a site with hundreds of thousands of records through REST API calls.
All the record IDs I need to retrieve are stored in a MySQL database. A PHP script gets the next ID, makes the API call using curl, saves the data, marks the ID as complete, and then reloads the page to move on to the next record.
That's rather slow. Any ideas on how to speed it up?
Upvotes: 1
Views: 1581
Reputation: 13
// All URLs stored in an array
$url[] = 'http://www.link1.com.br';
$url[] = 'https://www.link2.com.br';
$url[] = 'https://www.link3.com.br';

// Set the same options on every URL and add each handle to the processing queue
$mh = curl_multi_init();
foreach ($url as $key => $value) {
    $ch[$key] = curl_init($value);
    curl_setopt($ch[$key], CURLOPT_NOBODY, true);          // HEAD request only, no body
    curl_setopt($ch[$key], CURLOPT_HEADER, true);
    curl_setopt($ch[$key], CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch[$key], CURLOPT_SSL_VERIFYPEER, false); // note: disables SSL certificate checks
    curl_setopt($ch[$key], CURLOPT_SSL_VERIFYHOST, false);
    curl_multi_add_handle($mh, $ch[$key]);
}

// Run all requests in parallel
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh); // wait for activity instead of busy-looping
} while ($running > 0);

// Collect the results and remove each handle from the queue
foreach (array_keys($ch) as $key) {
    echo curl_getinfo($ch[$key], CURLINFO_HTTP_CODE);
    echo curl_getinfo($ch[$key], CURLINFO_EFFECTIVE_URL);
    echo "\n";
    curl_multi_remove_handle($mh, $ch[$key]);
}

// Clean up
curl_multi_close($mh);
Upvotes: 0
Reputation: 163
Unfortunately, emulating thread-level parallelism (or really any kind of parallelism) is ridiculously annoying in PHP. Fortunately, for your particular use case you just need curl_multi_exec: http://php.net/manual/en/function.curl-multi-exec.php
It executes multiple curl handles in parallel (at least the I/O of retrieving the page contents). The example provided in the docs is decent, I think; let me know if you need any further help.
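Applied to your setup, the idea would be to run one long-lived CLI script that pulls pending IDs from MySQL in batches and fetches each batch concurrently, instead of one request per page reload. A rough sketch, where `buildUrl()`, the batch size, and the commented-out SQL are assumptions about your API and schema, not something from your question:

```php
<?php
// Hypothetical endpoint builder -- replace with your real API URL scheme.
function buildUrl(int $id): string {
    return "https://api.example.com/records/" . $id;
}

// Fetch a batch of record IDs concurrently via curl_multi and
// return an [id => response body] map.
function fetchBatch(array $ids): array {
    $mh = curl_multi_init();
    $handles = [];
    foreach ($ids as $id) {
        $ch = curl_init(buildUrl($id));
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$id] = $ch;
    }

    // Drive all transfers; curl_multi_select blocks until there is activity.
    do {
        curl_multi_exec($mh, $running);
        curl_multi_select($mh);
    } while ($running > 0);

    $results = [];
    foreach ($handles as $id => $ch) {
        $results[$id] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
    return $results;
}

// Main loop sketch: pull pending IDs in chunks, fetch, save, mark done.
// The query and table are placeholders for your actual schema, e.g.:
//   SELECT id FROM records WHERE done = 0 LIMIT 20
while ($ids = /* next batch of pending IDs from MySQL */ []) {
    $data = fetchBatch($ids);
    // ... save $data, then: UPDATE records SET done = 1 WHERE id IN (...)
}
```

A batch size of 10-20 is usually a sensible starting point; going much higher mostly risks tripping the remote API's rate limits rather than gaining speed. Dropping the page-reload-per-record pattern also removes the per-record overhead of bootstrapping PHP and reconnecting to MySQL.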
Upvotes: 3