Reputation: 289
I have roughly 2.6 million records that need to be updated externally via a PUT request for each one. This is a one-time task, so I have the following:
@hydra ||= Typhoeus::Hydra.hydra
million_records.each do |id|
  typhoeus_request = Typhoeus::Request.new(
    "http://localhost:300/posts/#{id}",
    headers: {'content-type' => 'application/json'},
    params: {field1: 'Hello World'},
    method: :put
  )
  @hydra.queue typhoeus_request
end
@hydra.run
I've read the documentation on parallel requests, which states:
Hydra will also handle how many requests you can make in parallel. Things will get flakey if you try to make too many requests at the same time. The built in limit is 200. When more requests than that are queued up, hydra will save them for later and start the requests as others are finished.
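For reference, that built-in limit of 200 can be tuned per hydra instance via the `max_concurrency` option documented in the Typhoeus README; a lower value like 50 here is just an illustrative choice, not a recommendation from the docs:

```ruby
require 'typhoeus'

# Override the default concurrency ceiling (200) for this hydra instance.
hydra = Typhoeus::Hydra.new(max_concurrency: 50)
```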
My question is: are there any performance flaws in the above? If so, how can I improve it to be more performant?
Alternatively, I could create a new hydra instance for each iteration, queue the request on it, push the hydra instance into an array, and then run them with the Parallel gem. For example:
batches = []
million_records.each do |id|
  hydra = Typhoeus::Hydra.new
  typhoeus_request = Typhoeus::Request.new(
    "http://localhost:300/posts/#{id}",
    params: {field1: 'Hello World'},
    headers: {'content-type' => 'application/json'},
    method: :put
  )
  hydra.queue typhoeus_request
  batches.push(hydra)
end
Parallel.each(batches, in_threads: 5) do |batch|
  batch.run
end
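One memory-related concern with either version is that all 2.6 million request objects (or hydras) get built up front. A common middle ground is to feed a single hydra in fixed-size slices with `each_slice`, so only one slice of request objects exists at a time. A minimal sketch of the slicing logic; the batch size and the `build_put_request` helper are hypothetical, standing in for the `Typhoeus::Request.new(...)` call above:

```ruby
# Sketch: drive the IDs through Hydra in fixed-size slices so that only
# one slice of Typhoeus::Request objects is ever held in memory.
def process_in_batches(ids, batch_size: 10_000)
  batches_run = 0
  ids.each_slice(batch_size) do |slice|
    # In the real script this body would be:
    #   hydra = Typhoeus::Hydra.new
    #   slice.each { |id| hydra.queue(build_put_request(id)) }
    #   hydra.run
    batches_run += 1
  end
  batches_run
end

# 25 IDs in slices of 10 run as 3 batches.
process_in_batches((1..25).to_a, batch_size: 10)
```

Hydra already runs each slice's requests concurrently, so this keeps the parallelism while bounding memory use.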
Upvotes: 4
Views: 552