Reputation: 2341
I am fairly new to Ruby multi-threading and am confused about how to get started. I am building an app that needs to fetch a LOT of images, so I want to do the fetching in a separate thread. I want the program to execute as shown in the code below.
PROBLEM: bar_method will finish fetching faster than items are added, so its thread will end while items keep being added to the queue, and those items will never be processed. Is there a way to synchronize the two so that the bar_method thread is alerted when a new item is added to the queue, and so that, if bar_method does finish early, it goes to sleep and waits for a new item?
def foo_method
  queue created - consists of url to fetch and a callback method
  synch = Mutex.new
  Thread.new do
    bar_method synch, queue
  end
  100000.times do
    synch.synchronize do
      queue << {url => img_url, method_callback => the_callback}
    end
  end
end
def bar_method synch_obj, queue
  synch_obj.synchronize do
    while queue isn't empty
      pop the queue. fetch image and call the callback
    end
  end
end
Upvotes: 3
Views: 87
Reputation: 160571
If you need to retrieve files from the internet using parallel requests, I'd highly recommend Typhoeus and Hydra.
From the documentation:
hydra = Typhoeus::Hydra.new
10.times.map{ hydra.queue(Typhoeus::Request.new("www.example.com", followlocation: true)) }
hydra.run
You can set the number of concurrent connections in Hydra:
:max_concurrency (Integer) — Number of max concurrent connections to create. Default is 200.
As a second recommendation look into Curb. Again, from its documentation:
# make multiple GET requests
easy_options = {:follow_location => true}
multi_options = {:pipeline => true}
Curl::Multi.get('url1','url2','url3','url4','url5', easy_options, multi_options) do |easy|
# do something interesting with the easy response
puts easy.last_effective_url
end
Both are built on top of libcurl, so there's no real difference in their underlying technology or its robustness. The difference is in the commands available to you.
Another gem that gets a lot of attention is EventMachine. It has EM-HTTP-Request which allows concurrent requests:
EventMachine.run {
  http1 = EventMachine::HttpRequest.new('http://google.com/').get
  http2 = EventMachine::HttpRequest.new('http://yahoo.com/').get
  http1.callback { }
  http2.callback { }
}
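If you'd rather stay with plain threads, the "sleep until a new item arrives" behavior asked about in the question is what Ruby's built-in Queue provides: Queue#pop blocks while the queue is empty, and the queue is thread-safe, so no separate Mutex is needed. Below is a minimal sketch of that idea, not a drop-in implementation. fetch_image is a hypothetical stand-in for the real download, the :done sentinel is just one possible way to stop the worker, and the job hash keys (:url, :callback) are assumptions for illustration.

```ruby
require 'thread' # harmless on modern Rubies, where Queue is built in

# Hypothetical stand-in for the real image download.
def fetch_image(url)
  "image data for #{url}"
end

def bar_method(queue, results)
  # Queue#pop blocks while the queue is empty, so the worker sleeps
  # until a job arrives. The :done sentinel (an assumption of this
  # sketch) tells the worker to stop.
  while (job = queue.pop) != :done
    results << job[:callback].call(fetch_image(job[:url]))
  end
end

def foo_method(urls)
  queue   = Queue.new # jobs for the worker; thread-safe
  results = Queue.new # callback return values; also thread-safe
  worker  = Thread.new { bar_method(queue, results) }

  urls.each do |url|
    queue << { url: url, callback: ->(data) { data.upcase } }
  end

  queue << :done # no more work
  worker.join    # wait for the worker to drain the queue
  Array.new(results.size) { results.pop }
end
```

Calling foo_method(%w[a.png b.png]) runs both callbacks on the worker thread and returns their results in submission order, since there is a single worker and the queue is FIFO. To fetch several images at once, you could start more worker threads on the same queue and push one :done per worker.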
Upvotes: 2