Rojj

Reputation: 1210

Ruby arrays and Threads

I am populating an array with the responses from API requests. To speed up the process I am using threads. I know they are not truly parallel, but they do make the overall process faster.

To maintain the order, I first define an array of empty slots and then populate each slot by index. This is the simplified code:

instances = [array of instances]
number_of_instances = instances.size
case_run_status["machine_statuses"] = Array.new(number_of_instances){{}}

threads = []
instances.each_with_index do |instance, i|
  threads << Thread.new do
    machine_status = {}
    machine_status["ip"] = instance.public_ip_address

    uri = "request...."

    response = HTTParty.get(uri)
    status = JSON.parse(response.body)

    machine_status["running"] = status['running']
    machine_status["running_node"] = status['running_node']

    case_run_status["machine_statuses"][i] = machine_status
  end
end
threads.each{|thr| thr.join }

From what I understand, this should be thread-safe. Is that correct? The problem I am having is that, seemingly at random, machine_status["running"] and machine_status["running_node"] get mixed up: the value of status['running'] ends up in machine_status["running_node"].

If I remove the Threads and execute the code serially everything works as expected.

Question: Is this the right way to safely populate an array from multiple threads?

Upvotes: 2

Views: 1395

Answers (2)

Shivansh Gaur

Reputation: 938

None of the core data structures in Ruby (except for Queue) are thread-safe. The structures are mutable, and when they are shared between threads, there is no guarantee that the threads won't overwrite each other's changes.
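One way to avoid sharing a mutable structure at all is to let each thread return its own result and collect the results afterwards. The following is only a minimal sketch under stated assumptions: fetch_status and the sample IP list are hypothetical stand-ins for the HTTP request and the instances in the question, not the original code.

```ruby
# Hypothetical stand-in for the HTTP request in the question.
def fetch_status(ip)
  sleep(rand * 0.01) # simulate variable network latency
  { "running" => true, "running_node" => "node-#{ip}" }
end

ips = %w[10.0.0.1 10.0.0.2 10.0.0.3] # assumed sample data

# Each thread builds and returns its own hash; no array or hash
# is written to by more than one thread.
threads = ips.map do |ip|
  Thread.new(ip) do |addr|
    status = fetch_status(addr)
    {
      "ip" => addr,
      "running" => status["running"],
      "running_node" => status["running_node"]
    }
  end
end

# Thread#value joins the thread and returns the value of its block,
# so the results line up with the order of ips.
machine_statuses = threads.map(&:value)
```

Because the threads array is built in input order and Thread#value waits for each thread in turn, the order of machine_statuses follows the input regardless of which request finishes first.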

Upvotes: 0

cavin kwon

Reputation: 501

I recommend concurrent-ruby.

  • Install: gem install concurrent-ruby
  • Sample code:
require 'concurrent'

def api_call(url)
  sleep 1
  # call api
  puts url
  url
end

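def async_call(urls)
  # Start one future per URL; the futures run concurrently
  jobs = urls.map do |url|
    Concurrent::Promises.future { api_call(url) }
  end

  before = Time.now
  # zip waits for all futures; value returns the results in the order of jobs
  p Concurrent::Promises.zip(*jobs).value
  puts Time.now - before
end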

In the following call, the URLs are processed asynchronously, so the individual calls finish in an arbitrary order. The result array is nevertheless returned in the same order as the input array.

urls = %w(a b c d e)
async_call(urls)

c 
d 
b 
e 
a 
["a", "b", "c", "d", "e"]
1.0021356

Upvotes: 1
