Michael Bishop

Reputation: 4420

Is Ruby Array#[]= threadsafe for a preallocated array? Can this be made lockless?

I've written some Ruby code that processes the items of an array via a thread pool. As part of this, I preallocate a results array that is the same size as the passed-in array. Inside the thread pool, I assign items into the preallocated array, and the indexes of those items are guaranteed to be unique. With that in mind, do I need to surround the assignment with a Mutex#synchronize?

Example:

SIZE = 1_000_000_000

def collect_via_threadpool(items, pool_count = 10)
  length = items.length
  processed_items = Array.new(length, nil) # preallocated, one slot per input item
  index = -1
  mutex = Mutex.new # guards only the shared index counter

  # Cap the pool at 50 threads, and never start more threads than items.
  [pool_count, length, 50].min.times.collect do
    Thread.start do
      # Each thread claims the next unclaimed index under the mutex.
      while (i = mutex.synchronize { index += 1 }) < length
        processed_items[i] = yield(items[i])
        # ^ do I need to synchronize around this? `processed_items` is preallocated
      end
    end
  end.each(&:join)
  processed_items
end

items = collect_via_threadpool(SIZE.times.to_a, 100) do |item|
  item.to_s
end

raise unless items.size == SIZE

items.each_with_index do |item, index|
  raise unless item.to_i == index
end

puts 'success'

(This test code takes a long time to run, but appears to print 'success' every time.)

It seems like I would want to surround the Array#[]= with Mutex#synchronize just to be safe, but my question is:

Within Ruby's specification, is this code defined as safe?

Upvotes: 4

Views: 457

Answers (1)

Max

Reputation: 22335

Nothing in Ruby is specified to be thread safe other than Mutex (and thus anything derived from it). If you want to know if your specific code is thread safe, you'll need to look at how your implementation handles threads and arrays.

For MRI, calling Array.new(n, nil) does actually allocate memory for the entire array, so if your threads are guaranteed to not share indices your code will work. It's as safe as having multiple threads operate on distinct variables without a mutex.

However, on other implementations, Array.new(n, nil) might not allocate the whole array up front, so assigning to indices later could involve reallocations and memory copies, which could break catastrophically when several threads write at once.
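If portability matters more than shaving off a lock, one option is to guard only the store itself. The following is a minimal sketch, not a definitive implementation: the method name `collect_with_guarded_writes` and the second mutex `results_lock` are my own illustrative choices and do not appear in the question.

```ruby
# Hypothetical variant of the question's method: the write into the shared
# results array is wrapped in its own mutex, so it does not depend on how
# a particular Ruby implementation lays out arrays in memory.
def collect_with_guarded_writes(items, pool_count = 10)
  results = Array.new(items.length)
  next_index = -1
  index_lock = Mutex.new   # guards the shared index counter
  results_lock = Mutex.new # guards writes into `results` (assumed name)

  worker_count = [pool_count, items.length].min
  Array.new(worker_count) do
    Thread.new do
      while (i = index_lock.synchronize { next_index += 1 }) < items.length
        value = yield(items[i]) # the slow work runs outside any lock
        results_lock.synchronize { results[i] = value } # only the store is serialized
      end
    end
  end.each(&:join)
  results
end
```

Because the block is evaluated before the lock is taken, threads only contend for the brief moment of the assignment.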

So while your code may work (in MRI at least), don't rely on it. While we're on the topic, Ruby's threads aren't even specified to actually run in parallel; in MRI, the global VM lock lets only one thread execute Ruby code at a time. So if you're trying to avoid mutexes because you think you might see some performance boost, maybe you should rethink your approach.
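If the goal is to avoid an explicit mutex around the array, another approach is to avoid sharing the results array between threads at all. Here is a rough sketch under a few assumptions: the method name `collect_without_shared_writes` is hypothetical, work is handed out through `Thread::Queue` (which Ruby documents as safe for exchanging data between threads), and `Queue#close` needs Ruby 2.3 or later.

```ruby
# Hypothetical alternative: each worker fills a private array and the main
# thread merges the pieces after joining, so no array is written to by
# more than one thread.
def collect_without_shared_writes(items, pool_count = 10)
  work = Thread::Queue.new
  items.each_with_index { |item, i| work << [item, i] }
  work.close # once closed and drained, `pop` returns nil instead of blocking

  worker_count = [[pool_count, items.length].min, 1].max
  threads = Array.new(worker_count) do
    Thread.new do
      local = [] # visible only to this thread
      while (pair = work.pop)
        item, i = pair
        local << [i, yield(item)]
      end
      local # becomes the thread's value
    end
  end

  results = Array.new(items.length)
  # Thread#value joins the thread and returns what its block returned.
  threads.each do |thread|
    thread.value.each { |i, value| results[i] = value }
  end
  results
end
```

The final merge runs on the calling thread only, so the write to `results[i]` never races. The cost is the extra `[index, value]` pairs held in memory until the merge.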

Upvotes: 1
