Reputation: 4420
I've written some code in Ruby to process items in an array via a thread pool. In the process, I've preallocated a results array that is the same size as the passed-in array. Within the thread pool, I'm assigning items in the preallocated array, and the indexes of those items are guaranteed to be unique. With that in mind, do I need to surround the assignment with a Mutex#synchronize?
Example:
SIZE = 1000000000

def collect_via_threadpool(items, pool_count = 10)
  processed_items = Array.new(items.count, nil)
  index = -1
  length = items.length
  mutex = Mutex.new
  items_mutex = Mutex.new
  [pool_count, length, 50].min.times.collect do
    Thread.start do
      while (i = mutex.synchronize{index = index + 1}) < length do
        processed_items[i] = yield(items[i])
        # ^ do I need to synchronize around this? `processed_items` is preallocated
      end
    end
  end.each(&:join)
  processed_items
end

items = collect_via_threadpool(SIZE.times.to_a, 100) do |item|
  item.to_s
end

raise unless items.size == SIZE
items.each_with_index do |item, index|
  raise unless item.to_i == index
end
puts 'success'
(This test code takes a long time to run, but appears to print 'success' every time.)
It seems like I would want to surround the Array#[]= with Mutex#synchronize just to be safe, but my question is:
Within Ruby's specification, is this code defined as safe?
Upvotes: 4
Views: 457
Reputation: 22335
Nothing in Ruby is specified to be thread safe other than Mutex (and thus anything derived from it). If you want to know if your specific code is thread safe, you'll need to look at how your implementation handles threads and arrays.
For MRI, calling Array.new(n, nil) does actually allocate memory for the entire array, so if your threads are guaranteed to not share indices, your code will work. It's as safe as having multiple threads operate on distinct variables without a mutex.
However, for other implementations, Array.new(n, nil) might not allocate the whole array up front, and assigning to indices later could involve reallocations and memory copies, which could break catastrophically if two threads write at the same time.
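If you want something that doesn't depend on how a particular implementation lays out arrays, one conservative option is to run the block outside any lock and synchronize only the store into the shared array. This is a minimal sketch, not a drop-in replacement: collect_with_guarded_store, results_mutex, and the other names are made up for illustration.

```ruby
# Hypothetical variant of the question's method: the block runs unlocked,
# and only the write into the shared results array is synchronized.
def collect_with_guarded_store(items, pool_count = 10)
  results = Array.new(items.length)
  next_index = -1
  index_mutex = Mutex.new    # guards the shared work counter
  results_mutex = Mutex.new  # guards writes to the results array

  workers = [pool_count, items.length].min.times.map do
    Thread.new do
      while (i = index_mutex.synchronize { next_index += 1 }) < items.length
        value = yield(items[i])                          # the real work, no lock held
        results_mutex.synchronize { results[i] = value } # short critical section
      end
    end
  end
  workers.each(&:join)
  results
end

strings = collect_with_guarded_store((0...1_000).to_a, 8) { |n| n.to_s }
```

The lock is held only for a single assignment, so the extra cost should be small compared with whatever the block does.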
So while your code may work (in MRI at least), don't rely on it. While we're on the topic, Ruby's threads aren't even specified to actually run in parallel. So if you're trying to avoid mutexes because you think you might see some performance boost, maybe you should rethink your approach.
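One way to rethink it is to avoid sharing the results array at all: give each thread its own slice of the input, let it build a private array, and combine the pieces in the main thread with Thread#value (which joins the thread and returns its block's result). A rough sketch, assuming pool_count is positive; collect_via_slices is a hypothetical name, and the static slicing here replaces the shared counter from the question.

```ruby
# Hypothetical helper: every worker maps over its own slice and returns a
# private array, so no mutable state is shared between threads.
def collect_via_slices(items, pool_count = 10, &block)
  return [] if items.empty?

  # Round up so that at most pool_count slices are created.
  slice_size = (items.length.to_f / pool_count).ceil

  threads = items.each_slice(slice_size).map do |slice|
    Thread.new { slice.map(&block) }
  end

  # Thread#value waits for each thread, so the results come back in input order.
  threads.flat_map(&:value)
end

squares = collect_via_slices((1..10).to_a, 3) { |n| n * n }
```

The trade-off is that work is divided up front, so one slow slice can leave the other threads idle, whereas the shared counter hands out items as threads become free.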
Upvotes: 1