Benedikt B

Reputation: 753

Ruby MRI 1.8.7 - File writing thread safety

It seems to me that file writing in Ruby MRI 1.8.7 is completely thread safe.

Example 1 - Flawless Results:

File.open("test.txt", "a") { |f|
  threads = []
  1_000_000.times do |n|
    threads << Thread.new do
      f << "#{n}content\n"
    end
  end
  threads.each { |t| t.join }
}

Example 2 - Flawless Results (but slower):

threads = []
100_000.times do |n|
  threads << Thread.new do
    File.open("test2.txt", "a") { |f|
      f << "#{n}content\n"
    }
  end
end
threads.each { |t| t.join }

So I couldn't construct a scenario in which I run into concurrency problems. Can you?

I would appreciate it if somebody could explain to me why I should still use a Mutex here.

EDIT: Here is another, more complicated example that works perfectly fine and doesn't show concurrency problems:

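# Arbitrary CPU work: Base64-encode the base-36 form of n and repeat it 100 times.
# String#to_a exists only in Ruby 1.8; on 1.9+ write [n.to_s(36)].pack("m") instead.
def complicated(n)
  n.to_s(36).to_a.pack("m").strip * 100
end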

items = (1..100_000).to_a

threads = []
10_000.times do |thread|
  threads << Thread.new do
    while item = items.pop

      sleep(rand(100) / 1000.0)
      File.open("test3.txt", "a") { |f|
        f << "#{item} --- #{complicated(item)}\n"
      }

    end
  end
end
threads.each { |t| t.join }

Upvotes: 4

Views: 2592

Answers (1)

kaspernj

Reputation: 1253

I was not able to produce an error either.

You are probably being saved by a file lock here. If you want multiple threads to write to the same file, they should all use the same file object, like so:

File.open("test.txt", "a") do |fp|
  threads = []
  
  500.times do |time|
    threads << Thread.new do
      fp.puts("#{time}: 1")
      sleep(rand(100) / 100.0)
      fp.puts("#{time}: 2")
    end
  end
  
  threads.each(&:join)
end

The GIL (or, on MRI 1.8.7, the interpreter's green threads, which never run in parallel) will probably save you from any real thread bugs in this example, but I am not sure what would happen under JRuby, which uses real threads, so two writes might occur at exactly the same time. The same goes for other Ruby engines with real threading.
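If you want to test a particular engine rather than guess, it helps to check the output file mechanically instead of eyeballing it. The following is only a rough sketch: `check_output` is a made-up helper name, and it assumes the "#{n}content" line format from the first example in the question. It reports numbers that never made it into the file and lines that do not match the expected format. It does not detect duplicated lines.

```ruby
require "tempfile"

# Hypothetical checker: returns [missing, malformed] for a file that should
# contain one "<n>content" line for each n in 0...count.
def check_output(path, count)
  lines = File.readlines(path).map { |l| l.chomp }
  # Lines that are not exactly "<digits>content" were cut or merged.
  malformed = lines.reject { |l| l =~ /\A\d+content\z/ }
  seen = (lines - malformed).map { |l| l.to_i }
  missing = (0...count).to_a - seen
  [missing, malformed]
end

# Demo: many threads share one file object, as in the example above.
tmp = Tempfile.new("check_demo")
File.open(tmp.path, "a") do |f|
  threads = (0...200).map { |n| Thread.new { f << "#{n}content\n" } }
  threads.each { |t| t.join }
end

missing, malformed = check_output(tmp.path, 200)
puts "missing: #{missing.size}, malformed: #{malformed.size}"
```

A clean report on one engine says nothing about another, so the check would have to be repeated on every engine you care about.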

Whether you should protect your code with locks comes down to this: do you want to rely on the Ruby engine you happen to be using to save you, or do you want a solution that "should" work on all Ruby engines, regardless of whether they have built-in protection against concurrency problems?

Another question is whether your operating system and/or file system is saving you from thread bugs with file locks, and whether your code should be independent of the operating system and file system, meaning that you don't depend on file-system locks to ensure that your file opens and writes are properly synchronized.
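If every thread opens the file on its own, as in the second example in the question, one way to avoid depending on implicit behaviour is to take the lock explicitly with File#flock. This is a sketch, not a definitive solution: `locked_append` is a made-up helper name, flock is only advisory (it protects you only against other code that also calls flock), and it is not available on every platform or file system.

```ruby
require "tempfile"

# Hypothetical helper: append one line while holding an exclusive advisory lock.
def locked_append(path, line)
  File.open(path, "a") do |f|
    f.flock(File::LOCK_EX)   # waits until no other handle holds the lock
    f.write(line)
    f.flush                  # push the data out before releasing the lock
    f.flock(File::LOCK_UN)
  end
end

# Demo: each thread opens the file itself and appends one line.
tmp = Tempfile.new("flock_demo")
threads = (0...20).map do |n|
  Thread.new { locked_append(tmp.path, "#{n}content\n") }
end
threads.each { |t| t.join }

written = File.readlines(tmp.path)
puts "lines written: #{written.size}"
```

Because the lock lives on the file and not inside the Ruby process, the same approach also covers several processes appending to one file, as long as all of them use flock.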

I will go out on a limb and say that it seems like good practice to also implement locks on your side, so that your code keeps working regardless of which Ruby engine, operating system or file system someone else runs it on, even though most modern Ruby engines, operating systems and file systems have these features built in.
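A minimal sketch of what such a lock could look like when all threads share one file object. The helper name `write_records` is invented for the illustration. The point is that a Mutex does something no engine gives you for free: it keeps a record that consists of several writes together, whereas in my example above the "1" and "2" lines of different threads end up mixed.

```ruby
require "thread"    # provides Mutex on Ruby 1.8; harmless on newer versions
require "tempfile"

# Hypothetical helper: every thread writes a two-line record under one lock.
def write_records(path, thread_count)
  lock = Mutex.new
  File.open(path, "a") do |fp|
    threads = (0...thread_count).map do |n|
      Thread.new do
        # Only one thread at a time may be inside this block, so the two
        # lines of a record always end up next to each other.
        lock.synchronize do
          fp.puts("#{n}: 1")
          fp.puts("#{n}: 2")
        end
      end
    end
    threads.each { |t| t.join }
  end
end

tmp = Tempfile.new("mutex_demo")
write_records(tmp.path, 50)
lines = File.readlines(tmp.path)
puts "lines written: #{lines.size}"
```

The order of the records is still up to the scheduler. The lock only guarantees that each record is written as one unit.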

Upvotes: 3
