bl0b

Reputation: 936

Ruby multithreading: what am I doing wrong?

To improve the speed of our app, I'm experimenting with multithreading in our Rails app. Here is the code:

require 'thwait'
require 'benchmark'

city = Location.find_by_slug("orange-county", :select => "city, state, lat, lng", :limit => 1)
filters = ContractorSearchConditions.new()
image_filter = ImageSearchConditions.new()
filters.lat = city.lat
filters.lon = city.lng
filters.mile_radius = 20
filters.page_size = 15
filters.page = 1
image_filter.page_size = 5
sponsored_filter = filters.dup
sponsored_filter.has_advertised = true
sponsored_filter.page_size = 50
Benchmark.bm do |b|
  b.report('with') do
    1.times do
      cities = Thread.new{ 
        Location.where("lat between ? and ? and lng between ? and ?", city.lat-0.5, city.lat+0.5, city.lng-0.5, city.lng+0.5)
      }
      images  = Thread.new{ 
        Image.search(image_filter)[:hits]
      }
      sponsored_results_extended = Thread.new{ 
        sponsored_filter.mile_radius = 50
        @sponsored_results = Contractor.search( sponsored_filter )
      }
      results = Thread.new{
        Contractor.search( filters )
      }
      ThreadsWait.all_waits(cities, images, sponsored_results_extended, results)
      @cities = cities.value
      @images = images.value
      @sponsored_results = sponsored_results_extended.value
      @results = results.value
    end
  end
  b.report('without') do
    1.times do
      @cities = Location.where("lat between ? and ? and lng between ? and ?", city.lat-0.5, city.lat+0.5, city.lng-0.5, city.lng+0.5)
      @images = Image.search(image_filter)[:hits]
      @sponsored_results = Contractor.search( sponsored_filter )
      @results = Contractor.search( filters )
    end
  end
end

Class.search (Image.search, Contractor.search) runs a search on our Elasticsearch servers (3 servers behind a load balancer), whereas the ActiveRecord queries run against our RDS instance.

(Everything is in the same datacenter.)

Here is the output on our dev server:

Bob@dev-web01:/usr/local/dev/buildzoom/rails$ script/rails runner script/thread_bm.rb -e development
       user     system      total        real
with  0.100000   0.010000   0.110000 (  0.342238)
without  0.020000   0.000000   0.020000 (  0.164624)

Note: I have very limited, if any, knowledge of threads, mutexes, the GIL, etc.

Upvotes: 0

Views: 302

Answers (2)

davogones

Reputation: 7399

Even though you are using threads, and hence performing the query IO in parallel, you still need to deserialize whatever results come back from your queries. That work uses the CPU. MRI Ruby 2.0.0 has a global interpreter lock (GIL), which means only one thread can execute Ruby code at a time: Ruby code does not run in parallel and effectively uses a single CPU core. The lock is released while a thread waits on IO, which is why the queries themselves can overlap. To deserialize all your results, though, the interpreter has to switch between the different threads many times, and that is a lot more overhead than deserializing each result set sequentially.

If your wall time is dominated by waiting for a response from your queries, and they don't all come back at the same time, then there might be an advantage to parallelizing with threads. But it's hard to predict that.
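To see the case where threads do pay off, here is a minimal sketch that replaces the real queries with a hypothetical fake_query method that only sleeps, standing in for time spent waiting on a remote server. The method names and the 50ms latency are assumptions for illustration, not taken from the question's code. Because MRI releases the interpreter lock during sleep, just as it does during blocking IO, the four waits can overlap:

```ruby
require 'benchmark'

# Hypothetical stand-in for a remote query: it only waits, the way a
# network round trip would, and does no CPU work.
def fake_query(latency)
  sleep(latency)
  :done
end

# Run the fake queries one after another and return the wall-clock time.
def time_sequential(latencies)
  Benchmark.realtime { latencies.each { |l| fake_query(l) } }
end

# Run each fake query in its own thread, wait for all of them, and
# return the wall-clock time.
def time_threaded(latencies)
  Benchmark.realtime do
    latencies.map { |l| Thread.new { fake_query(l) } }.each(&:join)
  end
end

latencies = [0.05] * 4
puts format('sequential: %.3fs', time_sequential(latencies))
puts format('threaded:   %.3fs', time_threaded(latencies))
```

The threaded run should finish in roughly the time of the slowest single wait, while the sequential run takes roughly the sum. Real queries add deserialization work on top of the wait, which is the part the interpreter lock serializes.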

You could try using JRuby or Rubinius. These will both utilize multiple cores, and hence can actually speed up your code as expected.

Upvotes: 1

Wizard of Ogz

Reputation: 12643

There is a lot more overhead in the "with" block than in the "without" block due to thread creation and management. Using threads helps the most when the code is IO-bound, and it appears that is NOT the case here. Four searches complete in 20ms (the "without" block), which implies that in parallel those searches should take less than that amount of time. The "with" block takes 100ms to execute, so we can deduce that at least 80ms of that time is not spent in searches. Try benchmarking with longer queries to see how the results differ.
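One way to check how much of that gap comes from thread creation itself is to time threads that do nothing. This is only a rough sketch: thread_overhead is a hypothetical helper, and it measures bare Thread.new plus join, not any per-thread cost that Rails or the search client may add on top.

```ruby
require 'benchmark'

# Average wall-clock cost of creating and joining `count` threads that
# do no work, measured over `rounds` repetitions.
def thread_overhead(count, rounds = 100)
  total = Benchmark.realtime do
    rounds.times do
      Array.new(count) { Thread.new {} }.each(&:join)
    end
  end
  total / rounds
end

puts format('4 empty threads: %.6fs per round', thread_overhead(4))
```

If the number printed here is far smaller than the difference between the two benchmark rows, the extra time is being spent inside the threaded work rather than in starting the threads.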

Note that I've assumed that all searches have the same latency and always perform the same, which may or may not be true. It is possible that the "without" block benefits from some sort of query caching, since it runs after the "with" block. Do the results differ when you swap the order of the benchmarks? Also, I'm ignoring the overhead of the iteration (1.times). You should remove it unless you change the number of iterations to something greater than 1.
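A sketch of how the benchmark could be restructured along those lines, using Benchmark.bmbm from the standard library, which runs every report once as a rehearsal before the measured pass so that neither block is the only one hitting a cold cache. The two run_* methods and the ITERATIONS value are hypothetical placeholders (they only sleep); in the real script they would contain the threaded and sequential search code from the question.

```ruby
require 'benchmark'

ITERATIONS = 5 # assumed value; raise it for more stable numbers

# Placeholder for the threaded code path (hypothetical: sleeps instead
# of searching).
def run_with_threads
  Array.new(4) { Thread.new { sleep(0.01) } }.each(&:join)
end

# Placeholder for the sequential code path (hypothetical as well).
def run_without_threads
  4.times { sleep(0.01) }
end

# The order is swapped relative to the original script, and each report
# repeats its work ITERATIONS times.
Benchmark.bmbm do |b|
  b.report('without') { ITERATIONS.times { run_without_threads } }
  b.report('with')    { ITERATIONS.times { run_with_threads } }
end
```

The rehearsal pass does not rule out caching on the Elasticsearch or database side, but it does give both blocks the same warm-up, so a remaining difference is less likely to be an ordering effect.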

Upvotes: 1
