peter

Reputation: 161

Prevent Rails from caching results of ActiveRecord query

I have a rake task that needs to iterate through a large number of records (called Merchants) which each have a large number of associated items. My problem is that due to Rails automatically caching the results of my DB queries, I end up putting my workers into swap space before very long.

In short, I'm wondering how to run a command like:

Merchant.all.each { |m| items = m.items }

without caching the value of 'items' each time through.

I've tried:

Merchant.all.each do |m|
  ActiveRecord::Base.connection.uncached do
    items = m.items
  end
end

and I've also tried adding this to my Merchant model:

def items_uncached
  self.class.uncached { items }
end

and then calling items_uncached instead, but I still end up racking up the memory usage with each new set of items I access.

I'm running Rails 2.3.10 and Ruby 1.9.2, with MySQL for storage.

Thanks in advance for your thoughts!

*** edit:

Here's the actual bit of code I'm working on:

File.open(output, "w") do |f|
  Merchant.all.each do |m|
    items = m.items
    invalid_image_count = 0
    items.each do |i|
      invalid_image_count += 1 unless i.image_valid?
    end
    invalid_categories = items.select { |i| !i.categories_valid? }.count
    f.puts "#{m.name} (#{m.id}): #{invalid_image_count} invalid images, " +
            "#{invalid_categories} invalid categories"
  end
end

Trying to do some error checking and then logging the results.

Upvotes: 4

Views: 7120

Answers (2)

Casper

Reputation: 34328

The query cache is not the main problem here. Rails "caches" your objects anyway.

The query cache is simply a hash lookup that prevents Rails from hitting the DB unnecessarily; it does not control how Ruby (or Rails) stores the objects returned by associations.

For example try this (even if uncached):

m = Merchant.first # <- m is loaded from DB
m.items            # <- items are loaded from DB and STORED(!) in m
m.items            # <- items are returned from the association stored in m
m.items.reload     # <- hits the DB (or the query cache)
m.instance_variable_get("@items") # <- returns the actual stored items

So when you call m.items in your each loop, you populate every Merchant instance with all of its items, and the garbage collector cannot free any of them: every merchant is still referenced from the array returned by Merchant.all for as long as you are inside the loop.
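To see the retention effect without a database, here is a minimal plain-Ruby sketch. FakeMerchant, fetch_items and items_retained? are hypothetical stand-ins invented for this illustration; they only imitate the way an association reader keeps its result on the owning object, and are not ActiveRecord code.

```ruby
# Plain-Ruby stand-in for a model with a has_many association.
# FakeMerchant is hypothetical and exists only for this illustration.
class FakeMerchant
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Imitates an association reader: the first call loads the rows and
  # keeps them on the object for as long as the object is alive.
  def items
    @items ||= load_items
  end

  # Imitates a direct finder call: returns rows without storing them.
  def fetch_items
    load_items
  end

  # True when this object is holding on to a loaded item array.
  def items_retained?
    instance_variable_defined?(:@items)
  end

  private

  # Pretend query result: 1000 rows per merchant.
  def load_items
    Array.new(1000) { |n| "item-#{id}-#{n}" }
  end
end

# Case 1: association-style access. Every merchant in the array keeps
# its items, so nothing can be collected while the array is alive.
via_association = Array.new(3) { |n| FakeMerchant.new(n + 1) }
via_association.each { |m| m.items.size }
retained_after_association = via_association.count { |m| m.items_retained? }

# Case 2: finder-style access. The rows live only in the block, so they
# become collectable as soon as each iteration ends.
via_finder = Array.new(3) { |n| FakeMerchant.new(n + 1) }
via_finder.each { |m| m.fetch_items.size }
retained_after_finder = via_finder.count { |m| m.items_retained? }

puts "retained after association access: #{retained_after_association}"
puts "retained after finder access: #{retained_after_finder}"
```

In the first case all three merchants still hold their item arrays after the loop; in the second case none do, which is the difference that matters for memory in a long-running rake task.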

So the solution is to do as Victor proposes: fetching the items through a direct finder call keeps them out of the merchant's "association storage".

Upvotes: 4

Victor Moroz

Reputation: 9225

If your association is a simple has_many one you can try this:

Merchant.all.each do |m| 
  items = Item.find_all_by_merchant_id(m.id) 
  ...
end 

Or even:

Merchant.find(:all, :select => "id, name").each do |m| 
  items = Item.find_all_by_merchant_id(m.id) 
  ... 
end
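As a rough sketch of how this could slot into the report loop from the question: the Merchant and Item classes below are in-memory stand-ins (hypothetical, with made-up sample rows) so the example runs without Rails or a database, and write_report is a helper name invented here. In the real task, Item.find_all_by_merchant_id would be the ActiveRecord dynamic finder used above.

```ruby
require "stringio"

# In-memory stand-ins so the sketch runs without Rails. In the real rake
# task these would be the ActiveRecord models from the question.
Merchant = Struct.new(:id, :name)

class Item
  attr_reader :merchant_id

  def initialize(merchant_id, image_ok, categories_ok)
    @merchant_id = merchant_id
    @image_ok = image_ok
    @categories_ok = categories_ok
  end

  def image_valid?
    @image_ok
  end

  def categories_valid?
    @categories_ok
  end

  # Made-up sample rows standing in for the items table.
  ROWS = [
    Item.new(1, true, false),
    Item.new(1, false, false),
    Item.new(2, true, true)
  ]

  # Stand-in for the dynamic finder of the same name.
  def self.find_all_by_merchant_id(id)
    ROWS.select { |item| item.merchant_id == id }
  end
end

# write_report is a hypothetical helper name used for this sketch.
def write_report(io, merchants)
  merchants.each do |m|
    # items is only a local variable: nothing is stored on m, so this
    # merchant's items can be collected once the iteration ends.
    items = Item.find_all_by_merchant_id(m.id)
    invalid_images = items.count { |i| !i.image_valid? }
    invalid_categories = items.count { |i| !i.categories_valid? }
    io.puts "#{m.name} (#{m.id}): #{invalid_images} invalid images, " +
            "#{invalid_categories} invalid categories"
  end
end

out = StringIO.new
write_report(out, [Merchant.new(1, "Acme"), Merchant.new(2, "Bolt")])
report = out.string
puts report
```

In the actual task you would pass the open File in place of the StringIO; the loop body is otherwise the same as in the question, with the association call swapped for the finder.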

Upvotes: 3
