Jeriko

Reputation: 6637

Rails: Working with a large set

I have some rake scripts that operate on collections of hundreds of thousands of items.

Often, my server runs out of memory and the script crashes. I assume that this is because my code looks like this:

Asset.where(:archived => false).each { |asset| asset.action! }

As far as I can tell, Rails fetches the entire set into memory, and then iterates through each instance.

My server doesn't seem to be happy loading 300,000 instances of Asset at once, so in order to reduce the memory requirements I've had to resort to something like this:

collection = Asset.where(:archived => false) # ActiveRecord::Relation
while collection.count > 0
  collection.limit(1000).each { |asset| asset.action! }
end

Unfortunately, that doesn't seem very clean. It gets even worse when the action doesn't remove items from the set, and I have to track offsets as well. Does anyone have a better way of partitioning the data, or of holding onto the relation and loading rows only as necessary?
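The offset-tracking variant alluded to above can be sketched in plain Ruby, with array slicing standing in for the SQL OFFSET/LIMIT a relation would issue (the names and in-memory array are illustrative, not ActiveRecord):

```ruby
# Sketch: manual offset-based batching over an in-memory collection.
# In Rails this would be collection.offset(offset).limit(batch_size).
items = (1..10).to_a
batch_size = 3
offset = 0
processed = []

loop do
  batch = items[offset, batch_size]          # stand-in for OFFSET/LIMIT
  break if batch.nil? || batch.empty?        # no rows left
  batch.each { |item| processed << item }    # stand-in for asset.action!
  offset += batch_size                       # advance past the processed rows
end

processed == items # true: every item handled exactly once
```

This only works cleanly when the action does not remove items from the set; if it does, advancing the offset skips rows, which is exactly the bookkeeping headache described above.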

Upvotes: 1

Views: 232

Answers (1)

Jesse Wolgamott

Reputation: 40277

The find_each method is designed to help in these situations. It'll fetch records in batches instead of loading the whole result set into memory at once:

Asset.where(:archived => false).find_each(:batch_size => 500) do |asset|
  asset.stuff
end

By default, the batch size is 1000.
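For intuition, find_each pages by primary key (roughly WHERE id > last_seen_id ORDER BY id LIMIT n) rather than by OFFSET, so each batch query stays cheap and stable even if rows are removed along the way. A simplified pure-Ruby sketch of that idea (find_each_sketch and the hash records are illustrative, not ActiveRecord internals):

```ruby
# Sketch of keyset-style batching, the strategy find_each uses:
# remember the last primary key seen and fetch the next batch above it.
def find_each_sketch(records, batch_size:)
  last_id = 0
  loop do
    # Stand-in for: WHERE id > last_id ORDER BY id LIMIT batch_size
    batch = records.select { |r| r[:id] > last_id }.first(batch_size)
    break if batch.empty?
    batch.each { |r| yield r }
    last_id = batch.last[:id] # resume after the last record we handled
  end
end

records = (1..7).map { |id| { id: id } }
seen = []
find_each_sketch(records, batch_size: 3) { |r| seen << r[:id] }
# seen is now [1, 2, 3, 4, 5, 6, 7], delivered in batches of 3, 3, and 1
```

Because progress is keyed on the id rather than a row offset, deleting or archiving records inside the block never causes rows to be skipped, which is the failure mode of the manual limit/offset loop in the question.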

Upvotes: 2
