Reputation: 712
I added two styles (smallcard and mediumcard) to my Paperclip attachment model, Screenshot:
class Screenshot < ActiveRecord::Base
  has_attached_file :image,
    :styles => { :tiny => "x75", :small => "x245", :medium => "x480", :large => "1280x900>",
                 :smallcard => "280x245#", :mediumcard => "570x480#" },
    :storage => :s3,
    :s3_credentials => "#{Rails.root}/config/amazon_s3.yml",
    :path => "/screenshots/:id_partition/:style/:filename"
end
I hand-created a public/system/paperclip_attachments.yml file so that the pre-existing styles would not be reprocessed:
---
:Screenshot:
  :image:
  - :tiny
  - :small
  - :medium
  - :large
But when I run rake paperclip:refresh:missing_styles CLASS=Screenshot, I still get the following:
Regenerating Screenshot -> image -> [:mediumcard, :smallcard]
rake aborted!
Cannot allocate memory - identify -format %wx%h '/tmp/79a229e96ab52dfa760132958da47bf320120806-31260-1eleoww[0]'
Tasks: TOP => paperclip:refresh:missing_styles
[clip]
When I tail the logs, processing only gets as far as IDs in the 500s.
The server is admittedly a Linode 512 running Ubuntu and it's been rock solid at serving pages for 3 Rails apps and 1 PHP app for years. I've never run out of memory on it before.
When I monitor the rake task's process, its memory usage grows with each processed image until it eats up all available RAM.
Maybe it's time for my Linode to grow, but first I'm hoping for some other options.
How can I get around this memory issue and add these two styles to the pre-existing 13k images?
Thanks for your help!
Upvotes: 0
Reputation: 712
Hopefully this can help someone else having the same issue.
As Chris suggested, I wrapped one rake task inside another: the outer task calls the inner one using %x{}, so each batch runs in its own process. When that process exits, all of the memory it used is released before the next batch starts.
namespace :screenshots do
  desc "Incrementally rebuild thumbnails. START=0 & BATCH_SIZE=10 & VERBOSE=false"
  task :reprocess_stepper => :environment do
    batch_size = (ENV['BATCH_SIZE'] || ENV['batch_size'] || 10).to_i
    # Only the literal string "true" turns verbose output on. Any non-nil string
    # (including "false" or "") is truthy in Ruby, so compare explicitly.
    verbose = (ENV['VERBOSE'] || ENV['verbose']).to_s.downcase == 'true'
    total = Screenshot.count
    start = 0
    while start < total
      cmd = "bundle exec rake screenshots:reprocess_some START=#{start} BATCH_SIZE=#{batch_size} VERBOSE=#{verbose} RAILS_ENV=#{Rails.env}"
      puts "Spawning: #{cmd}"
      # Each batch runs in its own process, so its memory is released on exit.
      puts %x{#{cmd}}
      start += batch_size
    end
  end

  desc "Reprocess a batch of screenshots. START=0 & BATCH_SIZE=10 & VERBOSE=false"
  task :reprocess_some => :environment do
    start = (ENV['START'] || ENV['start'] || 0).to_i
    batch_size = (ENV['BATCH_SIZE'] || ENV['batch_size'] || 10).to_i
    verbose = (ENV['VERBOSE'] || ENV['verbose']).to_s.downcase == 'true'
    puts "start = #{start} & batch_size = #{batch_size}" if verbose
    puts "RAILS_ENV=#{Rails.env}" if verbose
    screenshots = Screenshot.order("id ASC").offset(start).limit(batch_size).all
    screenshots.each do |ss|
      puts "Re-processing paperclip image on screenshot ID: #{ss.id}" if verbose
      STDOUT.flush
      ss.image.reprocess!
    end
  end
end
You can then call this task as follows:
RAILS_ENV=production bundle exec rake screenshots:reprocess_stepper VERBOSE=true BATCH_SIZE=50
Upvotes: 0
Reputation: 342
You need to give your system a chance to free the memory properly. A bold trick we used when confronted with a similar problem, using an ORM in a PHP batch task, is this: wrap your task in another task that calls the first task for only one item at a time. In general, you should refactor the first task to take the base image as an argument. The second task should gather all images in a memory-friendly way (e.g. as object IDs) and then loop through them, calling the first task with each one as its argument. When the first task completes, its process exits and its memory goes back to the OS. The second (wrapper) task, on the other hand, never needs much memory at once. This way, peak memory usage should be what is needed to process one image, not all images.
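The wrapper idea above can be sketched in plain Ruby, independent of Rails or Paperclip. This is only an illustration under stated assumptions: the helper names (id_batches, run_batch_in_child, process_all) are hypothetical, and the child process just prints each ID as a stand-in for the real per-image work.

```ruby
require 'rbconfig'

# Hypothetical helper: split a list of record IDs into fixed-size batches.
# Holding only IDs keeps the wrapper's own memory footprint small.
def id_batches(ids, batch_size)
  ids.each_slice(batch_size).to_a
end

# Hypothetical helper: handle one batch in a fresh Ruby process.
# The child here only prints each ID; real code would load and reprocess
# the record instead. When the child exits, the OS reclaims all its memory.
def run_batch_in_child(ids)
  child_code = 'ARGV.each { |id| puts "processed #{id}" }'
  output = IO.popen([RbConfig.ruby, '-e', child_code, *ids.map(&:to_s)], &:read)
  [$?.success?, output]
end

# Wrapper loop: one child process per batch, stopping on the first failure.
def process_all(ids, batch_size)
  id_batches(ids, batch_size).map do |batch|
    ok, output = run_batch_in_child(batch)
    raise "batch #{batch.inspect} failed" unless ok
    output
  end
end
```

With a batch size of 1 this matches the one-item-at-a-time description. Larger batches trade a higher per-process memory peak for fewer process start-ups.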
Upvotes: 2