Sid

Reputation: 603

Stop Ruby on Rails from Deleting Rack Multipart Files Created By an Upload

When a file is uploaded in Rails, Rack creates a multipart tempfile for it in the /tmp folder:

RackMultipart20101109-31106-ylgoz0-0

After the request is completed, I use delayed_job to first process then upload this tmp file to Amazon S3.
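The setup is roughly along these lines (simplified; the job class name and the S3 calls below are just illustrative, using the aws-sdk-s3 gem):

# In the controller, after the upload comes in:
Delayed::Job.enqueue(S3UploadJob.new(params[:id], params[:upload].path))

# In lib/s3_upload_job.rb
require 'aws-sdk-s3'

class S3UploadJob < Struct.new(:file_id, :tmp_path)
  def perform
    # Process the file, then push it to S3.
    # Bucket and key names are placeholders.
    s3 = Aws::S3::Resource.new
    s3.bucket('my-uploads').object("uploads/#{file_id}").upload_file(tmp_path)
  end
end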

The problem is that rails (or rack) sporadically deletes these files whenever a fresh upload occurs.

My server is handling parallel uploads of files ranging from 1-1000 MB and quite often a file gets deleted before it is uploaded to S3.

Is there any way to stop rails (or rack) from deleting these files? Other solutions also welcome.

Upvotes: 3

Views: 2469

Answers (1)

Clinton

Reputation: 3648

Just ran into the same problem, and the answer to this SO question gives a few clues. Most importantly:

As far as I've been able to tell or find, there is no physical file until an upload is read.

Initially I had code along the lines of:

# In my controller:
Delayed::Job.enqueue(FileJob.new(params[:id], params[:upload].path))

# And In lib/file_job.rb
class FileJob < Struct.new(:file_id, :log_file) 
  def perform
    File.open(log_file)
    # Do important stuff with the incoming file.
  end
end

So, if we shift our file processing off into a delayed_job and another request comes in before the delayed_job has had a chance to execute and read the file... poof, our file is obliterated before it was ever accessed, and so no physical file exists by the time the job runs.
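I haven't traced exactly where Rails or Rack triggers the cleanup, but Rack stores multipart uploads in a Tempfile, and a Tempfile's on-disk file only lives as long as the Tempfile object itself. A rough irb sketch of that behaviour:

require 'tempfile'

tmp = Tempfile.new('upload')
path = tmp.path
File.exist?(path)   # => true while the Tempfile object is alive

tmp.close!          # close and unlink -- the GC finalizer does the same thing eventually
File.exist?(path)   # => false; holding on to just the path does not keep the file around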

My fix for this problem is along the following lines:

# In my controller:
FileUtils.copy_entry(params[:upload].path, params[:upload].path + "B")
Delayed::Job.enqueue(FileJob.new(params[:id], params[:upload].path + "B"))

# And In lib/file_job.rb
class FileJob < Struct.new(:file_id, :log_file) 
  def perform
    File.open(log_file)
    # Do important stuff with the incoming file.
    FileUtils.remove(log_file)
  end
end

I copy the file immediately in the controller, while the request is still being handled, so the copy exists before any other incoming request can trigger the cleanup. I then pass the new path into my delayed_job, which is responsible for removing the copied file once it has finished working with it.

This fix appears to be working well for me, though I suspect it won't scale nicely to extremely large files, since the copy doubles the disk usage and adds time to the request. I would love to better understand what is going on with rails and files not existing until they are read.
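One variation I haven't tried: for very large files it might be cheaper to move the tempfile instead of copying it, since a move within the same filesystem is just a rename. Something along these lines (untested, and you would want to confirm that Rack's own cleanup copes with the original path being gone):

# In my controller (untested sketch -- move instead of copy):
new_path = params[:upload].path + "B"
FileUtils.mv(params[:upload].path, new_path)
Delayed::Job.enqueue(FileJob.new(params[:id], new_path))
# FileJob stays the same and still removes the file once it is done.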

Upvotes: 2
