lifecoder

Reputation: 1434

Rails file uploading and Heroku

I have a problem with file uploads on a Heroku server. (A hint about the right way of doing this sort of thing in Rails would also be greatly appreciated - I'm very new to RoR.)

All this code is about uploading a CSV file, letting the user tweak a couple of settings, and then parsing the file. It usually works on localhost (a few times I've had trouble with the value stored in the session), but on Heroku it always dies on upload.

In one of the related questions it was written that Heroku keeps a file only for the duration of a single instance run, but I still couldn't find anything about this in Heroku's docs. Should I store the file's data in the db right after upload, so that it is always available? The downside is that the files can be pretty big, about 10-20 MB, which doesn't look nice.
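
Something like this is what I have in mind (assuming a hypothetical ImportFile model with a text data column - just a sketch of the idea, not code from the app):

def upload
  uploaded = params[:upload][:my_file]
  # keep the raw CSV in the db so any dyno can read it on the next request
  import = ImportFile.create!(data: uploaded.read)
  # only the record id goes into the session
  session[:import_file_id] = import.id
  redirect_to action: 'import_adjust'
end

import_adjust would then CSV.parse the stored data instead of CSV.open-ing a tempfile path.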

Heroku logs say:

2012-05-21T19:27:20+00:00 app[web.1]: Started POST "/products/upload" for 46.119.175.140 at 2012-05-21 19:27:20 +0000
2012-05-21T19:27:20+00:00 app[web.1]:   Processing by ProductsController#upload as HTML
2012-05-21T19:27:20+00:00 app[web.1]:   Parameters: {"utf8"=>"✓", "authenticity_token"=>"aqJFg3aqENfxS2lKCE4o4txxkZTJgPx36SZ7r3nyZBw=", "upload"=>{"my_file"=>#<ActionDispatch::Http::UploadedFile:0x000000053af020 @original_filename="marina-AutoPalmaPriceList_2011-07-30.txt", @content_type="text/plain", @headers="Content-Disposition: form-data; name=\"upload[my_file]\"; filename=\"marina-AutoPalmaPriceList_2011-07-30.txt\"\r\nContent-Type: text/plain\r\n", @tempfile=#<File:/tmp/RackMultipart20120521-1-10g8xmx>>}, "commit"=>"Upload"}
2012-05-21T19:27:20+00:00 app[web.1]:
2012-05-21T19:27:20+00:00 app[web.1]: LoadError (no such file to load -- CSV):
2012-05-21T19:27:20+00:00 app[web.1]:   app/controllers/products_controller.rb:82:in `upload'
2012-05-21T19:27:20+00:00 app[web.1]:
2012-05-21T19:27:20+00:00 app[web.1]:
2012-05-21T19:27:20+00:00 app[web.1]: cache: [POST /products/upload] invalidate, pass

The code itself:

ProductsController:

def import
  respond_to do |format|
    format.html
  end
end

def import_adjust
  case params[:commit]
    when "Adjust"
      @col_default = params[:col_data]
      #abort @col_default.to_yaml
      #update csv reader with form data, restore filters  from params
    when "Complete"
      #all ok, read the whole file
      #abort params.to_yaml
      redirect_to import_complete
    else
      @col_default = nil
  end
  #read first part of the file
  @tmp = session[:import_file]
  @csv = []
  source = CSV.open @tmp, {col_sep: ";"}

  5.times do
    line = source.readline
    if line.size>0
      @line_size = line.size
      @csv.push line
    end
  end

  #generate a selection array
  #selection = select_tag 'col_data[]', options_for_select([['name','name'], ['brand','brand'], ['delivery_time','delivery_time'], ['price','price']])
  #@csv = [selection * line_size] + @csv
end

def import_complete
  #remove all items
  #todo check products with line items will not be destroyed.
  Product.destroy_all
  #abort params.to_yaml
  map = {}
  cnt = 0
  #todo check for params count.
  params[:col_data].each do |val|
    map[cnt] = val if val != 'ignore'
    cnt += 1
  end

  source = CSV.open session[:import_file], {col_sep: ';'}
  source.each do |row|
    cnt += 1
    if row.size > 0
      item = Product.new
      map.each do |col, attr|
        item[attr] = row[col]
      end
      item[:provider_id] = params[:adjust][:provider]
      item.save
      #abort item.to_yaml
    end
  end

  #abort map.to_yaml
  #todo response needed.
end

def upload
  require 'CSV' # looks like I don't need this, in fact
  @tmp = params[:upload][:my_file].path #tempfile

  @csv = []
  #source = CSV.open @tmp, {col_sep: ";"}

  session[:import_file] = params[:upload][:my_file].path

  respond_to do |format|
    format.html { redirect_to action: 'import_adjust' }
  end
end

upload.html.erb:

<h1>Uploaded</h1>
<%= @tmp %>

<% @csv.each do |val| %>
    <%= val %>
<% end %>

_form_import.html.erb:

<%= form_for :upload, :html => {:multipart => true}, :url => {action: "upload"} do |f| %>
<%= f.file_field :my_file %>
<%= f.submit "Upload" %>
<% end %>

import_adjust.html.erb:

<h1>New product</h1>


<%= form_for :adjust, :url => {action: "import_adjust"} do |f| %>
<% if @csv %>
<table>
  <tr>
    <% @line_size.times do |cnt| %>
        <td>
        <%= select_tag 'col_data[]',
               options_for_select([
                  ['--ignore--', 'ignore'],
                  ['name','name'],
                  ['brand','brand'],
                  ['delivery_time','delivery_time'],
                  ['price','price']
        ], @col_default!=nil ? @col_default[cnt] : nil) %>
        </td>
    <% end %>
  </tr>

  <% @csv.each do |val| %>
     <tr>
     <% val.each do |cell| %>
       <td>
         <%= cell %>
       </td>
     <% end %>
     </tr>
  <% end %>
</table>
<% end %>


  <%= f.label :delimiter, 'Delimiter' %>
  <%= f.text_field :delimiter %>
  <br>
  <%= f.label :provider, 'Provider' %>
  <%# TODO: a default empty option is needed here to guard against human mistakes %>
  <%= f.select :provider, Provider.all.collect { |item| [item.name, item.id] } %>
  <br>
  <%= f.label :delimiter, 'Delimiter' %>
  <%= f.text_field :delimiter %>
  <br>
  <%# Adjust for proceed adjusting or Complete  for parsing %>
  <%= f.submit "Adjust" %>
  <%= f.submit "Complete" %>
<% end %>


<%= link_to 'Back', products_path %>

Upvotes: 0

Views: 1916

Answers (2)

Dawn Green

Reputation: 493

I have an identical scenario to Lifecoder's, where a user uploads a file, names the columns using the map_fields plugin (by Andrew Timberlake), and then the file is parsed and processed. Here's how I handle it:

  file_field = params[options[:file_field]]
  map_fields_file_name = "map_fields_#{Time.now.to_i}_#{$$}"

  bucket = S3.buckets[CSV_COUPON_BUCKET_NAME]    # gets an existing bucket
  obj = bucket.objects[map_fields_file_name]
  obj.write( file_field.read )

  # Save the name and bucket to retrieve on second pass
  session[:map_fields][:bucket_name] = map_fields_file_name

Then, on the second pass, I read the object back out of the bucket into a temp file for the dyno to process:

    # Get CSV data out of bucket and stick it back into temp, so we pick up where
    # we left off as far as map_fields is concerned.
    bucket = S3.buckets[CSV_COUPON_BUCKET_NAME]
    obj = bucket.objects[session[:map_fields][:bucket_name]]
    temp_path = File.join(Dir::tmpdir, "map_fields_#{Time.now.to_i}_#{$$}")
    File.open(temp_path, 'wb') do |f|
      f.write obj.read
    end
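
From there, presumably the same kind of CSV call the question uses can point at the temp file, e.g.:

    source = CSV.open temp_path, {col_sep: ';'}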

I had to use the plugin rather than the gem so I could modify the code, since the gem is managed by Heroku's bundle and doesn't allow for modifications.

Upvotes: 0

Glenn Gillen

Reputation: 471

Could you paste the entire controller code? The problem is on line #82, but I can't be 100% confident what line that is if you've stripped the class def and before_filters out.

That said, it looks like the problem is with one of the CSV.open lines. The way you're trying to set session[:import_file] is not guaranteed to work: if you ever run the app on more than one dyno, the first request could be served by your web.1 dyno and the second by web.2, which have separate file systems and so can't see each other's temp files.

I'd suggest one of the following:

  • Do all the processing immediately on the upload and avoid the redirect (see the sketch after this list).
  • An improvement on that would be to have the upload store the data somewhere shared and accessible (the database or S3) and start a background job/process to do the processing.
  • Best of all would be to upload directly to S3 (I believe the S3 Uploader library can do this; there are probably others) and issue a callback that creates a background job to process it.
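
A rough sketch of the first option, assuming the columns arrive in a fixed order and the provider is chosen on the upload form (both assumptions, since your current flow lets the user map the columns interactively):

def upload
  require 'csv'  # lowercase: Heroku's filesystem is case-sensitive, so require 'CSV' is what raises the LoadError in your log
  provider_id = params[:upload][:provider]  # hypothetical extra field on the upload form

  CSV.foreach(params[:upload][:my_file].path, col_sep: ';') do |row|
    next if row.empty?
    # assumed fixed column order: name;brand;delivery_time;price
    Product.create(name: row[0], brand: row[1],
                   delivery_time: row[2], price: row[3],
                   provider_id: provider_id)
  end

  redirect_to products_path, notice: 'Import complete'
end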

That last option means your web dynos are never tied up handling massive uploads, and you don't burden the user with the latency of upload to server->store in S3->schedule background job; from their perspective it is reduced simply to the store in S3.
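
For the background-job half of that, a minimal sketch, assuming delayed_job and the aws-sdk v1 style used in the other answer (the bucket constant, the callback route and the ProcessImportJob class are all hypothetical):

# callback the direct-to-S3 uploader hits once the file is in the bucket
def upload_callback
  Delayed::Job.enqueue ProcessImportJob.new(params[:key])
  head :ok
end

# worker: reads the CSV straight out of S3, so it runs fine on any dyno
ProcessImportJob = Struct.new(:key) do
  def perform
    require 'csv'
    obj = S3.buckets[IMPORT_BUCKET_NAME].objects[key]
    CSV.parse(obj.read, col_sep: ';').each do |row|
      next if row.empty?
      # build Product records from row here, as in the question's import_complete
    end
  end
end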

Upvotes: 2
