Nin

Reputation: 21

How to import a large CSV file into MySQL in a Rails application?

I'm implementing CSV import into MySQL in a Rails application. I use CSV.parse to read the CSV file line by line and import each row into the database. This works well.


But when I deploy to Heroku, each request is limited to 30 seconds. If the CSV import takes longer than 30 seconds, Heroku returns an error: request timeout - H12. Can anyone help me find the best way to import a large CSV file? Right now I can only import a small CSV containing 70 users. I want to import large CSV files containing 500 - 1000 users. Here is the code:

Import controller:

CSV.foreach(params[:file].path, :headers => true) do |row|
  i = i + 1

  if i == 1
    @company = Company.find_or_create_by!(name: row[0])       
  end

  @users = User.find_by(email: row[1])

  if @users
    if @company.id == @users.employee.company_id
      render :status=> 401, :json => {:message=> "Error"}
      return
    else
      render :status=> 401, :json => {:message=> "Error"}
      return
    end
  else
    # User
    # # Generate password
    password = row[2]
    user = User.new(email: row[1])
    user.password = password.downcase
    user.normal_password = password.downcase
    user.skip_confirmation!
    user.save!

    obj = {
      'small'   => 'https://' + ENV['AWS_S3_BUCKET'] + '.s3.amazonaws.com/images/' + 'default-profile-pic_30x30.png',
      'medium'  => 'https://' + ENV['AWS_S3_BUCKET'] + '.s3.amazonaws.com/images/' + 'default-profile-pic_40x40.png'
    }

    employee = Employee.new(user_id: user.id)
    employee.update_attributes(name: row[3], job_title: row[5], gender: row[9], job_location: row[10], group_name: row[11], is_admin: to_bool(row[13]), 
                is_manager: to_bool(row[14]), is_reviewee: to_bool(row[6]), admin_target: row[7], admin_view_target: row[12], department: row[8], 
                company_id: @company.id, avatar: obj.to_json)
    employee.save!

  end
end

I have tried the gems 'activerecord-import' and 'fastercsv', but 'activerecord-import' did not work, and 'fastercsv' does not work with Ruby 2.0 and Rails 4.0.

Upvotes: 1

Views: 2475

Answers (3)

Ismail Faruqi

Reputation: 482

It seems that these lines

if i == 1
  @company = Company.find_or_create_by!(name: row[0])       
end

@users = User.find_by(email: row[1])

take up a lot of computation cycles within your 30-second window.

I would suggest converting your routine into a Heroku background process using resque or delayed_job, or splitting the routine into n requests, if the code above cannot be optimized somehow.
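If the per-row lookup does turn out to be costly, one possible optimization is to check all emails in a single pass before writing anything. This is only a sketch: `duplicate_emails` is a hypothetical helper, and `existing_lookup` is an assumed stand-in for one database query (for example something along the lines of `User.where(email: emails).pluck(:email)`), passed in as a lambda so the example runs without Rails.

```ruby
require 'csv'
require 'set'

# Hypothetical helper: returns the emails in the CSV that already exist.
# `existing_lookup` is an assumption: any callable that takes an array of
# emails and returns the ones already stored (ideally with ONE query).
def duplicate_emails(csv_path, existing_lookup)
  # Column 1 holds the email, matching row[1] in the question's code.
  emails = CSV.foreach(csv_path, headers: true).map { |row| row[1] }.compact
  existing = existing_lookup.call(emails).to_set
  emails.select { |email| existing.include?(email) }
end
```

If the returned list is non-empty, the controller can render the error right away, before any user is created.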

Hope this helps.

Upvotes: 0

Litmus

Reputation: 10986

Process your CSV in the background, using tools such as delayed_job, sidekiq, or resque. If it fits your use case, you can even do this using guard or cron.
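To illustrate the idea only, here is a rough in-process sketch that uses Ruby's built-in `Queue` and a thread instead of any of those gems. `IMPORT_QUEUE`, `IMPORT_RESULTS`, `IMPORT_WORKER`, and `enqueue_import` are made-up names, and a real deployment would use one of the job libraries above rather than a thread inside the web process.

```ruby
require 'csv'

# In-process stand-in for a job queue such as delayed_job, sidekiq, or resque.
# All constant and method names here are hypothetical.
IMPORT_QUEUE = Queue.new
IMPORT_RESULTS = Queue.new

# Worker thread: pulls file paths off the queue and does the slow work.
# Pushing nil onto the queue ends the loop.
IMPORT_WORKER = Thread.new do
  while (path = IMPORT_QUEUE.pop)
    count = 0
    # Real code would create the User/Employee records here.
    CSV.foreach(path, headers: true) { |_row| count += 1 }
    IMPORT_RESULTS << { path: path, rows: count }
  end
end

# What the controller action would call: enqueue and return immediately,
# so the request finishes well inside the 30-second limit.
def enqueue_import(path)
  IMPORT_QUEUE << path
  :accepted
end
```

The point is the shape: the request only enqueues the work, and the import itself runs outside the request cycle.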

Upvotes: 0

AdamT

Reputation: 6485

Doing this in a controller seems a bit much to me, especially since it's blocking. Have you given any thought to throwing it in a background job?

If I were you I'd:

  1. Upload the file
  2. Parse it in the background as a rake task

Also, have a look at: https://github.com/tilo/smarter_csv
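For the parsing step, here is a minimal sketch of reading the file in chunks with only the standard library, so each chunk can be handled (or handed to a job) on its own. It does not use smarter_csv's API; `each_csv_chunk` is a hypothetical helper and the chunk size is an arbitrary choice.

```ruby
require 'csv'

# Hypothetical helper: yields the CSV rows in arrays of at most chunk_size,
# with each row converted to a Hash keyed by header name.
def each_csv_chunk(csv_path, chunk_size)
  CSV.foreach(csv_path, headers: true).each_slice(chunk_size) do |rows|
    yield rows.map(&:to_h)
  end
end
```

Usage would look something like `each_csv_chunk(path, 100) { |chunk| ... }`, where the block body does the inserts for that chunk.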

Upvotes: 2
