Reputation: 21
I'm implementing a CSV import into MySQL in a Rails application. I use Ruby's CSV library to read the file line by line and import each row into the database, and this works well.
However, when I deploy to Heroku, each request times out after 30 seconds. If the import takes longer than that, Heroku fails with a request timeout (H12) error. Can anyone help me find the best way to import a large CSV file? Right now I only import a small CSV with about 70 users, but I want to import larger files with 500 - 1000 users. Here is the code:
Import controller:
i = 0
CSV.foreach(params[:file].path, :headers => true) do |row|
  i = i + 1
  # The first row decides which company the users belong to
  if i == 1
    @company = Company.find_or_create_by!(name: row[0])
  end

  @users = User.find_by(email: row[1])
  if @users
    # A user with this email already exists, so abort the import
    if @company.id == @users.employee.company_id
      render :status => 401, :json => { :message => "Error" }
      return
    else
      render :status => 401, :json => { :message => "Error" }
      return
    end
  else
    # Create the user with the password taken from the CSV
    password = row[2]
    user = User.new(email: row[1])
    user.password = password.downcase
    user.normal_password = password.downcase
    user.skip_confirmation!
    user.save!

    # Default avatar URLs stored as JSON on the employee record
    obj = {
      'small'  => 'https://' + ENV['AWS_S3_BUCKET'] + '.s3.amazonaws.com/images/' + 'default-profile-pic_30x30.png',
      'medium' => 'https://' + ENV['AWS_S3_BUCKET'] + '.s3.amazonaws.com/images/' + 'default-profile-pic_40x40.png'
    }

    employee = Employee.new(user_id: user.id)
    employee.update_attributes(name: row[3], job_title: row[5], gender: row[9], job_location: row[10], group_name: row[11], is_admin: to_bool(row[13]),
                               is_manager: to_bool(row[14]), is_reviewee: to_bool(row[6]), admin_target: row[7], admin_view_target: row[12], department: row[8],
                               company_id: @company.id, avatar: obj.to_json)
    employee.save!
  end
end
I have tried the 'activerecord-import' and 'fastercsv' gems, but 'activerecord-import' did not work for me, and 'fastercsv' does not work with Ruby 2.0 and Rails 4.0.
Upvotes: 1
Views: 2475
Reputation: 482
It seems that these lines
if i == 1
  @company = Company.find_or_create_by!(name: row[0])
end
@users = User.find_by(email: row[1])
eat up a lot of your 30-second window, since User.find_by issues a database query for every row of the file.
I would suggest converting this routine into a Heroku background process using resque or delayed_job, or splitting it into n smaller requests, if the code above cannot be optimized much further.
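For example, with delayed_job the controller only enqueues a job and returns immediately, and a worker dyno does the actual import. A minimal sketch, assuming the uploaded file has been copied somewhere the worker can read it (on Heroku that usually means S3, since the request's tmp file is not visible to a worker dyno); the CsvImportJob name and file_path argument are placeholders:

require 'csv'

# Plain-Ruby job object following delayed_job's Struct-based pattern;
# delayed_job calls #perform on it in the worker process.
class CsvImportJob < Struct.new(:file_path)
  def perform
    CSV.foreach(file_path, :headers => true) do |row|
      # move the per-row Company/User/Employee creation here
    end
  end
end

# In the controller: enqueue the job and answer within milliseconds
Delayed::Job.enqueue CsvImportJob.new(stored_file_path)
render :status => 202, :json => { :message => "Import started" }

The request then finishes well inside Heroku's 30-second limit, no matter how many rows the CSV has.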
Hope this helps.
Upvotes: 0
Reputation: 10986
Process your CSV in the background, using tools such as delayed_job, sidekiq, or resque. If it fits your use case, you can even do this with guard or cron.
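With sidekiq, for example, the pattern looks roughly like this (a sketch only; CsvImportWorker and stored_file_path are placeholder names, and the worker needs Redis plus a file location it can actually reach, such as S3):

require 'csv'
require 'sidekiq'

class CsvImportWorker
  include Sidekiq::Worker

  def perform(file_path)
    CSV.foreach(file_path, :headers => true) do |row|
      # per-row import logic goes here
    end
  end
end

# Controller: enqueue the job and respond immediately
CsvImportWorker.perform_async(stored_file_path)

Note that sidekiq serializes job arguments as JSON, so pass simple values such as a file path or an S3 key, not ActiveRecord objects.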
Upvotes: 0
Reputation: 6485
Doing this in a controller seems a bit much to me, especially since it's blocking. Have you given any thought to throwing it in a background job?
If I were you I'd move the import into a background job and process the file in batches.
Also, have a look at: https://github.com/tilo/smarter_csv
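smarter_csv reads the file in fixed-size chunks, which keeps memory flat and makes it easy to hand each chunk to a background job or a bulk insert. A small sketch, assuming the CSV has headers such as email and name (the column names here are only illustrative):

require 'smarter_csv'

# Each chunk is an array of up to 100 hashes keyed by the symbolized CSV headers
SmarterCSV.process(file_path, :chunk_size => 100) do |chunk|
  chunk.each do |row|
    # e.g. row[:email], row[:name] - build the User/Employee records here
  end
end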
Upvotes: 2