Reputation: 7099
I have a CSV file that contains data such as the user's id, unit, and size.
I want to update member_id for 500,000 products:
500_000.times do |i|
  row = tmp[i]  # USER_ID, UNIT, SIZE are column indexes into the parsed CSV row
  # find_by returns nil instead of raising when the user doesn't exist
  user = User.find_by(id: row[USER_ID])
  hash = {
    unit: row[UNIT],
    size: row[SIZE]
  }
  hash.merge!(user_id: user.id) if user.present?
  Product.create(hash)
end
How do I optimize this procedure so that I don't have to look up each User object individually, but can instead work with an array of related hashes?
Upvotes: 0
Views: 85
Reputation: 211540
There are two things here that are massively holding back performance. First, you're doing N User.find
calls, which is totally out of control. Second, you're creating individual records instead of doing a mass insert, so each create runs inside its own tiny transaction.
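For the first problem, one option is to load the valid user ids in a single query and check membership in memory. This is just a sketch: it assumes tmp is the parsed CSV and that USER_ID, UNIT and SIZE are the column indexes from your question.

require "set"

# One query instead of 500,000: fetch only the ids the CSV actually references.
csv_user_ids   = tmp.map { |row| row[USER_ID].to_i }.uniq
valid_user_ids = User.where(id: csv_user_ids).pluck(:id).to_set

rows = tmp.map do |row|
  uid = row[USER_ID].to_i  # CSV values come in as strings
  {
    unit:    row[UNIT],
    size:    row[SIZE],
    user_id: valid_user_ids.include?(uid) ? uid : nil  # nil when the user doesn't exist
  }
end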
Generally these sorts of bulk operations are better done purely in the SQL domain. You can insert a very large number of rows at the same time, often only limited by the size of the query you can submit, and that parameter is usually adjustable.
While a gigantic query may lock or block your database for a period of time, it will be the fastest way to do your updates. If you need to keep your system running during mass inserts, you'll need to break it up into a series of smaller commits.
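For the second problem, if you're on Rails 6 or later, insert_all builds one multi-row INSERT per call and avoids the per-record overhead; slicing the rows keeps each statement to a manageable size. The batch size of 5,000 here is just a starting point to tune against your database:

rows.each_slice(5_000) do |batch|
  # One INSERT statement per batch; skips validations, callbacks and
  # (before Rails 7) timestamps, so make sure that's acceptable for this table.
  Product.insert_all(batch)
end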
Remember that Product.connection is a lower-level access layer that lets you manipulate the data directly with SQL queries.
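For example, here's a rough sketch of a hand-built multi-row INSERT sent through the connection, assuming a products table with unit, size and user_id columns as implied by your model:

conn = Product.connection

rows.each_slice(5_000) do |batch|
  values = batch.map do |attrs|
    # quote escapes string values and turns nil into NULL
    "(#{conn.quote(attrs[:unit])}, #{conn.quote(attrs[:size])}, #{conn.quote(attrs[:user_id])})"
  end.join(", ")

  conn.execute("INSERT INTO products (unit, size, user_id) VALUES #{values}")
end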
Upvotes: 4