Reputation: 4038
How can I use update_all
, if I want to update a column of 300,000 records all with a variety of different values?
What I want to do is something like:
Model.update_all(:column => [2,33,94,32]).where(:id => [22974,22975,22976,22977])
But unfortunately this doesn't work, and it's even worse for 300,000 entries.
Upvotes: 22
Views: 22771
Reputation: 5918
This is my 2020 answer:
The most upvoted answer is wrong; as the author himself states, it will trigger n
SQL queries, one for each row.
The second most upvoted answer suggests gem "activerecord-import", which is the way to go. However, it does so by instantiating ActiveRecord models, and if you are in business for a gem like this, you're probably looking for extreme performance (it was our case anyways).
So this is what we did. First, you build an array of hashes, each hash containing the id
of the record you want to update and any other fields.
For instance:
records = [{ id: 1, name: 'Bob' }, { id: 2, name: 'Wilson' },...]
Then you invoke the gem like this:
YourModelName.import(records, on_duplicate_key_update: [:name, :other_columns_whose_keys_are_present_in_the_hash], validate: false, timestamps: false)
Explanation:
on_duplicate_key_update
means that, if the database finds a collision on primary key (and it will on every row, since we're talking about updating existing records), it will NOT fail, and instead update the columns you pass on that array.
If you don't validate false
(default is true), it will try to instantiate a new model instance for each row, and probably fail due to validation (since your hashes only contain partial information).
timestamp false
is also optional, but good to know it's there.
Upvotes: 7
Reputation: 402
The only way I found to do it is to generate INSERT INTO request with updated values. I'm using gem "activerecord-import" for that.
For example, I have a table with val values
+--------+--------------+---------+------------+-----+-------------------------+-------------------------+
| pkey | id | site_id | feature_id | val | created_at | updated_at |
+--------+--------------+---------+------------+-----+-------------------------+-------------------------+
| 1 | | 125 | 7 | 88 | 2016-01-27 10:25:45 UTC | 2016-02-05 11:18:14 UTC |
| 111765 | 0001-0000024 | 125 | 7 | 86 | 2016-01-27 11:33:22 UTC | 2016-02-05 11:18:14 UTC |
| 111766 | 0001-0000062 | 125 | 7 | 15 | 2016-01-27 11:33:22 UTC | 2016-02-05 11:18:14 UTC |
| 111767 | 0001-0000079 | 125 | 7 | 19 | 2016-01-27 11:33:22 UTC | 2016-02-05 11:18:14 UTC |
| 111768 | 0001-0000086 | 125 | 7 | 33 | 2016-01-27 11:33:22 UTC | 2016-02-05 11:18:14 UTC |
+--------+--------------+---------+------------+-----+-------------------------+-------------------------+
products = CustomProduct.limit(5)
products.each_with_index{|p, i| p.val = i}
CustomProduct.import products.to_a, :on_duplicate_key_update => [:val]
All you records will be updated in single request. Please find out gem "activerecord-import" documentation for more details.
+--------+--------------+---------+------------+-----+-------------------------+-------------------------+
| pkey | id | site_id | feature_id | val | created_at | updated_at |
+--------+--------------+---------+------------+-----+-------------------------+-------------------------+
| 1 | | 125 | 7 | 0 | 2016-01-27 10:25:45 UTC | 2016-02-05 11:19:49 UTC |
| 111765 | 0001-0000024 | 125 | 7 | 1 | 2016-01-27 11:33:22 UTC | 2016-02-05 11:19:49 UTC |
| 111766 | 0001-0000062 | 125 | 7 | 2 | 2016-01-27 11:33:22 UTC | 2016-02-05 11:19:49 UTC |
| 111767 | 0001-0000079 | 125 | 7 | 3 | 2016-01-27 11:33:22 UTC | 2016-02-05 11:19:49 UTC |
| 111768 | 0001-0000086 | 125 | 7 | 4 | 2016-01-27 11:33:22 UTC | 2016-02-05 11:19:49 UTC |
+--------+--------------+---------+------------+-----+-------------------------+-------------------------+
Upvotes: 12
Reputation: 1234
From the ActiveRecord#update documentation:
people = { 1 => { "first_name" => "David" }, 2 => { "first_name" => "Jeremy" } }
Person.update(people.keys, people.values)
So in your case:
updates = {22974 => {column: 2}, 22975 => {column: 33}, 22976 => {column: 94}, 22977 => {column: 32}}
Model.update(updates.keys, updates.values)
Edit: Just had a look at the source, and this is generating n SQL queries too... So probably not the best solution
Upvotes: 17
Reputation: 3796
the short answer to your question is, you can't.
The point of update_all is to assign the same value to the column for all records (matching the condition if provided). The reason that is useful is that it does it in a single SQL statement.
I agree with Shime's answer for correctness. Although that will generate n SQL calls. So, maybe there is something more to your problem you're not telling us. Perhaps you can iterate over each possible value, calling update_all for the objects that should get updated with that value. Then it's a matter of either building the appropriate hash, or, even better, if the condition is based on something in the Model itself, you can pass the condition to update_all.
Upvotes: 6