Reputation: 7775
Let's assume that I have a table with two columns A and B. There is an index on column A but not on column B. I want to issue several million queries like:
UPDATE t1 SET b=b1 WHERE a=a1;
UPDATE t1 SET b=b2 WHERE a=a2;
....
There are anywhere from 1 to 100,000 rows corresponding to each unique value of a
. On average it's around 100.
For each update statement on average 60% of the rows will not be changed because for these rows b
already has the desired value. For 30% of the updates none of the matching rows will be changed.
Does it make sense to use statements like these?
UPDATE t1 SET b=b1 WHERE a=a1 AND b<>b1;
Will it speedup the process by eliminating unneeded writebacks to disk or is Mysql 5 smart enough to recognize that nothing is being changed and there is no need to write back to disk?
Upvotes: 5
Views: 760
Reputation: 9926
In either case, MySQL will have to read the row content (whether it is on disk or in a cache/bufferpool). In either case, MySQL will use your index on a
as a starting point. In either case, MySQL will not update the row if it already has the destination value for b
. Consequently, I don't see any way in which MySQL would benefit from having the b<>b1
clause.
Arguably, depending on the workload and dataset, the query (with b<>b1
in it) might benefit if you change your index on a
to be a compound index on a
and b
(in that order). In this case, it wouldn't have to hit disk (or check the cache/buffer pool) to find which rows specifically require updating (i.e. you'd capitalize on the 30% and 60% you mentioned). Having said that, now your index will require an update for each update on b
, so there is a cost, although I suspect the trade-off might be worth it.
Upvotes: 4
Reputation: 47321
I been thinking to combine multiple updates into single update using CASE
update t1
set b=
case a
when a=a1 then b1
when a=a2 then b2
when a=a3 then b3 ...
end;
hope this useful, and if it terrible slow, please include explaination
Upvotes: 0
Reputation: 12721
You should add the extra filter. Mysql is smart enough to not update a value if it is the same, but it's better to eliminate this check. You can confirm this by seeing how many rows were "affected" by the query.
Upvotes: 0