Reputation: 8972
I have to do a somewhat complicated data import. I need to run a number of UPDATEs, each of which currently updates over 3 million rows in one query. Each of these queries takes about 30-45 seconds (some of them even 4-5 minutes). My question is whether I can speed this up. Where can I read about it: what kinds of indexes, and on which columns, can I set to improve those updates? I don't need an exact answer, so I'm not showing the tables. I'm just looking for material to learn from.
Upvotes: 1
Views: 9580
Reputation: 10246
Two things:
1) Post an EXPLAIN ANALYZE of your UPDATE query.
2) If your UPDATE does not need to be atomic, then you may want to consider breaking apart the number of rows affected by your UPDATE. To minimize the number of "lost rows" due to exceeding the Free Space Map, consider the following approach:
BEGIN;
UPDATE ... LIMIT N;  -- or some predicate that limits the number of rows (e.g. WHERE username ILIKE 'a%')
COMMIT;
VACUUM table_being_updated;
ANALYZE table_being_updated;
I suspect you're updating every row in your table and don't need all rows to be visible with the new value at the end of a single transaction, so breaking the UPDATE up into smaller transactions as above will be a good approach.
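The batching loop described above can be sketched as follows. This is only a minimal illustration using Python's bundled sqlite3 (not PostgreSQL), and the `users` table, `status` column, and batch size are all invented; in PostgreSQL you would batch on a key range or predicate (a bare `UPDATE ... LIMIT` is not supported there) and run VACUUM/ANALYZE between batches.

```python
import sqlite3

# Sketch: commit every BATCH rows instead of one huge transaction.
# Table and column names are illustrative, not from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO users (id, status) VALUES (?, ?)",
                 [(i, "old") for i in range(1, 1001)])
conn.commit()

BATCH = 250
while True:
    # Update at most BATCH not-yet-updated rows, then commit,
    # so each batch is its own short transaction.
    cur = conn.execute(
        "UPDATE users SET status = 'new' WHERE id IN "
        "(SELECT id FROM users WHERE status = 'old' LIMIT ?)",
        (BATCH,))
    conn.commit()
    if cur.rowcount == 0:  # nothing left to update
        break

remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE status = 'old'").fetchone()[0]
print(remaining)  # 0 once all batches have run
```

The point of the small transactions is that each one releases its locks and (in PostgreSQL) lets VACUUM reclaim the dead row versions between batches.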
And yes, an INDEX on the relevant columns specified in the UPDATE's predicate will help dramatically. Again, post an EXPLAIN ANALYZE if you need further assistance.
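As a rough illustration of how an index on the predicate column changes the plan, here is a sketch using sqlite3's EXPLAIN QUERY PLAN as a stand-in for PostgreSQL's EXPLAIN ANALYZE; the table, column, and index names are invented:

```python
import sqlite3

# Sketch (sqlite3 stand-in, not PostgreSQL): inspect whether the
# UPDATE's predicate can use an index. All names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")

# Without an index on username, the planner must scan the whole table.
before = conn.execute(
    "EXPLAIN QUERY PLAN UPDATE users SET username = 'x' "
    "WHERE username = 'a'").fetchall()

conn.execute("CREATE INDEX users_username_idx ON users (username)")

# With the index, the plan becomes an index search on username.
after = conn.execute(
    "EXPLAIN QUERY PLAN UPDATE users SET username = 'x' "
    "WHERE username = 'a'").fetchall()

print(before[-1][-1])  # a SCAN of the table
print(after[-1][-1])   # a SEARCH using the index
```

In PostgreSQL the equivalent check is `EXPLAIN ANALYZE UPDATE ...`, where you would look for a Seq Scan turning into an Index Scan on the predicate column.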
Upvotes: 8
Reputation: 125494
If by "a number of UPDATEs" you mean one UPDATE command per updated row, then the problem is that all the target table's indexes are updated and all constraints are checked for each updated row. If that is the case, try instead to update all rows with a single UPDATE:
update t
set a = t2.b
from t2
where t.id = t2.id;
If the imported data is in a text file, then insert it into a temp table first and update from there. See my answer here
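The "temp table first" idea can be sketched like this, again with Python's sqlite3 standing in for PostgreSQL and invented table names; in PostgreSQL you would bulk-load the file with COPY and then join with UPDATE ... FROM as shown above:

```python
import sqlite3

# Sketch: load imported data into a staging table, then apply it to
# the target with one set-based UPDATE. Names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, "x"), (2, "y"), (3, "z")])

# Pretend these rows were bulk-loaded from the import file.
conn.execute("CREATE TEMP TABLE t2 (id INTEGER PRIMARY KEY, b TEXT)")
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(1, "X"), (3, "Z")])

# One statement updates every matching row; rows with no match in
# the staging table (id = 2 here) are left untouched.
conn.execute(
    "UPDATE t SET a = (SELECT b FROM t2 WHERE t2.id = t.id) "
    "WHERE id IN (SELECT id FROM t2)")
conn.commit()

rows = conn.execute("SELECT id, a FROM t ORDER BY id").fetchall()
print(rows)  # [(1, 'X'), (2, 'y'), (3, 'Z')]
```

The single statement still maintains indexes and checks constraints per row, but the per-statement planning and round-trip overhead is paid once instead of millions of times.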
Upvotes: 0