sennin
sennin

Reputation: 8972

Postgres optimize UPDATE

I have to do a bit complicated data import. I need to do a number of UPDATEs which currently updating over 3 million rows in one query. This query is applying about 30-45 sec each (some of them even 4-5 minutes). My question is, whether I can speed it up. Where can I read something about it, what kind of indexes and on which columns I can set to improve those updates. I don't need exacly answer, so I don't show the tables. I am looking for some stuff to learn about it.

Upvotes: 1

Views: 9580

Answers (2)

Sean
Sean

Reputation: 10246

Two things:

1) Post an EXPLAIN ANALYZE of your UPDATE query.

2) If your UPDATE does not need to be atomic, then you may want to consider breaking apart the number of rows affected by your UPDATE. To minimize the number of "lost rows" due to exceeding the Free Space Map, consider the following approach:

  1. BEGIN
  2. UPDATE ... LIMIT N; or some predicate that would limit the number of rows (e.g. WHERE username ilike 'a%';).
  3. COMMIT
  4. VACUUM table_being_updated
  5. Repeat steps 1-4 until all rows are updated.
  6. ANALYZE table_being_updated

I suspect you're updating every row in your table and don't need all rows to be visible with the new value at the end of a single transaction, therefore the above approach of breaking the UPDATE up in to smaller transactions will be a good approach.

And yes, an INDEX on the relevant columns specified in the UPDATE's predicate will help will dramatically help. Again, post an EXPLAIN ANALYZE if you need further assistance.

Upvotes: 8

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125494

If by a number of UPDATEs you mean one UPDATE command to each updated row then the problem is that all the target table's indexes will be updated and all constraints will be checked at each updated row. If that is the case then try instead to update all rows with a single UPDATE:

update t
set a = t2.b 
from t2
where t.id = t2.id

If the imported data is in a text file then insert it in a temp table first and update from there. See my answer here

Upvotes: 0

Related Questions