Faizan Saeed
Faizan Saeed

Reputation: 61

Postgres update from large table

I have a Postgres 12 table with more than 20 million rows (the table is partitioned on the customer_column_id) called A and I also have a table with 2.2 million rows called B. I want to update B by selecting data from A but it sometimes is taking 3 minutes and sometimes throwing an exception while reading from stream or the operation times out. The query which I am using is below.

UPDATE B 
  set "udd_erhvervsfaglig_forloeb" = A.value 
from A 
where A.customer_column_id = 60 
  and B.customer_id = A.customer_id 
  and coalesce(A.value, '') != ''

Query Plan

  ->  Nested Loop  (cost=0.43..15739311932.47 rows=6273088796225 width=85506)
        ->  Seq Scan on B(cost=0.00..3509632.93 rows=112625893 width=84980)
        ->  Index Scan using A_66_customer_id_idx1 on A_66 cev  (cost=0.43..0.46 rows=1 width=17)
              Index Cond: (customer_id = B.customer_id)
              Filter: (((COALESCE(value, ''::character varying))::text <> ''::text) AND (customer_column_id = 66))```

Upvotes: 1

Views: 225

Answers (1)

Yauheni Khvainitski
Yauheni Khvainitski

Reputation: 131

One of the ways of avoiding nested loops could be using EXISTS. So, your query can be written like this:

UPDATE B 
   SET "udd_erhvervsfaglig_forloeb" = A.value 
  FROM A 
 WHERE  
  EXISTS (SELECT 1
            FROM a
           WHERE B.customer_id = A.customer_id
             AND A.customer_column_id = 60
             AND coalesce(A.value, '') != '');

Upvotes: 0

Related Questions