hackg
hackg

Reputation: 95

Update From Table - not using index (pg 9.1)

UPDATE
 npi_bn
SET
  count = npi_bn_count.count
FROM 
  npi_bn_count
WHERE
  npi_bn.bn = npi_bn_count.bn

PROBLEM: This update query is not using my indexes (incredibly slow on 20 mil rows). Its fairly straightforward text join and both fields in both tables are indexed. How can I get the planner to use my indexes?

EXPLAIN
Update on npi_bn  (cost=99.59..1045037.73 rows=20826256 width=75)
  ->  Hash Join  (cost=99.59..1045037.73 rows=20826256 width=75)
        Hash Cond: (npi_bn.bn = npi_bn_count.bn)
        ->  Seq Scan on npi_bn  (cost=0.00..706474.20 rows=20832220 width=65)
        ->  Hash  (cost=55.93..55.93 rows=3493 width=22)
              ->  Seq Scan on npi_bn_count  (cost=0.00..55.93 rows=3493 width=22)

HAVE TRIED: I have rebuilt the indexes several times, with/without text_pattern_ops and ran analyze

INFO: Postgres 9.1 (64-bit) I have a table with text values appearing multiple times, which I counted the values (how many times they appear) and stored in another table. I want to update the main table with the count values, so every time you see the value "abc" in the count column, it lists the number of times it appears in the table

INDEXES:

CREATE INDEX idx_npi_bn_name
  ON npi_bn
  USING btree (bn text_pattern_ops);

CREATE INDEX idx_npi_bn_count_name
  ON npi_bn_count
  USING btree (bn text_pattern_ops);

EXAMPLE DATA:

╔════╦══════════════╦══════╗
║ ID ║  bn (text)   ║ count║
╠════╬══════════════╬══════╣
║  1 ║ abc          ║ 2    ║
║  2 ║ efg          ║ 1    ║
║  3 ║ abc          ║ 2    ║
║  4 ║ xyz          ║ 1    ║
╚════╩══════════════╩══════╝

MAIN TABLE = npi_bn

COUNTS STORED IN TABLE = npi_bn_count

Upvotes: 2

Views: 84

Answers (1)

Craig Ringer
Craig Ringer

Reputation: 325081

There is no evidence that an index would help this query out.

You are reading and processing all rows. A hash join or merge join will probably be faster.

If you want to compare, try (for testing purposes) setting:

enable_hashjoin = off
enable_mergejoin = off

It'll probably then use your index(es) if they're suitable ... and be even slower.

If it's faster then your random_page_cost probably doesn't reflect the machine's real performance and should be much lower.

Upvotes: 3

Related Questions