Luffydude
Luffydude

Reputation: 812

Concatenated index in postgresql

So basically I'm matching addresses by matching strings within 2 tables

Table B has 5m rows so I really don't want to create new columns for it every time I want to match the addresses

So I thought about creating indexes instead, my current index to match addresses would look like:

CREATE INDEX matchingcol_idx  ON tableB USING btree (sub_building_name || ', ' || building_name )

However this does not work, it doesn't accept the concatenation bar

My update query would then equal = b.sub_building_name || ', ' || b.building_name

Without a new column and an index this would take multiple hours

Is there a way to achieve this without creating new concatenation columns?

Upvotes: 9

Views: 10136

Answers (2)

Andrei Sura
Andrei Sura

Reputation: 2604

For matching via LIKE or ILIKE you can create a gin index as described here https://niallburkley.com/blog/index-columns-for-like-in-postgres/

CREATE INDEX CONCURRENTLY idx_match_building 
   ON tableB USING gin ( (sub_building_name || ', ' || building_name) gin_trgm_ops);

Verify that the extension is enabled via this query

SELECT * FROM pg_extension where extname = 'pg_trgm';
+-------+---------+----------+--------------+----------------+------------+-----------+--------------+
| oid   | extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition |
|-------+---------+----------+--------------+----------------+------------+-----------+--------------|
| 17458 | pg_trgm | 10       | 17455        | True           | 1.3        | <null>    | <null>       |
+-------+---------+----------+--------------+----------------+------------+-----------+--------------+

Relevant info about why concat will not cut it :) https://www.postgresql.org/message-id/1409893231867-5817884.post%40n5.nabble.com

Upvotes: 1

user330315
user330315

Reputation:

For an expression based index, you need to put the expression between parentheses:

CREATE INDEX matchingcol_idx  
   ON tableB USING btree ( (sub_building_name || ', ' || building_name) );

But that index will only be used if you use exactly the same condition in your where clause. Any condition only referencing one of the columns will not use that index.

Upvotes: 15

Related Questions