Reputation: 4498
I'm using Postgres DB and I have a table called MyObjects
with several varchar columns. The values will not be separate words (for example, URLs, names, etc.). I will be filtering on these columns a lot, searching for rows where a certain string appears at the beginning, middle, or end of the value.
The table itself will not be large (around n*10^3 rows at first, growing to around n*10^5).
Is there even any need for an index in this case? If so - what kind of index do you suggest?
Thanks!
Upvotes: 4
Views: 11244
Reputation: 2624
The PostgreSQL documentation notes that a B-tree index can only support pattern matches that are anchored to the beginning of the string.
So, B-tree is out, because you want to match the middle and end as well. Now, looking at this page:
PostgreSQL: More performance for LIKE and ILIKE statements
It seems GIN is the way to go. You should first enable the pg_trgm extension, as mentioned by @Laurenz Albe:
CREATE EXTENSION pg_trgm;
The index creation statement should look something like this:
CREATE INDEX idx_gin ON MyObjects USING gin (column_name gin_trgm_ops);
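Once the index exists, unanchored pattern searches can use it. A minimal sketch, assuming the table and column names from the statement above and a hypothetical search term:

```sql
-- This pattern is unanchored (leading %), so a B-tree could not help,
-- but the trigram GIN index can:
SELECT * FROM MyObjects WHERE column_name LIKE '%example%';
```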
Hope this helps. Please read the cybertec post thoroughly for detailed info.
To answer your other question (is there even a need for an index): my observation is that if you grow to 10^5 rows, yes, you might very well need one. It's not that 10^5 is too large; it's that you might run many searches against that dataset. I once had a 180,000-row table (pretty small, right?), but one process searched for 20,000 records in it. With a simple B-tree index, the query execution time went from 271 seconds down to 343 ms. That was on SQLite, but the principle carries over. So, yeah, I'd go with an index.
Upvotes: 4
Reputation: 246383
If you are looking for an index that can help with search patterns like WHERE col LIKE '%string%', the only option is a GIN or GiST index using pg_trgm.
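As a sketch of both options (hypothetical table and column names), pg_trgm provides an operator class for each index type:

```sql
-- The extension must be installed first:
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- GIN variant: typically larger and slower to update, but faster to search:
CREATE INDEX myobjects_col_trgm_gin ON MyObjects USING gin (col gin_trgm_ops);

-- GiST variant: typically smaller and cheaper to update, but slower to search:
CREATE INDEX myobjects_col_trgm_gist ON MyObjects USING gist (col gist_trgm_ops);
```

For a mostly-read workload with frequent filtering, the GIN variant is usually the better trade-off.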
These indexes can become quite large, and they don't help a lot if you search for short substrings.
To figure out if such an index would bring benefits over a sequential table scan in your case, you'd have to try it out – it cannot be answered in general.
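One way to try it out, as a sketch with hypothetical names: compare the plans and timings with EXPLAIN ANALYZE. Trigram indexes are used via bitmap scans, so disabling those in the session forces the sequential-scan plan for comparison:

```sql
-- Plan with the trigram index available:
EXPLAIN ANALYZE SELECT * FROM MyObjects WHERE col LIKE '%string%';

-- Temporarily force a sequential scan in this session and compare:
SET enable_bitmapscan = off;
EXPLAIN ANALYZE SELECT * FROM MyObjects WHERE col LIKE '%string%';
RESET enable_bitmapscan;
```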
Upvotes: 7