Reputation: 21
I'm developing a search for entities by name in Postgres database. Name usually consists of 1-3 words and may contain symbols &, !, (, ), -, etc.
I'm using gin trigram index and queries: WHERE name ILIKE '%something%', ILIKE 'a%'
and WHERE name % 'abc'
for fuzzy search (if nothing was found by the exact match via ILIKE).
The problem is that we need to support search by any characters, not only letters and numbers. Trigram index ignores such characters.
I've tried text_pattern_ops
index for this case but without any success: queries such as WHERE name ILIKE '%$%'
are extremely slow
So, is there any way to efficiently process such queries? Do I need a full text search for this purpose?
UPD:
Table is like:
id (int) | Name (text) |
---|---|
123 | Dolce&Gabbana |
Queries are like:
SELECT name FROM brand WHERE name ILIKE '%&%' ORDER BY name;
UPD2:
Query plan for
EXPLAIN(analyze, verbose, buffers, settings)
SELECT name FROM brand
WHERE name ILIKE '%$%'
ORDER BY name
Index was created as:
CREATE INDEX brand_trgm_idx ON brand USING gin (name gin_trgm_ops);
Table was created as:
CREATE TABLE brand (
id serial PRIMARY KEY,
name TEXT,
collection_id TEXT,
created_at TIMESTAMP DEFAULT now() NOT NULL,
created_by TEXT NOT NULL
);
Also tried:
CREATE INDEX brand_name_idx ON brand (name text_pattern_ops);
UPD3:
Checked query analyze for the same db, but with ~1M entries:
Upvotes: 1
Views: 96
Reputation: 44157
trgm.h does #define KEEPONLYALNUM
. If you removed that and recompiled, it would then keep punctuation characters other than spaces. However, '%$%' doesn't have any usable trigrams in it, just like '%a%' doesn't, as it is too short. So the one concrete example you showed us would still not use the index.
It is also rather hazardous to do this, as then upgrading your system could cause your changed binaries to silently be lost. It would be better to fork and rename, but that is a lot of work.
Upvotes: 3