Reputation: 697
I would like to use postgresql's unaccent contrib package in my tsvector column. I have successfully installed the package using create extension unaccent, and have checked that it works. I had set up a trigger to automatically update the tsvector column,
CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
ON artists FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(tsv_name, 'pg_catalog.simple', name);
and then set the tsv_name column on all existing rows using the following
UPDATE artists SET tsv_name = (to_tsvector('simple', coalesce(name, '')));
I thought that I'd be able to incorporate the unaccent package by simply replacing name by unaccent(name) in the above code, but that's throwing a syntax error in the create trigger code. How can I fix this to ensure that the tsvector column uses unaccented characters?
Upvotes: 1
Views: 1429
Reputation: 86
I know this is a really old ticket, but along with @ErwinBrandstetter 's incredible answer (https://stackoverflow.com/a/11007216/5261969) and some PostgreSQL doc reading (https://www.postgresql.org/docs/9.3/functions-textsearch.html), I finally figured this out.
This is what my migration looks like (using Rails 5.2):
# Erwin's wrapper code
connection.execute(<<-EOSQL)
CREATE OR REPLACE FUNCTION public.immutable_unaccent(regdictionary, text)
RETURNS text LANGUAGE c IMMUTABLE STRICT AS
'$libdir/unaccent', 'unaccent_dict';
EOSQL
connection.execute(<<-EOSQL)
CREATE OR REPLACE FUNCTION public.f_unaccent(text)
RETURNS text LANGUAGE sql IMMUTABLE STRICT AS
$func$
SELECT public.immutable_unaccent(regdictionary 'unaccent', $1)
$func$;
EOSQL
# new trigger using f_unaccent
connection.execute(<<-EOSQL)
CREATE TRIGGER artists_name_tsvector_update BEFORE INSERT OR UPDATE
OF name ON artists FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(
tsv_name, 'public.f_unaccent(name)', name
);
UPDATE artists SET tsv_name = (to_tsvector('public.f_unaccent(name)', name));
EOSQL
This update allows users to search with special characters or not to yield the same results (e.g "Ønders" and "Onders" search query both return the correct artist.)
Note: I am also using I18n.transliterate
to localize the incoming query param.
Updated - upon running tests, I found that my UPDATE code was incorrect, so I had to alter my code to get it working. I'll keep the above in case it helps someone
Updated migration using @EvanCarroll 's answer https://stackoverflow.com/a/50595181/5261969 - I couldn't figure out how to get the update trigger to stop complaining using Erwin's code. :/
# create extension unaccent
connection.execute(<<-EOSQL)
CREATE TEXT SEARCH CONFIGURATION f_unaccent ( COPY = simple );
ALTER TEXT SEARCH CONFIGURATION f_unaccent
ALTER MAPPING FOR hword, hword_part, word
WITH unaccent, simple;
EOSQL
# create updated trigger using unaccent for tsv_name
# update existing tsv_names
connection.execute(<<-EOSQL)
CREATE TRIGGER artists_name_tsvector_update BEFORE INSERT OR UPDATE
OF name ON artists FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(
tsv_name, 'public.f_unaccent', name
);
UPDATE artists SET tsv_name = (to_tsvector('public.f_unaccent', name));
EOSQL
Upvotes: 5