David Harbage
David Harbage

Reputation: 697

Using unaccent in tsvector trigger

I would like to use postgresql's unaccent contrib package in my tsvector column. I have successfully installed the package using create extension unaccent, and have checked that it works. I had set up a trigger to automatically update the tsvector column,

CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
ON artists FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(tsv_name, 'pg_catalog.simple', name);

and then set the tsv_name column on all existing rows using the following

UPDATE artists SET tsv_name = (to_tsvector('simple', coalesce(name, '')));

I thought that I'd be able to incorporate the unaccent package by simply replacing name by unaccent(name) in the above code, but that's throwing a syntax error in the create trigger code. How can I fix this to ensure that the tsvector column uses unaccented characters?

Upvotes: 1

Views: 1429

Answers (1)

lechin
lechin

Reputation: 86

I know this is a really old ticket, but along with @ErwinBrandstetter 's incredible answer (https://stackoverflow.com/a/11007216/5261969) and some PostgreSQL doc reading (https://www.postgresql.org/docs/9.3/functions-textsearch.html), I finally figured this out.

This is what my migration looks like (using Rails 5.2):


    # Erwin's wrapper code
    connection.execute(<<-EOSQL)
      CREATE OR REPLACE FUNCTION public.immutable_unaccent(regdictionary, text)
        RETURNS text LANGUAGE c IMMUTABLE STRICT AS
        '$libdir/unaccent', 'unaccent_dict';
    EOSQL

    connection.execute(<<-EOSQL)
      CREATE OR REPLACE FUNCTION public.f_unaccent(text)
        RETURNS text LANGUAGE sql IMMUTABLE STRICT AS
      $func$
      SELECT public.immutable_unaccent(regdictionary 'unaccent', $1)
      $func$;
    EOSQL

    # new trigger using f_unaccent
    connection.execute(<<-EOSQL)
      CREATE TRIGGER artists_name_tsvector_update BEFORE INSERT OR UPDATE
      OF name ON artists FOR EACH ROW EXECUTE PROCEDURE
      tsvector_update_trigger(
        tsv_name, 'public.f_unaccent(name)', name
      );

      UPDATE artists SET tsv_name = (to_tsvector('public.f_unaccent(name)', name));
    EOSQL

This update allows users to search with special characters or not to yield the same results (e.g "Ønders" and "Onders" search query both return the correct artist.)

Note: I am also using I18n.transliterate to localize the incoming query param.

Updated - upon running tests, I found that my UPDATE code was incorrect, so I had to alter my code to get it working. I'll keep the above in case it helps someone

Updated migration using @EvanCarroll 's answer https://stackoverflow.com/a/50595181/5261969 - I couldn't figure out how to get the update trigger to stop complaining using Erwin's code. :/

# create extension unaccent
    connection.execute(<<-EOSQL)
      CREATE TEXT SEARCH CONFIGURATION f_unaccent ( COPY = simple );
      ALTER TEXT SEARCH CONFIGURATION f_unaccent
        ALTER MAPPING FOR hword, hword_part, word
        WITH unaccent, simple;
    EOSQL

    # create updated trigger using unaccent for tsv_name
    # update existing tsv_names
    connection.execute(<<-EOSQL)
      CREATE TRIGGER artists_name_tsvector_update BEFORE INSERT OR UPDATE
      OF name ON artists FOR EACH ROW EXECUTE PROCEDURE
      tsvector_update_trigger(
        tsv_name, 'public.f_unaccent', name
      );

      UPDATE artists SET tsv_name = (to_tsvector('public.f_unaccent', name));
    EOSQL

Upvotes: 5

Related Questions