Loading custom unaccent rules into PostgreSQL Amazon RDS

I use PostgreSQL with unaccent extension in order to handle German umlauts.
By default, the mapping works in the following way: Müller -> Muller, but this is not natural. Ü should be mapped into UE, Ö should be mapped into OE, but by default it is mapped to O.

In order to handle this, I edited the unaccent.rules file in /usr/local/share/postgresql/tsearch_data, as suggested by some german colleagues that stumbled upon this problem as well.

So, essentially, upon creating a PostgreSQL extension, in parallel, on the container initialization, I mount a custom rule file and replace it with the original one, and, voilà, I have a working solution! All I have to do is just to wrap the column name like this:

SELECT * FROM students
WHERE unaccent(name) ILIKE unaccent('Muel%')

This works on my local machine and all the environments where the DB lives inside a Docker container, but I'm not sure how I can make this work on Amazon RDS which is our production environment (since I don't have the access to the file system).

Upvotes: 0

Views: 153

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 247800

You won't be able to create a custom “unaccent” dictionary in a hosted database, since that requires file system access.

I recommend that you use a custom function like

CREATE FUNCTION normalized(text) RETURNS text
   IMMUTABLE PARALLEL SAFE
   RETURN unaccent(
             replace(
                replace(
                   replace(
                      replace(
                         lower($1), 'ö', 'oe'
                      ), 'ü', 'ue'
                   ), 'ä', 'ae'
                ), 'ß', 'ss'
             )
          );

Then you could create a text search index with

CREATE INDEX ON tab USING gin (to_tsvector('german', normalized(doc)));

You would apply the same normalization to documents and search strings:

SELECT ... FROM tab
WHERE to_tsvector('german', normalized(doc)) @@
      to_tsquery('german', normalized('Schöffen'));

Upvotes: 1

Related Questions