Reputation: 453
I want to use PostgreSQL 9.5.4 on AWS RDS, leveraging the full-text search, dictionaries with stop words, unaccented full-text search.
The context:
With 'unaccent' on, this full-text (Json) query will find 'F(e-acute)vrier' even if I did not enter the accent properly
psql>select * from proto_model.product where to_tsvector((body ->> 'description')) @@ to_tsquery('Fevrier');
With an English dictionary, the same search for 'the', 'any', 'you'... will find nothing as they are 'stop words' defined in the English dictionary and ignored.
The issue:
On my local Postgres, this is not an issue at all. On the managed AWS this is one. EC2+Docker on AWS would not be an issue of course but I focus on RDS Postgres for now.
Locally, the default value of default_text_search_config (get it with psql>show all
) is 'pg_catalog.english' which uses the English dictionary and stop words. On RDS this is 'pg_catalog.simple'.
1) In AWS, I cannot add a dictionary or modify one as you need file system access that you don't have. There is no programatical solution to create/update a dictionary AFAIK.
2) In AWS, as the 'postgres' user or even as the 'rds_superuser' that you can create, I cannot alter the global configuration
psql>ALTER SYSTEM SET default_text_search_config = 'pg_catalog.english';
ERROR: must be superuser to execute ALTER SYSTEM command
In addition, it is absent of the RDS Postgres parameter groups that you can associate to a new Postgres instance and you cannot add missing values!
Granting more to 'rds_superuser' (psql>grant all on schema public to ...
) does not help.
3) In AWS, as 'postgres' or 'rds_superuser', I can however set for my session, the current text configuration
psql>set default_text_search_config = 'pg_catalog.english';
SET
4) Sadly in AWS, as 'postgres' or 'rds_superuser', I cannot alter the search configuration (global) to ignore the accents. This works fine locally.
psql>ALTER TEXT SEARCH CONFIGURATION english ALTER MAPPING FOR hword, hword_part, word WITH unaccent, english_stem;
ERROR: must be owner of text search configuration english
5) In AWS, as 'postgres' or 'rds_superuser', I can create a new search configuration (English + Unaccent) but I cannot set it as the default even in my session!
psql>CREATE TEXT SEARCH CONFIGURATION english2 (copy=english);
CREATE...
psql>ALTER TEXT SEARCH CONFIGURATION english2 ALTER MAPPING FOR hword, hword_part, word WITH unaccent, english_stem;
ALTER...
psql>set default_text_search_config = 'pg_catalog.english2';
ERROR: invalid value for parameter "default_text_search_config": "pg_catalog.english2"
So it seems that I am cooked.
The best that I can see could be to associate automatically without a need for individual psql>set default_text_search_config = ...
my user connection to a set of configuration options
psql>alter role somerole set default_text_search_config = 'pg_catalog.english';
psql>select * from pg_user; (the option is present by default for all my connections under this role)
Outside of moving out of AWS RDS to EC2+Docker, do you know any solution to (4) or (5) providing me dictionary+unaccent?
Upvotes: 21
Views: 5101
Reputation: 121
You can change by altering the role parameters in the database like:
ALTER ROLE [role] IN DATABASE [database]
SET default_text_search_config TO 'pg_catalog.english';
Upvotes: 11
Reputation: 49
Don't rely on the default_text_search_config.
Instead, create your own text search configuration as you do in #5. Then use the two-argument form of to_tsvector
to specify that custom text search configuration instead of the default:
SELECT * from mytable where to_tsvector('myconfig', description)
@@ to_tsquery('cat & dog')
An additional benefit of the two-argument version of to_tsvector
is that it allows you to use an "expression index" to power text search, rather than a separate tsvector column in your table:
CREATE_INDEX mytable_tsv_idx ON mytable USING GIN
(to_tsvector('myconfig', description));
-- This query will use the index
SELECT * from mytable WHERE to_tsvector('myconfig', description)
@@ to_tsquery('cat & dog');
-- This query, despite setting the default config,
-- will not use the expression index.
SET default_text_search_config = 'myconfig';
SELECT * from mytable WHERE to_tsvector(description)
@@ to_tsquery('cat & dog');
https://www.postgresql.org/docs/9.5/static/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX
Upvotes: 4