Reputation: 671
I am trying to find a way to list only the names that have an accent.
I am using the following example taken from this question https://dba.stackexchange.com/questions/94887/what-is-the-impact-of-lc-ctype-on-a-postgresql-database
select firstname from (values ('bernard'), ('bérénice'), ('béatrice'), ('boris'))
AS l(firstname)
order by firstname collate "C";
Current output is
firstname
-------
bernard
boris
béatrice
bérénice
Expected output is
firstname
-------
béatrice
bérénice
Any ida what I should put in the where statement?
Upvotes: 3
Views: 674
Reputation: 16377
It's possible this will yield more than you ask for, but if you want to find all records that contain Unicode characters, you can use a regular expression for "not ASCII":
select firstname
from (values ('bernard'), ('bérénice'), ('béatrice'), ('boris'))
AS l(firstname)
where
firstname ~ '[^[:ascii:]]'
Again, this will include more than just accents, but depending on your use case it might meet the need.
Upvotes: 3
Reputation: 644
You have to create an extension first:
CREATE EXTENSION unaccent;
Then you can check with this query:
SELECT l.firstname
FROM (VALUES ('bernard'), ('bérénice'), ('béatrice'), ('boris')) AS l(firstname)
WHERE l.firstname <> unaccent(l.firstname);
Upvotes: 2