Etienne
Etienne

Reputation: 12590

In Django with Postgresql 9.6 how to sort case and accent insensitive?

What I would like is the equivalent of using utf8_unicode_ci in MySQL. So if I have those strings (default sort order with Postgresql):

I wish they would be sorted like this (as with utf8_unicode_ci in MySQL):

This kind of sort is case insensitive, accent insensitive and ligatures are converted to multiple characters.

I know about unaccent and lower in Postgresql but I have no idea how to use them from Django.

Possible solutions with Django/Postgresql:

I don't think Full Text Search or Trigram could help me here because I'm not necessarily doing searches base on text but I need to get the good sort order.

Ideally queries should be fast so using another indexed column looks like a good avenue. But I wish to find a solution that I don't need to implement for every exisiting text column in my DB, that is easy to maintain, etc. Is there a best practice to do that?

Upvotes: 3

Views: 1093

Answers (2)

Mateus Padua
Mateus Padua

Reputation: 193

I did this way:

But you need enable in your postgresql the module 'unaccent' before this way: CREATE EXTENSION unaccent;

def get_value_ci(field):
    return Func(field, function='LOWER', template='UNACCENT(%(function)s(%(expressions)s))')

YoutModel.objects.order_by(get_value_ci('nome_your_field'))

and works, ;)

Upvotes: 0

Michel Milezzi
Michel Milezzi

Reputation: 11115

It isn't related to Django itself, PostgreSQL's lc_collate configuration determines this. I'd suggest you to review its value:

SHOW lc_collate;

The right thing to do is fix this configuration. Don't forget to take a look on related settings too (lc_ctype, etc.).

But if you cannot create another database with the right setting, try to explicit collate on ORDER like the following test case:

CREATE TEMPORARY TABLE table1 (column1 TEXT); 

INSERT INTO table1 VALUES('Barn'),
('beef'),
('bémol'),
('Bœuf'),
('boulette'),
('Bubble');

SELECT * FROM table1 ORDER BY column1 COLLATE "en_US"; --Gives the expected order
SELECT * FROM table1 ORDER BY column1 COLLATE "C"; --Gives "wrong" order  (in your case)

It's important to remember that PostgreSQL relies on operating system locales. This test case was executed on CentOS 7. More info here and here.

Upvotes: 3

Related Questions