Reputation: 4907
I have a django project where I am sorting a queryset by this CharField
:
name = models.CharField(max_length=50, unique=True)
On my the default django db.sqlite3 server, it sorts as I want (according to ASCII value) where objects with names beginning with a tilde ~ character appear at the bottom of the list:
item
jitem
litem
~kitem
but on my production postgresql server, they appear like this:
item
jitem
~kitem
litem
and it appears that the tilde is ignored when sorting.
How can I fix this?
Upvotes: 1
Views: 695
Reputation: 324265
This is normal behaviour. Your default collation is one that ignores punctuation and other special characters when sorting.
If you want to do a simple sort where every character is considered based on its ordinal value in the current text encoding, use COLLATE "C"
. This may be applied at the database, column, index or query level.
You can't change the collation of an existing database database-wide, but you can change columns and indexes.
If you want to change it for the whole database you'll need to pg_dump
the database, drop the database, and re-create the database with COLLATE "C"
as one of the CREATE DATABASE
options. I do not recommend doing this, though, since it leads to what users will consider to be wrong sort order for real natural language text.
Instead, change it on a per-column basis, e.g.
ALTER TABLE collate_demo ALTER COLUMN test_col TYPE text COLLATE "C";
(Keep the same type as before, just change the collation)
Note that you can also specify the collation you want for individual queries, though Django might not let you do that. E.g.
SELECT * FROM collate_demo ORDER BY test_col COLLATE "C";
or
SELECT * FROM collate_demo WHERE test_col > 'A' COLLATE "C" ORDER BY test_col COLLATE "C";
The COLLATE
clause applies only to the particular comparison operator or ORDER BY
it's specified with. It isn't a modifier for the whole query.
Note that even if two strings are equal according to national language collation rules they'll never be reported as equal by PostgreSQL. It sorts by utf-8 representation byte-order if the natural language collation says they sort the same but they're not byte-wise identical.
There's a Django ticket open for COLLATE
support, which has been closed as almost a duplicate of icontains can be case-sensitive on MySQL. It seems to have languished indefinitely. I've added a note strongly encouraging implementation of COLLATE
as the right way to solve the problem.
Upvotes: 3