Reputation: 4199
How to perform ORDER BY for string columns efficiently, when different collations are required, depending on the culture? That is, data for the users from different cultures are stored in the same table and the same column, but each user naturally wants to see it ordered according to it's locale (locale is of course known and fixed per each row in the table). And table can be very long, so column needs and index and can't be post-processed on the application side to the desired collation (it's database task to do heavy-lifting, right?).
For example, the utf8_general_ci
produces different results from utf8_swedish_ci
.
While I think the problem should be evident for any international project, I can't find any suitable solutions out there. Myself I can imaging only the following solutions, which are not that nice and I doubt nothing better can be done:
Now, if there were only one sortable string column, but there may be several. What is the intended and right way to solve this?
Upvotes: 1
Views: 697
Reputation: 1660
As long as you use the same character set (utf8 in your case) for column storage as well as for reading, you can use COLLATE some-utf8-collation
after the ORDER BY column-name
clause:
SELECT * FROM sometable ORDER BY somecolumn COLLATE utf8_swedish_ci
In my tests this produces different sorting than a german collation:
SELECT * FROM sometable ORDER BY somecolumn COLLATE utf8_german2_ci
Well, as long as the data contains relevant characters, e.g. german umlauts üöä. If not, you won't see a difference.
Multiple columns in the ORDER
clause each get their own COLLATE
term:
SELECT * FROM sometable
ORDER BY
somecolumn COLLATE utf8_german2_ci,
secondcolumn COLLATE utf8_german2_ci
Upvotes: 2