Reputation: 14967
Sqlite3 by default sorts only by ascii letters. I tried to look in google, but the only thing I found were informations about collations. Sqlite3 has only NOCASE
, RTRIM
and BIARY
collations. How to add support for a specific locale?
(I'm using it in Rails application)
Upvotes: 16
Views: 18013
Reputation: 14824
As Doug Currie answered, ordering accented characters correctly can be enabled by loading the SQLite "ICU" extension
The extension needs to be compiled, as explained in kiew's own answer here. However, the library name given in the ICU readme file and in kiew's answer didn't work for me. This other answer suggested using the name libicu.so
instead.
So this is what worked for me on Ubuntu 16.04 and on Debian 9.8 Stretch:
sudo apt install libicu-dev libsqlite3-dev dpkg-dev gcc make
apt-get source sqlite3
cd sqlite3-*/ext/icu # assuming you have only 1 sqlite3 source directory
gcc -fPIC -shared icu.c `icu-config --ldflags` -o libicu.so
sudo cp libicu.so /usr/local/lib/
sudo ldconfig
Update: in later versions, icu-config
is not available any more, so use pkg-config
instead. This worked in Debian 11 "bullseye":
sudo apt install libicu-dev libsqlite3-dev dpkg-dev gcc make pkg-config
apt-get source sqlite3
cd sqlite3-*/ext/icu # assuming you have only 1 sqlite3 source directory
gcc -fPIC -shared icu.c `pkg-config --libs --cflags icu-uc icu-io` -o libicu.so
sudo cp libicu.so /usr/local/lib/
sudo ldconfig
After that in sqlite3, you can
.load libicu
SELECT icu_load_collation('', 'ICU');
The 2 arguments to icu_load_collation
are locale and a custom name. The locale seems to be optional and can be left empty. The custom name is then visible with
PRAGMA collation_list;
And it can be used as in
SELECT col FROM tbl ORDER BY col COLLATE ICU;
Upvotes: 1
Reputation: 14967
I accepted Doug Currie answer, but I want to add some "algorithm" how to do it, because sqlite3 documentation is very strange (at least for me).
Ok, we have working sqlite3 and now:
Compile it:
gcc -shared icu.c `icu-config --ldflags` -o libSqliteIcu.so
It is for Linux. I also needed to install additional ICU development package:
sudo apt-get install libicu-dev
I'm working on 64 bit architecture and I get error with __relocation R_X86_64_32S__
(whatever it means :). GCC suggested adding -fPIC
to compile options and it helped.
Run sqlite3. We can load extension with command:
.load './libSqliteIcu.so'
Assuming that it is in the current directory, we can also specify whole path.
Create new collation:
SELECT icu_load_collation('pl_PL', 'POLISH');
The first parameter is desired locale and the second is it's (it can be whatever).
Now we can sort data with our new locale:
SELECT * FROM some_table ORDER BY name COLLATE POLISH;
And it is case insensitive!
Upvotes: 29
Reputation: 41170
SQLite supports integration with ICU. According to the Readme file,
sqlite/ext/icu/README.txt
the sqlite/ext/icu/
directory contains source code for the SQLite "ICU" extension, an
integration of the "International Components for Unicode" library with SQLite.
1. Features
1.1 SQL Scalars upper() and lower()
1.2 Unicode Aware LIKE Operator
1.3 ICU Collation Sequences
1.4 SQL REGEXP Operator
Upvotes: 13
Reputation: 154513
If you can't afford to compile the ICU extension you can have a UDF do the same. In PHP/PDO:
$pdo->sqliteCreateFunction('locale',
function ($data, $locale = 'root')
{
static $collators = array();
if (isset($collators[$locale]) !== true)
{
$collators[$locale] = new \Collator($locale);
}
return $collators[$locale]->getSortKey($data);
}
);
Example usage:
SELECT * FROM "table" ORDER BY locale("column", 'pt_PT');
I don't expect this approach to be as efficient as the native extension but it is surely more portable.
Upvotes: 3