isparling
isparling

Reputation: 603

Case insensitive unicode collation in MySQL

I've got a database where we store usernames with a capital first letter of each name -- ie, IsaacSparling. I'm trying to do case insensitive autocomplete against my MySQL (v5.1.46) db. Table has a charset of UTF8 and a collation of utf8_unicode_ci. I've done these tests against the utf8_general_ci collation as well.

Plain ASCII text works fine:

mysql> select username from users where username like 'j%';
+----------------+
| username       |
+----------------+
| J********      |
| J***********   |
| J************* |
+----------------+
3 rows in set (0.00 sec)

mysql> select username from users where username like 'J%';
+----------------+
| username       |
+----------------+
| J********      |
| J***********   |
| J************* |
+----------------+
3 rows in set (0.00 sec)

(names redacted, but they're there).

However, when I try to do the same for unicode characters outside the ASCII set, no such luck:

mysql> select username from users where username like 'ø%';
Empty set (0.00 sec)


mysql> select username from users where username like 'Ø%';
+-------------+
| username    |
+-------------+
| Ø*********  |
+-------------+
1 row in set (0.00 sec)

Some investigation has lead me to this: http://bugs.mysql.com/bug.php?id=19567 (tl;dr, this is a known bug with the unicode collations, and fixing it is at 'new feature' priority -- ie, won't be finished in any reasonable timeframe).

Has anybody discovered any effective workarounds that allow for case-insensitive searching for unicode characters in MySQL? Any thoughts appreciated!

Upvotes: 4

Views: 1676

Answers (4)

andriy.ca
andriy.ca

Reputation: 1

I just resolved the same problem using the query

show variables like '%char%';

My character_set_client was set to 'utf8', but character_set_connection and character_set_results were set to 'latin1'. Thus, the functions UPPER, LOWER, LIKE did not work as expected.

I just inserted the line

mysql_query("SET NAMES utf8");

right after connection to get the case-insensitive searching work.

Upvotes: 0

user530229
user530229

Reputation:

Have you tried using CONVERT? Something like

WHERE `lastname` LIKE CONVERT( _utf8 'ø%' USING latin1 )

might work for you.

Upvotes: 0

mariana soffer
mariana soffer

Reputation: 1853

IF what you care about is being able to order the field values by the text without caring if it is in upper or lower case I think the best thing you can do is when addressing the field instead of typing just username, type LOWER(username) username and then you can perfectly use an order by that field calling it by its name

Upvotes: 0

Zeke Kievel
Zeke Kievel

Reputation: 11

Works fine for me with version 5.1.42-community

Maybe your mysql client did not send the unicode characters properly. I tested with sqlYog and it worked just fine with both utf8_unicode_ci and utf8_general_ci collations

Upvotes: 1

Related Questions