Reputation: 19
I've googled and found nothing so far 'cause the most problems with this seems to relate to the character set, which seems to be correct in my case.
Situation:
I want to count People with different Emojis in their name but it seems to be that the query can't differentiate between Emojis.
Query:
SELECT Sum(CASE WHEN u.nick LIKE '%👾%'
THEN 1
ELSE 0
END) AS 'NF'
,Sum(CASE WHEN u.nick LIKE '%🌱%'
THEN 1
ELSE 0
END) AS 'T'
FROM users u
-->
|NF |T | |NF|T |
--------- Should be: -------
|128|128| |76|52|
The following Query will Result in a list of anybody who has a unicode-sign in his name and not only those with the 👾-Emoji. This supports my thesis, that the Query doesn't want to differentiate between any unicode-signs.
SELECT *
FROM users u
WHERE u.nick LIKE '%👾%'
Additional Information:
Server: Localhost via UNIX socket
Server-Typ: MariaDB
Server-Version: 10.1.48-MariaDB-0ubuntu0.18.04.1 - Ubuntu 18.04
Protokoll-Version: 10
Server-Zeichensatz: UTF-8 Unicode (utf8mb4)
Datenbank-Client Version: libmysql - mysqlnd 7.4.22
PHP-Erweiterung: mysqli Dokumentation curl Dokumentation mbstring Dokumentation
PHP-Version: 7.4.22
Table "user" Kollation: utf8mb4_unicode_ci
Row nick-Typ: varchar(255)
Row nick-Kollation: utf8mb4_general_ci
Does anybody has a clue where the problem could be?
Upvotes: 1
Views: 60
Reputation: 2562
It could be that this is caused by the Sushi-Beer problem which treats all emojis as equal. If you want to uniquely identify each character, you can use the utf8mb4_bin
collation:
MariaDB [(none)]> SET NAMES utf8mb4 COLLATE utf8mb4_general_ci;
Query OK, 0 rows affected (0.000 sec)
MariaDB [(none)]> SELECT '🍣' = '🍺';
+-----------+
| '?' = '?' |
+-----------+
| 1 |
+-----------+
1 row in set (0.000 sec)
MariaDB [(none)]> SET NAMES utf8mb4 COLLATE utf8mb4_bin;
Query OK, 0 rows affected (0.000 sec)
MariaDB [(none)]> SELECT '🍣' = '🍺';
+-----------+
| '?' = '?' |
+-----------+
| 0 |
+-----------+
1 row in set (0.000 sec)
Upvotes: 1