Reputation: 2656
My OpenCart
table collation is utf8_bin
, unfortunately I can't search for product names with accent in their name. I searched on Google and just found that the collation must be utf8_general_ci
for accent compatible and case insensitive search.
What If I add collate declaration to the search query?
SELECT *
FROM `address`
COLLATE utf8_general_ci
LIMIT 0 , 30
Does it have any (bad) side effect? I red about problems with indexing, performance? Or it is totally safe?
Upvotes: 13
Views: 34678
Reputation: 171
This might help: UTF-8: General? Bin? Unicode?
Please note that utf8_bin
is also case sensitive. So I would go for altering table collation to utf8_general_ci
and have peace of mind for the future.
Upvotes: 0
Reputation: 16968
In using of COLLATE in SQL statements, I don't find that usage, Anyway for explaining about your main question of effects of using collations I found some tips, but at first:
From dev.mysql.com:
Nonbinary strings (as stored in the
CHAR
,VARCHAR
, andTEXT
data types) have a character set and collation. A given character set can have several collations, each of which defines a particular sorting and comparison order for the characters in the set.
With multiple operands, there can be ambiguity. For example:
SELECT x FROM T WHERE x = 'Y';
Should the comparison use the collation of the column x
, or of the string literal 'Y'
? Both x
and 'Y'
have collations, so which collation takes precedence?
Standard SQL resolves such questions using what used to be called “coercibility” rules. [3]
ORDER BY
-[also in WHERE
]- cannot use any INDEX
; hence it could be surprisingly inefficient. [4]utf8_general_ci
will almost certainly perform slower in comparisons than utf8_bin
due the extra lookups/computation required).Upvotes: 2
Reputation: 2583
I'm afraid you have to consider the side effects on query performance, especially those using indexes. Here is a simple test:
mysql> create table aaa (a1 varchar(100) collate latin1_general_ci, tot int);
insert into aaa values('test1',3) , ('test2',4), ('test5',5);
mysql> create index aindex on aaa (a1);
Query OK, 0 rows affected (0.59 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> desc aaa;
+-------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| a1 | varchar(100) | YES | MUL | NULL | |
| tot | int(11) | YES | | NULL | |
+-------+--------------+------+-----+---------+-------+
2 rows in set (0.53 sec)
mysql> explain select * from aaa where a1='test1' ;
+----+-------------+-------+------+---------------+--------+---------+-------+--
----+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | r
ows | Extra |
+----+-------------+-------+------+---------------+--------+---------+-------+--
----+-----------------------+
| 1 | SIMPLE | aaa | ref | aindex | aindex | 103 | const |
1 | Using index condition |
+----+-------------+-------+------+---------------+--------+---------+-------+--
----+-----------------------+
1 row in set (0.13 sec)
mysql> explain select * from aaa where a1='test1' collate utf8_general_ci;
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows
| Extra |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-------------+
| 1 | SIMPLE | aaa | ALL | NULL | NULL | NULL | NULL | 3
| Using where |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-------------+
1 row in set (0.06 sec)
You can see that MySQL is stopping using the index on a1 when you search it using another collation, which can be a huge problem for you.
To make sure your indexes are being used for queries, you may have to change your column collation to the most frequently used one.
Upvotes: 5
Reputation: 142503
If practical, change the column definition(s).
ALTER TABLE tbl
MODIFY col VARCHAR(...) COLLATE utf8_general_ci ...;
(You should include anything else that was already in the column definition.) If you have multiple columns to modify, do them all in the same ALTER (for speed).
If, for some reason, you cannot do the ALTER
, then, yes, you can tweak the SELECT
to change the collation:
The SELECT
you mentioned had no WHERE
clause for filtering, so let me change the test case:
Let's say you have this, which will find only 'San Jose':
SELECT *
FROM tbl
WHERE city = 'San Jose'
To include San José
:
SELECT *
FROM tbl
WHERE city COLLATE utf8_general_ci = 'San Jose'
If you might have "combining accents", consider using utf8_unicode_ci. More on Combining Diacriticals and More on your topic.
As for side effects? None except for on potentially big one: The index on the column cannot be used. In my second SELECT
(above), INDEX(city)
is useless. The ALTER
avoids this performance penalty on the SELECT
, but the one-time ALTER
, itself, is costly.
Upvotes: 4