Reputation: 9269
If I want to search all data with the string léon
in the name
field, I use a query like this:
SELECT * FROM `my_table` WHERE `name` REGEXP 'léon'
This request is OK.
Now, I need to retrieve all rows with the string léon
or leon
or lèon
, so I tried:
SELECT * FROM `my_table` WHERE `name` REGEXP 'l[éeè]on'
This query doesn't work... I have no error, but no result. Do you have any ideas?
Upvotes: 3
Views: 183
Reputation: 180897
MySQL's regexes don't work reliably with multibyte character sets.
However, if you're using UTF-8, you can instead make use of the accent insensitive LIKE
and =
searches which looks like it may be what you're really after anyway.
Upvotes: 2
Reputation: 1598
From the docs:
The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.
What you could try is handling the multibyte character like this so it is handling 1 byte or 2. It won't be perfect because it will matchin on "lion" for example, but may get you closer.
REGEXP 'l.{1,2}on'
Another thing that might be affecting you is the way that REGEXP works in MySQL is that there are implied wildcards on the end because you are not specifying the the beginning and end so
REGEXP 'l[éeè]on'
is equivalent to
REGEXP '^.*l[éeè]on.*$'
This means that you might be matching MORE than what you expect. Sounds like your problem is the first scenario I mention
Upvotes: 0