jmasterx
jmasterx

Reputation: 54123

RegEx pattern for unicode usernames?

I am trying to make a function that will validate a username in MySQL:

FUNCTION `player_name_is_valid`(name TEXT CHARACTER SET utf8) RETURNS int(11)
BEGIN
declare result tinyint(1) DEFAULT 1;
declare reg tinyint(1) DEFAULT 1;

IF(CHAR_LENGTH(name) > 16 OR CHAR_LENGTH(name) < 3) THEN
SET result = 0; #name cannot exceed 16 characters, cannot be less than 3
END IF;

SELECT name REGEXP '^[a-z0-9_-]$' INTO reg;

if(reg = -) THEN
SET result = 0;
END IF;

RETURN result;
END

However, the regular expression portion is failing. It returns 0 no matter what. I tested it is 'Hello' and it returned 0, with 'Hell!0' it also returned 0.

What could be wrong?

Upvotes: 0

Views: 186

Answers (2)

mickmackusa
mickmackusa

Reputation: 47904

For the record, MySQL has since gotten its act together...

MySQL implements regular expression support using International Components for Unicode (ICU), which provides full Unicode support and is multibyte safe. (Prior to MySQL 8.0.4, MySQL used Henry Spencer's implementation of regular expressions, which operates in byte-wise fashion and is not multibyte safe.

Source: https://dev.mysql.com/doc/refman/8.0/en/regexp.html

Upvotes: 0

marekful
marekful

Reputation: 15351

MySQL's REGEX only performs pattern matches but cannot capture. It will always return 0 or 1.

Moreover, from the doc:

Warning The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.

Upvotes: 2

Related Questions