user3052443
user3052443

Reputation: 842

Mysql regexp fails

I'm trying to find any records in table customers where the name does not contain letters. The following is what I am using. When ran, it doesn't return any records found. Would someone point out my mistake, please?

    table customers {
    name = Еarnings on thе Intеrnet from
    }

    SELECT name from customers WHERE name NOT REGEXP '[a-zA-Z]' ; 

Upvotes: 1

Views: 87

Answers (1)

Bill Karwin
Bill Karwin

Reputation: 562300

If I compare a string using REGEXP '[a-zA-Z]', it matches if any character in the string matches a letter in that regexp. The string you show does contain letters like a, r, n, i, g, s. Any one of those is enough to satisfy the REGEXP comparison.

mysql> select 'Еarnings on thе Intеrnet from' REGEXP '[a-zA-Z]' as ok;
+----+
| ok |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

The negation NOT REGEXP is the same as NOT (expr REGEXP pattern). It simply reverses the result 1 for 0, or 0 for 1.

mysql> select 'Еarnings on thе Intеrnet from' NOT REGEXP '[a-zA-Z]' as ok;
+----+
| ok |
+----+
|  0 |
+----+

You said you want to match names that do not contain letters. What I think you mean is that you want to match names that contain any characters that are not letters, which is a different test.

mysql> select 'Еarnings on thе Intеrnet from' REGEXP '[^a-zA-Z]' as ok;
+----+
| ok |
+----+
|  1 |
+----+

The characters Ð • µ are not in the range [a-zA-Z] and the way to express the complementary range of characters is to use [^a-zA-Z]. That is, with the ^ character inside the square brackets, a character would match if it is NOT one of the characters in that range.

See also https://dev.mysql.com/doc/refman/8.0/en/regexp.html#regexp-syntax under the item:

  • [a-dX], [^a-dX]

Re your comment:

I tested checking for the characters you mention, including space, apostrophe, dot, and dash:

mysql> select 'Mr. Tim O''Toole' regexp '[^a-zA-Z \'.-]' as ok;
+----+
| ok |
+----+
|  0 |
+----+

mysql> select 'Mr. Tim $ O''Toole' regexp '[^a-zA-Z \'.-]' as ok;
+----+
| ok |
+----+
|  1 |
+----+

You don't need to put a backslash before the -, but it does need to be handled specially:

To include a literal - character, it must be written first or last.

This is in the documentation I linked to.

Upvotes: 2

Related Questions