Reputation: 22760
I have a table of VARCHAR column and I need to check it for a value:
Typical data values are:
Brize Norton (501, 622, 2624, 4624, 4626)
Wyton (7006, 7010, 7630)
Waddington (2503, 7006)
Honington (2623)
Marham (2620, 7010)
Leeming (607 & 609)
The only part I need to check is that it contains the full number only. I can not check just the number because LIKE '%607%'
will also incorrectly match 6070
or 2607
, so I check the number and a variation of wrappers as so:
I have this query:
SELECT id, name FROM aux WHERE aux.name REGEXP CONCAT('[(,\h]',:num, '[),\h]')
this is intended to catch any (( or, or <whitespace>
, a variable number value
, ) or , or <whitespace>
) in a VARCHAR column.
This works on some numbers but not on others;
An example :
:num = 2620
SELECT id, name FROM aux WHERE aux.name REGEXP CONCAT('[(,\h]',:num, '[),\h]')
Result:
"Marham (2620, 7010)"
but fails on other numbers:
:num = 7010
SELECT id, name FROM aux WHERE aux.name REGEXP CONCAT('[(,\h]',:num, '[),\h]')
Result:
(Nothing)
How can I tell the REGEXP to catch the data shaped as above ( or, or <whitespace>
, a variable number value
, ) or , or <whitespace>
.
I have tried EXPLAIN
on my query but that doesn't help me at see the REGEXP mechanism.
I have replaced \h
with \s
but this doesn't make a difference.
Upvotes: 0
Views: 189
Reputation: 627022
Your MariabDB does not support PCRE regex syntax, so only POSIX-compliant regex can be used. Neither \h
nor \s
are POSIX compliant, in POSIX world, \h
is "equivalent" to [:blank:]
and \s
to [:space:]
.
More POSIX/PCRE character class equivalent patterns:
POSIX Character Class | PCRE | Description |
---|---|---|
[:alnum:] |
[:alnum:] / [\p{L}\p{N}] |
Alphanumeric |
[:alpha:] |
\p{L} |
Alphabetic |
[:blank:] |
\h |
Whitespace |
[:cntrl:] |
[:cntrl:] / \p{Cc} / \p{C} |
Control characters |
[:digit:] |
\d |
Digits |
[:graph:] |
[:graph:] |
Graphic characters |
[:lower:] |
\p{Ll} |
Lowercase alphabetic |
[:print:] |
[:print:] |
Graphic or space characters |
[:punct:] |
[\p{P}\p{S}] |
Punctuation |
[:space:] |
\s |
Space, tab, newline, and carriage return |
[:upper:] |
\p{Lu} |
Uppercase alphabetic |
[:xdigit:] |
[:xdigit:] / [A-Fa-f0-9] |
Hexadecimal digit |
You can use
REGEXP CONCAT('[(,[:blank:]]', :num, '[),[:blank:]]')
REGEXP CONCAT('[(,[:space:]]', :num, '[),[:space:]]')
If you simply want to enforce numeric boundaries use
CONCAT('([^0-9]|^)',:num, '([^0-9]|$)')
CONCAT('([^[:digit:]]|^)',:num, '([^[:digit:]]|$)')
The regex details:
[(,[:space:]]
- a (
, ,
or any whitespace char[),[:space:]]
- a )
, ,
or any whitespace char[(,[:blank:]]
- a (
, ,
or a horizontal whitespace char[),[:blank:]]
- a )
, ,
or a horizontal whitespace char([^0-9]|^)
/ ([^[:digit:]]|^)
- any non-digit char or start of string([^0-9]|$)
/ ([^[:digit:]]|$)
- any non-digit char or end of string.Upvotes: 1
Reputation: 106413
Essentially, it's a matter of versions, as until version 10.0.5 MariaDB used the POSIX 1003.2 compliant regular expression library. This library didn't support \h
, \d
etc. character classes, using their POSIX variants - [:alpha:]
, [:digit:]
and so on.
In your case, however, it seems you might just replace \s
or \h
with a single whitespace in that character class:
REGEXP CONCAT('[(, ]', :num, '[), ]')
Upvotes: 3