Reputation: 630
We have a mysql table with a column that has alphanumeric text. We have a unique use case where we need to find every row in the table where this column has text that contains:
As such, these would be OK:
SCAN
SCATTCC
NCAS
NTTAAS
These would NOT be OK:
CATI (does not contain S, and contains I which is outside of S, T, C, A, N)
SCANNF (contains F which is outside of S, T, C, A, N)
NCASO (contains O which is outside of S, T, C, A, N)
..
We tried something like this but it doesn't work:
[tacn]*[s]+(?![^stacn])+
Upvotes: 0
Views: 304
Reputation: 104
Use this regex for pattern matching :
pattern = "^(S+[TACN]+S*[TACN]*)+$|^(S*[TACN]+S+[TACN]*)+$"
Upvotes: 0
Reputation: 147216
MySQL 8.0.4+
From MySQL 8.0.4 onwards, MySQL regexp support changed to the ICU library, and lookaheads are supported. For those versions, this regex will meet your requirements:
'^(?=.*s)(?=.*[acnt])(?!.*[^acnst])'
It uses 3 lookaheads:
(?=.*s)
asserts there is an S
in the string;(?=.*[acnt])
asserts there is at least one of [ACNT]
in the string;(?!.*[^acnst])
asserts that there are no characters other than [ACNST]
in the string.MySQL prior to 8.0.4
This regex will give you the results you want:
^[tacn]*(s[cant]|[tacn]s)[cant]*$
It looks for either
S
preceded by zero or more of [TACN]
and followed by one or more of [TACN]
; or S
preceded by one or more of [TACN]
and followed by zero or more of [TACN]
Query:
SELECT str,
str REGEXP '^[tacn]*(s[cant]|[tacn]s)[cant]*$' AS `match`
FROM test
Output:
str match
SCAN 1
SCATTCC 1
NCAS 1
NTTAAS 1
CATI 0
SCANNF 0
NCASO 0
CANT 0
S 0
SS 0
TS 1
SC 1
STS 0
If you want to allow S
to occur more than once, just change the regex to
^[stacn]*(s[cant]|[tacn]s)[scant]*$
This changes the result for STS
to 1
, while leaving the result for SS
as 0
since it doesn't contain a character other than S
.
Upvotes: 3
Reputation: 3882
This assumes that s
can appear multiple times just like the other allowed characters. The requirements do not explicitly exclude that possibility although the examples do not include such an example. I do not interpret "One letter for sure" to mean "appears only once".
^(s+[tacn][stacn]*|[tacn]+s[stacn]*)$
Upvotes: 2