Reputation: 1744
I'm trying to build a search query which searches for a word in a string and finds matches based on the following criteria:
For example, if the word is 'php' the following strings would be matches:
But for instance it wouldn't match:
I've tried the following query:
SELECT * FROM candidate WHERE skillset REGEXP '^|[., ]php[., ]|$'
However that doesn't work, it returns every record as a match which is wrong.
Without the ^| and |$ in there, i.e.
SELECT * FROM candidate WHERE skillset REGEXP '[., ]php[., ]'
It successfully finds matches where 'php' is somewhere in the string except the start and end of the string. So the problem must be with the ^| and |$ part of the regexp.
How can I add those conditions in to make it work as required?
Upvotes: 0
Views: 55
Reputation: 190
Try '\bphp\b'
, \b
is a word boundary and might just be exactly what you need because it looks for the whole word php.
For MySQL, word boundaries are represented with [[:<:]] and [[:>:]] instead of \b, so use the query '[[:<:]]php[[:>:]]'. More info on word boundaries here.
Upvotes: 1
Reputation: 468
Well, you can play around a bit with regex101.com
Something I found that works for you but doesn't exactly follow your rules is:
/(?=[" ".,]?php[" ".,]?)(?=php[\W])/
This uses the lookahead operator, ?=, to do AND The first portion of the regex is
[" ".,]?php[" ".,]?
This will match anything that has a space, period, or comma before or after the php, but at most only one.
The section portion of the regex is
php[\W]
This will match anything that is php, followed by a non-character. In other words, it will NOT match php followed by a character, digit, or underscore.
It's not the perfect answer for your set of rules, but it does work with your sample data set. Play around on regex101.com and try to make a perfect one.
Upvotes: 1