Reputation: 1667
I'm not good at regex and am trying to write two regexes.
Regex1:
All specified words in any order but nothing else. (repetition allowed).
Regex2:
All specified words in any order but nothing else. (repetition not allowed).
Words:
aaa, bbb, ccc
Strings:
aaa ccc bbb
aaa ccc
aaa bbb ddd ccc
bbb aaa bbb ccc
Regex1 should evaluate the above strings as:
true -> all words present in any order
false -> bbb is missing
false -> unknown word 'ddd'
true -> all words present in any order and repetition is allowed
Regex2 should evaluate the above strings as:
true -> all words present in any order
false -> bbb is missing
false -> unknown word 'ddd'
false -> repetition not allowed
My Attempt
/^(?=.*\baaa\b)(?=.*\bbbb\b)(?=.*\bccc\b).*$/
I'm asking for learning purposes, so please explain your answer.
Upvotes: 15
Views: 1208
Reputation: 18490
Without repetition (regex101):
^(?:(aaa|bbb|ccc)(?!.*?\b\1) ?\b){3}$
And with repetition (regex101):
^(?=.*?\baaa)(?=.*?\bbbb)(?=.*?\bccc)(?:(aaa|bbb|ccc) ?\b)+$
Two more ideas. An explanation of each regex is shown on the right side at regex101.
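To see how these two patterns behave without opening regex101, here is a small sketch that runs both against the four sample strings from the question. The variable names (noRepetition, withRepetition, samples, results) are made up for illustration; only the patterns themselves come from this answer.

```javascript
// Patterns copied verbatim from the answer above.
const noRepetition = /^(?:(aaa|bbb|ccc)(?!.*?\b\1) ?\b){3}$/;
const withRepetition = /^(?=.*?\baaa)(?=.*?\bbbb)(?=.*?\bccc)(?:(aaa|bbb|ccc) ?\b)+$/;

// Sample strings from the question.
const samples = ['aaa ccc bbb', 'aaa ccc', 'aaa bbb ddd ccc', 'bbb aaa bbb ccc'];

// One row per sample: the input and what each pattern says about it.
const results = samples.map((s) => ({
  input: s,
  noRepetition: noRepetition.test(s),
  withRepetition: withRepetition.test(s),
}));

console.table(results);
```

Only the last sample, 'bbb aaa bbb ccc', should differ between the two patterns: the (?!.*?\b\1) lookahead rejects it because bbb occurs again later in the string.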
Upvotes: 6
Reputation: 1809
For Regex 1:
var re = /^(?=.*?\baaa\b)(?=.*?\bbbb\b)(?=.*?\bccc\b)\b(?:aaa|bbb|ccc)\b(?: +\b(?:aaa|bbb|ccc)\b)*$/;
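var res = document.getElementById('result');
// Append each test result to the #result element.
res.innerText += re.test('aaa ccc bbb');          // true: all three words, nothing else
res.innerText += ', ' + re.test('aaa ccc ddd');   // false: bbb is missing and ddd is unknown
res.innerText += ', ' + re.test('aaa ddd bbb');   // false: ccc is missing and ddd is unknown
res.innerText += ', ' + re.test('ccc bbb ccc');   // false: aaa is missing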
<div id="result"></div>
Your code already does part of the job. Your positive lookaheads check that all words appear somewhere, but not that they are the only words present. To enforce that, the pattern is anchored at the start of the string with the circumflex (^), followed by \b(?:aaa|bbb|ccc)\b, a non-capturing group that matches the first word. This is then followed by any number of further words, each preceded by at least one space: (?: +\b(?:aaa|bbb|ccc)\b)* is basically the same pattern, but with a space and a + in front, and wrapped in a *. (Use \s+ instead if tabs or other whitespace should also count as separators.) Finally, the string has to end somewhere, so that nothing else can follow. This is done with the dollar sign $.
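The same construction can be generated from a word list instead of being typed by hand. The following is only a sketch: buildAllWordsRegex is a hypothetical helper, not something from the question, and it assumes the words consist of plain letters or digits, so no regex escaping is done.

```javascript
// Hypothetical helper: builds the "all words, nothing else, repetition allowed"
// pattern described above. Assumes every word is made of regex-safe characters.
function buildAllWordsRegex(words) {
  // (?:aaa|bbb|ccc) - any one of the allowed words
  const alternation = `(?:${words.join('|')})`;
  // One positive lookahead per word, so each word must occur somewhere
  const lookaheads = words.map((w) => `(?=.*?\\b${w}\\b)`).join('');
  // First word, then any number of further words separated by spaces
  return new RegExp(`^${lookaheads}\\b${alternation}\\b(?: +\\b${alternation}\\b)*$`);
}

const allWordsRe = buildAllWordsRegex(['aaa', 'bbb', 'ccc']);

console.log(allWordsRe.source);
console.log(allWordsRe.test('bbb aaa bbb ccc')); // repetition is allowed here
console.log(allWordsRe.test('aaa bbb ddd ccc')); // ddd is not in the list
```

For the three words from the question, the generated source should be identical to the hand-written pattern in this answer.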
For Regex 2:
The basic strategy is the same. You just add a negative lookahead to check that a matched word does not occur again:
//var re = /^(?=.*?\baaa\b)(?!.*?\baaa\b.*?\baaa\b)(?=.*?\bbbb\b)(?!.*?\bbbb\b.*?\bbbb\b)(?=.*?\bccc\b)(?!.*?\bccc\b.*?\bccc\b)\b(?:aaa|bbb|ccc)\b(?:\s+\b(?:aaa|bbb|ccc)\b)*$/;
// optimized version, see comments
var re = /^(?=.*?\baaa\b)(?=.*?\bbbb\b)(?=.*?\bccc\b)(?!.*?\b(\w+)\b.*?\b\1\b)\b(?:aaa|bbb|ccc)\b(?: +\b(?:aaa|bbb|ccc)\b)*$/;
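var res = document.getElementById('result');
// Append each test result to the #result element.
res.innerText += re.test('aaa ccc bbb');              // true: each word exactly once
res.innerText += ', ' + re.test('aaa ccc ddd');       // false: bbb is missing and ddd is unknown
res.innerText += ', ' + re.test('aaa bbb aaa');       // false: aaa is repeated and ccc is missing
res.innerText += ', ' + re.test('aaa ccc bbb ccc');   // false: ccc is repeated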
<div id="result"></div>
In the first (commented-out) version, we have the positive lookahead (?=.*?\bword\b) to check that the word exists. We follow that with the negative lookahead (?!.*?\bword\b.*?\bword\b) to check that the word does not occur more than once. Repeat for all words. Presto!
Update: Instead of checking that each specific word isn't repeated, we can check that NO word at all is repeated by using the (?!.*?\b(\w+)\b.*?\b\1\b) construct. This makes the regex more concise. Thanks to @revo for pointing it out.
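To see what that backreference lookahead does on its own, it can be pulled out of the full regex and tested in isolation. This is just an illustrative sketch; the name noRepeatedWord is made up, and the leading ^ is added here so that the lookahead is only evaluated from the start of the string.

```javascript
// The "no word occurs twice" guard from the answer, tested by itself.
// (\w+) captures a whole word; \b\1\b looks for the same whole word later on.
const noRepeatedWord = /^(?!.*?\b(\w+)\b.*?\b\1\b)/;

console.log(noRepeatedWord.test('aaa ccc bbb'));  // true: no word repeats
console.log(noRepeatedWord.test('aaa bbb aaa'));  // false: aaa occurs twice
console.log(noRepeatedWord.test('aaa aaab'));     // true: aaab is a different word
```

Note that this guard rejects any repeated word, not only aaa, bbb, or ccc. That is fine in the full regex because the rest of the pattern only allows those three words anyway.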
Upvotes: 3
Reputation: 2196
Do not use regex for uniqueness.
To match whole, separate words in a regex, though, you can use the word boundary \b
Example: /\b(word1|word2|word3)\b/
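A quick sketch of the difference \b makes, using the words from the question in place of word1, word2, word3 (the variable names are made up for illustration):

```javascript
// With word boundaries: the alternative must stand alone as a whole word.
const wholeWord = /\b(aaa|bbb|ccc)\b/;
// Without word boundaries: the alternative may appear inside a longer word.
const anywhere = /(aaa|bbb|ccc)/;

console.log(wholeWord.test('xx bbb yy')); // true: bbb is a separate word
console.log(wholeWord.test('xxbbbyy'));   // false: bbb is buried inside xxbbbyy
console.log(anywhere.test('xxbbbyy'));    // true: no boundary check
```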
Upvotes: 1
Reputation: 135
Why do you need regex for this, though? You could achieve what you want easily by first splitting the string on its delimiter (a space in your examples). You can then create a dictionary object with the words that you are looking for as the keys and the values defaulted to -1.
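The check that allows repetition can be done by looping through the input words and testing whether each one exists as a key in the dictionary object. The check that forbids repetition works similarly, except that when a key matches an input word its value is changed to 1, and if that key is visited again, a failed match is returned. To also confirm that every required word is present, mark matched keys in both cases and make sure no value is still -1 at the end.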
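A minimal sketch of this non-regex approach might look like the following. The function name matchesWords and its allowRepetition flag are made up for illustration, a Map with true/false flags stands in for the dictionary with -1/1 values, and the input is assumed to be separated by whitespace as in the question's examples.

```javascript
// Hypothetical helper: checks that `input` contains all of `words`, in any
// order, and nothing else. Repeats are accepted only if allowRepetition is true.
function matchesWords(input, words, allowRepetition) {
  // Every required word starts out as "not seen yet".
  const seen = new Map(words.map((w) => [w, false]));

  for (const token of input.trim().split(/\s+/)) {
    if (!seen.has(token)) return false;                    // unknown word
    if (seen.get(token) && !allowRepetition) return false; // repeated word
    seen.set(token, true);
  }

  // Succeed only if every required word was seen at least once.
  return [...seen.values()].every(Boolean);
}

const words = ['aaa', 'bbb', 'ccc'];
console.log(matchesWords('aaa ccc bbb', words, false));     // true
console.log(matchesWords('bbb aaa bbb ccc', words, false)); // false: bbb repeats
console.log(matchesWords('bbb aaa bbb ccc', words, true));  // true
```

An empty input splits into a single empty token, which is not in the map, so it is rejected as an unknown word.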
Upvotes: 2