Lisa

Reputation: 959

Regex to match required/optional elements and nothing else

I need a regex to match two items in any order, but also allow two other optional elements. The regex should only match these two to four items, and nothing else.

For example, I want to match "high" and "unrounded" with optional "back" and "tense".

Input:

high back tense unrounded     # ==> match (two required elements + two optional)
high unrounded                # ==> match (just two required, no optional elements)
unrounded high                # ==> match (two required elements, any order)
high back unrounded           # ==> match (two required elements and one optional one)
tense unrounded back high     # ==> match (any order + optional elements)
lax unrounded                 # ==> no match (doesn't include one required element)
high back tense unrounded lax # ==> no match (includes more than the two required and two optional elements)

Here's my current regex:

(?i)(?=.*high)((?=.*back))?((?=.*tense))?(?=.*unrounded)

It matches everything I want, but it also matches inputs like the last example, which I don't want. How can I make it NOT match a string that contains anything beyond the two required elements and the two optional elements?

Upvotes: 1

Views: 285

Answers (1)

41686d6564

Reputation: 19641

Well, if you insist on using regex, you can try something like this:

^(?=.*high)(?=.*unrounded)(?:(?:high|back|tense|unrounded)(?: |$))+$


Details:

  • ^(?=.*high)(?=.*unrounded) Makes sure the string contains both "high" and "unrounded".

  • (?: Start of the outer non-capturing group.

    • (?:high|back|tense|unrounded) A non-capturing group - matches any of these 4 words.

    • (?: |$) Matches a space character or asserts position at the end of the line.

  • )+ End of the outer non-capturing group, which is repeated one or more times.

  • $ Asserts position at the end of the line.
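To check the pattern above against the inputs from the question, here is a minimal sketch in Python. The language choice, the helper name `is_match`, and the use of `re.IGNORECASE` (standing in for the `(?i)` flag from the question) are assumptions of mine, not something the question specifies.

```python
import re

# Pattern copied from the answer. re.IGNORECASE is an assumption that
# mirrors the (?i) flag used in the question's original attempt.
PATTERN = re.compile(
    r"^(?=.*high)(?=.*unrounded)(?:(?:high|back|tense|unrounded)(?: |$))+$",
    re.IGNORECASE,
)


def is_match(line: str) -> bool:
    """Hypothetical helper: True if the whole line satisfies the pattern."""
    return PATTERN.search(line) is not None


# Sample inputs taken from the question.
samples = [
    "high back tense unrounded",
    "high unrounded",
    "unrounded high",
    "high back unrounded",
    "tense unrounded back high",
    "lax unrounded",
    "high back tense unrounded lax",
]

for sample in samples:
    print(f"{sample!r}: {is_match(sample)}")
```

The two lookaheads only check that both required words are present; it is the anchored, repeated group that rejects any word outside the four allowed ones.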


Note: If you need to make sure none of the words is repeated in the string, you can add a negative lookahead after each one of them. For example:

^(?=.*high)(?=.*unrounded)(?:(?:high(?!.*high)|back(?!.*back)|tense(?!.*tense)|unrounded(?!.*unrounded))(?: |$))+$

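If the word list is likely to change, the no-repeat pattern can be generated instead of typed by hand. The following is only a sketch in Python; `build_pattern`, `WORDS`, and `REQUIRED` are hypothetical names, and `re.IGNORECASE` is again an assumption carried over from the `(?i)` flag in the question.

```python
import re

# Hypothetical names for illustration; the word lists come from the question.
WORDS = ("high", "back", "tense", "unrounded")
REQUIRED = ("high", "unrounded")


def build_pattern(words, required):
    """Build the no-repeat regex from the note above for the given words."""
    # One positive lookahead per required word.
    lookaheads = "".join(f"(?=.*{re.escape(w)})" for w in required)
    # Each allowed word carries a negative lookahead that forbids the
    # same word from appearing again later in the string.
    alternatives = "|".join(
        f"{re.escape(w)}(?!.*{re.escape(w)})" for w in words
    )
    return re.compile(
        rf"^{lookaheads}(?:(?:{alternatives})(?: |$))+$", re.IGNORECASE
    )


NO_REPEAT = build_pattern(WORDS, REQUIRED)

for sample in ("high unrounded", "high high unrounded", "high back back unrounded"):
    print(f"{sample!r}: {NO_REPEAT.search(sample) is not None}")
```

One caveat: the negative lookaheads look for substrings, so this approach assumes that no allowed word is contained inside another allowed word, which holds for the four words here.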

Upvotes: 1
