Reputation: 1
I am trying to find two separate groups of text within a document using REGEX in an application. Example:
(facility services|MAFS|MFA|facility|facilities)
(agreement|lease)
I only want to identify documents that match one word from each set, such as facility and agreement. How would I write that in REGEX?
Upvotes: 0
Views: 210
Reputation:
This is commonly known as Out-Of-Order matching.
If you end up with more than 2 sets, the only efficient way to do it is with an engine that supports conditional constructs.
For your 2 sets, this is it:
(?:.*?\b(?:(?(1)(?!))(facility|MAFS|MFA|facilities)|(?(2)(?!))(agreement|lease))\b){2}
Readable version:
(?:
    .*?
    \b
    (?:
        (?(1)
            (?!)
        )
        (                  # (1 start)
            facility
          | MAFS
          | MFA
          | facilities
        )                  # (1 end)
      |
        (?(2)
            (?!)
        )
        ( agreement | lease )   # (2)
    )
    \b
){2}
Upvotes: 0
Reputation: 782166
If you're just looking for two matches, just search for both of them in either order using alternation.
((MAFS|MFA|facility|facilities)[\s\S]*(agreement|lease))|((agreement|lease)[\s\S]*(MAFS|MFA|facility|facilities))
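For illustration, here is a rough sketch of how that pattern could be applied in Python. The names `EITHER_ORDER` and `matches_both_sets` are made up for this example, and the application in the question may use a different regex engine.

```python
import re

# The either-order alternation from this answer, split over two lines.
EITHER_ORDER = re.compile(
    r"((MAFS|MFA|facility|facilities)[\s\S]*(agreement|lease))"
    r"|((agreement|lease)[\s\S]*(MAFS|MFA|facility|facilities))"
)


def matches_both_sets(text):
    """Return True if a word from each set occurs in text, in either order."""
    return EITHER_ORDER.search(text) is not None
```

Note that this pattern has no word boundaries, so "lease" would also match inside "leased".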
If there are more patterns this doesn't scale well because of combinatorial explosion, so lookaheads are the solution. See Regular Expressions: Is there an AND operator?
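A minimal sketch of the lookahead approach in Python, using the two word sets from the question. The names `BOTH_SETS` and `has_both_sets` and the `\b` word boundaries are assumptions added for this example, not something the linked question prescribes.

```python
import re

# One lookahead per set. Each asserts that some word from that set
# appears anywhere in the text, so the order of the words is irrelevant.
BOTH_SETS = re.compile(
    r"(?=[\s\S]*\b(?:facility services|MAFS|MFA|facility|facilities)\b)"
    r"(?=[\s\S]*\b(?:agreement|lease)\b)"
)


def has_both_sets(text):
    """Return True if text contains a word from each of the two sets."""
    # match() anchors at the start; the lookaheads scan the whole text from there.
    return BOTH_SETS.match(text) is not None
```

With this shape, adding a third set means adding one more lookahead rather than writing out every ordering.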
Upvotes: 1