Reputation: 49247
I am trying to convert a search query like this:
bridge AND (car OR boat)
Into a regex that would match against this:
My car goes over bridge.
I am close to getting it (I think), this is what I have so far:
.*(bridge).*(car|boat)
That doesn't work, but this does:
.*(car|boat).*(bridge)
My regex seems to be dependent on the order of the words in the string. Is there a way to match parameters without caring what order they are in?
Upvotes: 0
Views: 74
Reputation: 3039
You could use lookahead assertions (?= ... ) to accomplish this. Such assertions spare you from spelling out every permutation of the terms with alternation ( | ).
For example:
^(?=.*?\bbridge\b)(?=.*?\b(car|boat)\b)
Since assertions are "zero-width", once either assertion in this example has been evaluated you are still at the beginning of the string. Effectively this pattern says "match the beginning of the string" and "make sure that both 'bridge' and either 'car' or 'boat' are found at some point after the beginning of the string".
Each assertion would correspond to the AND part of your query; the OR would be handled by the alternation. This could change when your query changes, but holds for your example.
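As a minimal sketch of how this pattern behaves, here it is in Python's re module (my choice of language; the question does not name one). The helper name matches is made up for illustration, and the sample sentences other than the one from the question are my own.

```python
import re

# Lookahead pattern from above: each (?=...) is one AND term,
# and the alternation inside the second lookahead is the OR.
pattern = re.compile(r"^(?=.*?\bbridge\b)(?=.*?\b(car|boat)\b)")

def matches(text):
    """Return True if text satisfies: bridge AND (car OR boat)."""
    return pattern.search(text) is not None

print(matches("My car goes over bridge."))    # True
print(matches("The bridge is above a boat"))  # True
print(matches("My car goes fast."))           # False, no "bridge"
print(matches("A carpet on the bridge"))      # False, \b rejects "carpet"
```

Note that . does not match newlines by default, so for multi-line input you would probably want re.DOTALL.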
Upvotes: 1
Reputation:
Short answer: No, not in a single regular expression. A regexp is for matching an ordered sequence of characters.
You could generate a pattern that explicitly allows for both orderings, of course. I.e. if you want to match A and B in any order, you'd generate something like: (?:.*A.*B)|(?:.*B.*A). But covering all permutations would yield a rather huge regexp as the number of terms grows, since n terms have n! orderings.
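To make the growth concrete, here is a rough Python sketch that generates such a pattern. The function name any_order_pattern is hypothetical, and the terms are escaped with re.escape on the assumption that they are literal words.

```python
import itertools
import re

def any_order_pattern(terms):
    """Build one alternative per ordering of the terms."""
    alternatives = []
    for perm in itertools.permutations(terms):
        body = ".*".join(re.escape(term) for term in perm)
        alternatives.append("(?:.*" + body + ")")
    return "|".join(alternatives)

print(any_order_pattern(["A", "B"]))  # (?:.*A.*B)|(?:.*B.*A)

# Four terms already produce 4! = 24 alternatives.
print(len(any_order_pattern(["A", "B", "C", "D"]).split("|")))  # 24
```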
A better solution is probably to match each term with a separate regular expression and combine matches yourself, e.g. by implementing a simple boolean expression tree.
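One possible shape for that, sketched in Python under my own assumptions: the query is already parsed into nested tuples (the tuple layout and the name evaluate are made up for illustration), and each term is matched as a whole word with its own small regex.

```python
import re

# Hypothetical expression tree: a node is either a term (str)
# or a tuple (operator, left, right) with operator "AND" or "OR".
def evaluate(node, text):
    if isinstance(node, str):
        # One regex per term; order in the text no longer matters.
        return re.search(r"\b" + re.escape(node) + r"\b", text) is not None
    op, left, right = node
    if op == "AND":
        return evaluate(left, text) and evaluate(right, text)
    if op == "OR":
        return evaluate(left, text) or evaluate(right, text)
    raise ValueError(f"unknown operator: {op}")

# bridge AND (car OR boat)
query = ("AND", "bridge", ("OR", "car", "boat"))

print(evaluate(query, "My car goes over bridge."))  # True
print(evaluate(query, "My car goes fast."))         # False
```

Parsing the query string into such a tree is left out here; this only shows the evaluation step.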
Upvotes: 2