Reputation: 2873
I want to search a text for a string like "W foo X bar Y baz Z". W, X, Y and Z are unimportant separators, and I must not search for them; foo, bar and baz are the words I am interested in. The order is not that important. I want to know how well my required words occur in the text.
I am trying the following:
(?:\Qfoo\E)?.{0,3}(?:\Qbar\E)?.{0,3}(?:\Qbaz\E)?
My reasoning is:
This regex always matches, since it consists only of optional groups, but the resulting match is always empty, even when it could fully match all of the optional groups. However, I want to post-process the resulting match, so I need it to capture as much as possible.
Can I force the regex to try to match all groups as far as possible?
Or do you have any idea how to search for several words separated by something, and then check which words occurred in order to calculate some similarity?
Thank you very much
Upvotes: 1
Views: 1440
Reputation: 4345
I think you will have difficulty solving this problem with a regex alone.
I suggest you have a look at a powerful Scala feature named parser combinators.
With it, you can combine regexes for matching the inner elements with parsing strategies for locating them.
Here is a clear and neat post where you'll find relevant information about parser combinators.
One approach is to model your content with a grammar like this:
delim = "[a-z]{0,3}".r
value = "foo|bar|baz".r
expr = rep(delim ~ value)
My 2c
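For readers who want to try the "delimiter, then value, repeated" idea without pulling in the parser combinator library, here is a minimal sketch using only `scala.util.matching.Regex`. It is not the answerer's code: the object `WordRuns`, the method `runs` and the parameter `maxGap` are hypothetical names, and the gap is assumed to be "at most 3 arbitrary characters", as in the question's `.{0,3}`. It reports each run of wanted words whose neighbours are at most `maxGap` characters apart.

```scala
import java.util.regex.Pattern
import scala.util.matching.Regex

object WordRuns {
  // Hypothetical helper: a non-capturing alternation of the literal words.
  private def alternation(words: Seq[String]): String =
    words.map(Pattern.quote).mkString("(?:", "|", ")")

  // For each run of wanted words separated by at most maxGap arbitrary
  // characters, return the words of that run in the order they occur.
  def runs(text: String, words: Seq[String], maxGap: Int = 3): List[List[String]] = {
    val word   = alternation(words)
    // One wanted word, followed by any number of (short gap + wanted word).
    val run    = new Regex(s"$word(?:.{0,$maxGap}?$word)*")
    val single = new Regex(word)
    // Re-scan each matched run to list the individual words it contains.
    run.findAllIn(text).toList.map(r => single.findAllIn(r).toList)
  }
}
```

Because the run pattern starts with a required word rather than an optional group, it cannot succeed on an empty match. Re-scanning each run with the plain alternation is a simplification that assumes the wanted words do not overlap one another.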
Upvotes: 5
Reputation: 13125
As a first guess, I would try a regular expression like this:
(foo|bar|baz|anyothercombination)
and then use the number of matches.
(I will need to look this up; I can get back to you if you want a snippet.)
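A rough sketch of what such a snippet might look like in Scala, assuming the wanted words are plain literals and that "similarity" means the fraction of wanted words found at least once. The names `WordScore`, `occurrences` and `score` are made up for illustration, and the scoring rule is an assumption, not something the question specifies.

```scala
import java.util.regex.Pattern
import scala.util.matching.Regex

object WordScore {
  // Counts every occurrence of each wanted word in the text.
  def occurrences(text: String, words: Seq[String]): Map[String, Int] = {
    // Quote each word so regex metacharacters in it are matched literally.
    val anyWord = new Regex(words.map(Pattern.quote).mkString("|"))
    anyWord.findAllIn(text).toList
      .groupBy(identity)
      .map { case (w, hits) => w -> hits.size }
  }

  // Assumed scoring rule: fraction of wanted words seen at least once.
  def score(text: String, words: Seq[String]): Double =
    if (words.isEmpty) 0.0
    else occurrences(text, words).size.toDouble / words.distinct.size
}
```

For the text "W foo X bar Y foo Z" and the words foo, bar and baz, `occurrences` gives foo twice and bar once, so `score` is 2/3. Note that an alternation tries its branches from left to right, so if one wanted word is a prefix of another, the longer one should be listed first.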
Upvotes: 2