hotzen

Reputation: 2873

Forcing a Regular Expression to match optional groups

I want to search a text for a string like "W foo X bar Y baz Z". W, X, Y and Z are unimportant separators, and I must not search for them. foo, bar and baz are the words I am interested in. Their order is not that important. I want to know how well my required words occur in the text.

I am trying the following:

(?:\Qfoo\E)?.{0,3}(?:\Qbar\E)?.{0,3}(?:\Qbaz\E)?

My problem is:

This regex always matches, since it consists only of optional groups, but the resulting match is always empty, even when it could fully match all of the optional groups. I want to post-process the resulting match, so I need it to capture as much as possible.

Can I force the Regex to try matching all groups as far as possible?

Or do you have any idea how to search for several words separated by arbitrary text, and then check which of the words occurred in order to calculate some similarity?

Thank you very much

Upvotes: 1

Views: 1440

Answers (2)

Andy Petrella

Reputation: 4345

I think you'll have some difficulty tackling your problem with a regex alone.

I suggest you have a look at a powerful Scala feature called parser combinators.

With them, you can combine regexes for matching the inner elements with parsing strategies for finding them.

Here is a clear and neat post where you'll find relevant information about parser combinators.

One option is to see your content as:

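// Sketch; assumes these definitions sit inside an object extending
// scala.util.parsing.combinator.RegexParsers
val delim = "[a-z]{0,3}".r
val value = "foo|bar|baz".r
def expr: Parser[Any] = delim ~ value ~ opt(expr) // opt(...) lets the recursion end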
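
To make the grammar idea concrete without pulling in the parser combinator module (it ships separately from the Scala standard library), here is a minimal hand-rolled sketch that mirrors it using only `Regex.findPrefixOf`. It makes two assumptions that are not in the answer above: the recursion is read as optional so that it can terminate, and the delimiter is changed to "up to three characters that are not lowercase letters" so that it cannot swallow part of a word. The name `GrammarSketch` is purely illustrative.

```scala
import scala.util.matching.Regex

object GrammarSketch {
  // Assumption: a delimiter is at most three characters, none of them a
  // lowercase letter, so it can never consume part of foo/bar/baz.
  val delim: Regex = "[^a-z]{0,3}".r
  val value: Regex = "foo|bar|baz".r

  // Mirrors: expr = delim ~ value ~ opt(expr)
  // Returns the words recognised from the start of the input and stops at
  // the first position where the grammar no longer fits.
  def expr(input: String): List[String] = {
    // Skip an optional delimiter at the front of the input.
    val delimLength = delim.findPrefixOf(input).map(_.length).getOrElse(0)
    val afterDelim = input.drop(delimLength)
    // Then require one of the words; recurse on whatever follows it.
    value.findPrefixOf(afterDelim) match {
      case Some(word) => word :: expr(afterDelim.drop(word.length))
      case None       => Nil
    }
  }
}
```

Calling `GrammarSketch.expr("W foo X bar Y baz Z")` yields `List("foo", "bar", "baz")`, and the length of that list can feed whatever similarity measure you choose. Note that this is stricter than a plain search for the words: a separator longer than three characters, or lowercase text before the first word, ends the parse early.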

My 2c

Upvotes: 5

rtpHarry

Reputation: 13125

As a first guess, I would try a regular expression like this:

(foo|bar|baz|anyothercombination)

and then use the match count property.

(I will just need to look this up and get back to you if you want a snippet.)
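
A minimal sketch of that idea in Scala, assuming the three words from the question and a very simple similarity measure (the fraction of distinct words that occur at least once). The `WordScore` object, its method names and the scoring formula are illustrative choices, not something taken from the question or from a library.

```scala
import scala.util.matching.Regex

object WordScore {
  // The words of interest, taken from the question.
  val words: List[String] = List("foo", "bar", "baz")

  // Alternation of the words, each one quoted so it is matched literally.
  val pattern: Regex = words.map(Regex.quote).mkString("|").r

  // Every occurrence of any of the words, in the order they appear.
  def occurrences(text: String): List[String] =
    pattern.findAllIn(text).toList

  // Hypothetical similarity: share of distinct words found at least once.
  def score(text: String): Double =
    occurrences(text).distinct.size.toDouble / words.size
}
```

Because each word is searched for on its own, nothing in the pattern is optional, so a match always contains a real word, and the separators never need to be described. `WordScore.occurrences(text)` also preserves order, which leaves room for a score that rewards the expected sequence if that turns out to matter.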

Upvotes: 2
