ppp

Reputation: 31

Python regex module's fuzzy search doesn't return all matches when using the alternation (|) operator

For example, when I use

regex.findall(r"(?e)(mazda2 standard){e<=1}", "mazda 2 standard")

the result is ['mazda 2 standard'], as expected.

But when I use

regex.findall(r"(?e)(mazda2 standard|mazda 2){e<=1}", "mazda 2 standard")

or

regex.findall(r"(?e)(mazda2 standard|mazda 2){e<=1}", "mazda 2 standard", overlapped=True)

the output doesn't contain 'mazda 2 standard' at all. How can I make the output include 'mazda 2 standard' as well?

Upvotes: 3

Views: 1117

Answers (1)

Wiktor Stribiżew

Reputation: 626927

See the PyPI regex documentation:

By default, fuzzy matching searches for the first match that meets the given constraints. The ENHANCEMATCH flag will cause it to attempt to improve the fit (i.e. reduce the number of errors) of the match that it has found.

The BESTMATCH flag will make it search for the best match instead.

You get mazda 2 with your code because this match contains no errors.
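To see how many errors a given match used, you can inspect the match object's fuzzy_counts attribute, which the regex module documents as a tuple of (substitutions, insertions, deletions). This is only a minimal sketch that reuses the pattern and text from the question; the variable names are mine.

```python
import regex

# Same pattern and text as in the question.
pattern = r"(?e)(mazda2 standard|mazda 2){e<=1}"
text = "mazda 2 standard"

match = regex.search(pattern, text)
if match:
    # fuzzy_counts is a tuple: (substitutions, insertions, deletions)
    substitutions, insertions, deletions = match.fuzzy_counts
    print(repr(match.group(0)), substitutions, insertions, deletions)
```

A match with all three counts at zero is an exact match, so this is a quick way to check which candidate the engine settled on and why.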

So, use the BESTMATCH flag (its inline modifier is (?b)):

>>> import regex
>>> regex.findall(r"(?be)(mazda2 standard|mazda 2){e<=1}", "mazda 2 standard")
['mazda 2 standard']
>>> 
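If you prefer not to embed modifiers in the pattern, the same flags can be passed through the flags argument. As far as I know, regex.BESTMATCH and regex.ENHANCEMATCH are global flags, so this should behave the same as the inline (?be) form. Treat it as a sketch; the variable names inline and flagged are only for illustration.

```python
import regex

text = "mazda 2 standard"

# Inline modifiers, as in the answer above.
inline = regex.findall(r"(?be)(mazda2 standard|mazda 2){e<=1}", text)

# The same flags passed as an argument instead of inline modifiers.
flagged = regex.findall(
    r"(mazda2 standard|mazda 2){e<=1}",
    text,
    flags=regex.BESTMATCH | regex.ENHANCEMATCH,
)

print(inline, flagged)
```

The flags form keeps the pattern string free of modifiers, which can help when the alternatives are built from a list at runtime.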

Upvotes: 1
