Regex match only if multiple patterns found (python)

Question

I'm trying to extract data from sentences such as:

"monthly payment of 525 and 5000 drive off"

using Python's regex search function, re.search().

My regex pattern for the down payment is as follows:

match1 = r"(?P<down_payment>\d+)\s*(|\$|dollars*|money)*\s*" + \
         r"(down|drive(\s|-)*off|due\s*at\s*signing|drive\s*-*\s*off)*"

My problem is that it matches the wrong numerical value as the down payment: it picks up both 525 and 5000.

How can I improve my regex string such that it only matches an element if another element is successfully matched as well?

In this case, for example, both 5000 and drive-off matched, so we can extract 5000 as down_payment. But 525 is not followed by any of the down payment keywords, so it should not be considered at all.


Wiktor Stribiżew · Accepted Answer

I suggest removing the final * quantifier so that the keyword group is required and must match exactly once:

(?P<down_payment>\d+)\s*(?:\$|dollars*|money)?\s*(down|drive[\s-]*off|due\s*at\s*signing|drive\s*-*\s*off)
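
To see the difference in practice, here is a minimal sketch using re.search(). The group name down_payment is an assumption taken from the wording of the question, and the helper find_down_payment and the constant names are hypothetical. LOOSE is the same pattern with the trailing * put back, to show why a bare number like 525 was accepted before.

```python
import re

# Pattern from the answer; the group name "down_payment" is assumed
# from the question's wording.
STRICT = (
    r"(?P<down_payment>\d+)\s*(?:\$|dollars*|money)?\s*"
    r"(down|drive[\s-]*off|due\s*at\s*signing|drive\s*-*\s*off)"
)

# Same pattern with a trailing "*": the keyword group becomes optional,
# so any number matches on its own.
LOOSE = STRICT + "*"


def find_down_payment(sentence, pattern=STRICT):
    """Return the captured number as a string, or None if nothing matches."""
    m = re.search(pattern, sentence)
    return m.group("down_payment") if m else None


sentence = "monthly payment of 525 and 5000 drive off"
print(find_down_payment(sentence, LOOSE))   # 525  (keyword was optional)
print(find_down_payment(sentence))          # 5000 (keyword is required)
print(find_down_payment("monthly payment of 525"))  # None
```

With the keyword group required, the regex engine fails at 525 (nothing after it matches a keyword) and moves on until it reaches 5000 followed by "drive off".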

