Get substring in Python

Question

I have variables that hold email subjects such as these:

Snap: Processor
 'ir.basisswap-1702|sydney-ir.basisswap-ricsxml-location_mapping' for
 '20181231' failed [Production2]

and Snap: 'ir.broker.caplet.vol' RBS data valucheck failed [production]

Desired output:

I want to get the text between Snap: and failed:

Processor 'ir.basisswap-1702|sydney-ir.basisswap-ricsxml-location_mapping' for '20181231' and 'ir.broker.caplet.vol' RBS data valucheck

import re

regex1 = r'Snap:\s*(\S+)'
a = re.findall(regex1, mail["Subject"])

Actual output:

Processor for the first subject and ir.broker.caplet.vol for the second.

Barmar · Accepted Answer

\S+ only matches a sequence of non-whitespace characters, so the match ends at the next space.
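To see this behaviour directly, here is a small sketch that runs the original pattern on the two subjects from the question. The first subject is written on a single line here, which is an assumption about how it is actually stored; the variable names are only illustrative.

```python
import re

# Subjects from the question; the first is joined onto one line (an assumption).
first = ("Snap: Processor 'ir.basisswap-1702|sydney-ir.basisswap-ricsxml-location_mapping' "
         "for '20181231' failed [Production2]")
second = "Snap: 'ir.broker.caplet.vol' RBS data valucheck failed [production]"

# The original pattern: \S+ stops at the first whitespace character.
regex1 = r'Snap:\s*(\S+)'

first_match = re.findall(regex1, first)
second_match = re.findall(regex1, second)

print(first_match)   # only the first word after "Snap:"
print(second_match)  # only the quoted name, quotes included
```

In both cases the capture group holds just one whitespace-delimited token, which is why the rest of the subject is lost.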

You want to match until the word failed, so use:

regex1 = r'Snap:\s*(.+?)\s+failed'

You need to use a non-greedy +? quantifier so that it only matches up to the first failed.
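As a rough illustration, the sketch below applies the pattern to the two subjects from the question and then contrasts the non-greedy +? with a greedy +. The helper name extract_snap and the subject containing failed twice are hypothetical, made up only to show the difference.

```python
import re

pattern = re.compile(r'Snap:\s*(.+?)\s+failed')         # non-greedy, as suggested
greedy_pattern = re.compile(r'Snap:\s*(.+)\s+failed')   # greedy, for comparison

def extract_snap(subject, regex=pattern):
    """Return the text between 'Snap:' and 'failed', or None if there is no match."""
    m = regex.search(subject)
    return m.group(1) if m else None

# Subjects from the question (the first is joined onto one line here).
subjects = [
    "Snap: Processor 'ir.basisswap-1702|sydney-ir.basisswap-ricsxml-location_mapping' "
    "for '20181231' failed [Production2]",
    "Snap: 'ir.broker.caplet.vol' RBS data valucheck failed [production]",
]
results = [extract_snap(s) for s in subjects]
for r in results:
    print(r)

# Hypothetical subject that contains "failed" twice.
twice = "Snap: job A failed, retry failed [production]"
lazy_result = extract_snap(twice)                    # stops at the first "failed"
greedy_result = extract_snap(twice, greedy_pattern)  # runs on to the last "failed"
```

Using re.search with a single capture group returns one string per subject; re.findall with the same pattern would return a list, as in the question's code.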

If the subjects contain newline characters, you should also use the re.DOTALL flag so that . will match newline.
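A minimal sketch of the newline case, assuming the first subject really arrives with embedded line breaks as it is displayed in the question. Collapsing the whitespace afterwards is an optional extra step that the answer does not mention.

```python
import re

# Assumed raw form of the first subject, with line breaks inside it.
folded = ("Snap: Processor\n"
          " 'ir.basisswap-1702|sydney-ir.basisswap-ricsxml-location_mapping' for\n"
          " '20181231' failed [Production2]")

regex1 = r'Snap:\s*(.+?)\s+failed'

# Without re.DOTALL, '.' cannot cross the newline, so there is no match.
without_flag = re.search(regex1, folded)

# With re.DOTALL, '.' also matches '\n' and the capture spans all three lines.
with_flag = re.search(regex1, folded, re.DOTALL)

# Optional: collapse runs of whitespace (including newlines) into single spaces.
normalized = " ".join(with_flag.group(1).split())
print(normalized)
```

The same flag can be passed to re.findall or re.compile as the flags argument.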
