findall not returning all the results in Python 3.7

Question

I am trying to create a list of tuples with the data that follows the strings string1 and string3, but I am not getting the expected result.

import re
s = 'string1:1234string2string3:a1b2c3string1:2345string3:b5c6d7'
re.findall(r'string1:(\d+)[\s,\S]+string3:([\s\S]+)', s)

Actual result:

[('1234', 'b5c6d7')]

Expected result:

[('1234', 'a1b2c3'), ('2345', 'b5c6d7')]

The fourth bird · Accepted Answer

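Your current regex uses [\s,\S]+, which is greedy: it first consumes every character up to the end of the string (newlines included) and then backtracks only as far as the last string3:. The whole input is therefore consumed by a single match, and findall has nothing left to find.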
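As a quick check of that explanation, here is a minimal sketch that reuses the pattern from the question (with the closing parenthesis of the second group in place, which is an assumption about what was intended) and looks at how much of the input the one match covers. The variable names are only for illustration.

```python
import re

s = 'string1:1234string2string3:a1b2c3string1:2345string3:b5c6d7'

# Pattern from the question; the final ')' is assumed to have been intended.
greedy = re.compile(r'string1:(\d+)[\s,\S]+string3:([\s\S]+)')

m = greedy.search(s)
# The single match runs from index 0 to the end of the input,
# so there is no text left over for a second match.
covers_everything = m.span() == (0, len(s))
greedy_result = greedy.findall(s)

print(covers_everything)  # True
print(greedy_result)      # [('1234', 'b5c6d7')]
```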

You could make the quantifiers non-greedy and end the pattern with a positive lookahead, (?=string|$), which asserts that what follows the second group is either string or the end of the string ($).

string1:(\d+).*?string3:(.*?)(?=string|$)
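To see why the lookahead is needed, compare the pattern with and without it. This is only an illustrative sketch; the variant without the lookahead is a hypothetical pattern written for comparison, not something from the question. A lazy group at the very end of a pattern is satisfied by matching nothing, so it captures an empty string unless something after it forces it to grow.

```python
import re

s = 'string1:1234string2string3:a1b2c3string1:2345string3:b5c6d7'

# Hypothetical variant for comparison: same pattern, lookahead removed.
without_lookahead = re.findall(r'string1:(\d+).*?string3:(.*?)', s)

# Pattern from the answer: the lookahead makes the lazy group expand
# until the next 'string' or the end of the input.
with_lookahead = re.findall(r'string1:(\d+).*?string3:(.*?)(?=string|$)', s)

print(without_lookahead)  # [('1234', ''), ('2345', '')]
print(with_lookahead)     # [('1234', 'a1b2c3'), ('2345', 'b5c6d7')]
```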

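import re
s = 'string1:1234string2string3:a1b2c3string1:2345string3:b5c6d7'
# A raw string keeps \d from being treated as a string escape
print(re.findall(r'string1:(\d+).*?string3:(.*?)(?=string|$)', s))
# [('1234', 'a1b2c3'), ('2345', 'b5c6d7')]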
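One difference from the original pattern is worth noting: [\s\S] matches newlines, while . does not by default. If the real data can contain line breaks between the markers, passing re.DOTALL may be needed. The input below is an assumed variation of the sample string with a newline inserted, purely for illustration.

```python
import re

# Assumed input for illustration: a newline sits between string2 and string3.
s = 'string1:1234string2\nstring3:a1b2c3string1:2345string3:b5c6d7'
pattern = r'string1:(\d+).*?string3:(.*?)(?=string|$)'

# Without DOTALL, '.' stops at the newline, so the first pair is missed.
default_result = re.findall(pattern, s)

# With DOTALL, '.' also matches '\n', so both pairs are found.
dotall_result = re.findall(pattern, s, flags=re.DOTALL)

print(default_result)  # [('2345', 'b5c6d7')]
print(dotall_result)   # [('1234', 'a1b2c3'), ('2345', 'b5c6d7')]
```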

