Reputation: 930
Given the string 1.blah blah2.yada yada I'd like to extract 1.blah blah and 2.yada yada. I tried \d\..+ but that matches the entire string, whereas \d\..+? matches only 1.b and 2.y. I just need to match the pattern lazily. Any ideas?
Upvotes: 1
Views: 170
Reputation: 627292
The .+ at the end of the pattern is greedy: it matches 1+ chars other than line break chars, as many as possible, up to the string/line end. With .+? at the end of the pattern, only 1 char is matched (but it is required), since +? is a lazy quantifier that needs just 1 char to be present and nothing after it in the pattern forces it to expand.
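To see the difference between the two attempts from the question, here is a small sketch that runs both patterns on the question's string (the variable names are only for illustration):

```python
import re

s = "1.blah blah2.yada yada"

# Greedy: .+ runs to the end of the line, so there is a single match.
greedy = re.findall(r"\d\..+", s)

# Lazy with nothing after it: .+? stops after the one char it must consume.
lazy = re.findall(r"\d\..+?", s)

print(greedy)  # ['1.blah blah2.yada yada']
print(lazy)    # ['1.b', '2.y']
```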
You may use
\d+\..*?(?=\d+\.|$)
See the regex demo. Add the re.DOTALL modifier if there can be line breaks inside the string.
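A quick sketch of why re.DOTALL matters here. The sample string with a line break inside the first item is made up for illustration, it is not from the question:

```python
import re

rx = r"\d+\..*?(?=\d+\.|$)"
# Hypothetical input: the first item contains a line break.
multiline = "1.blah\nblah2.yada yada"

# Without DOTALL, . cannot cross the newline, so the first item never
# reaches its lookahead and is dropped from the results.
without_dotall = re.findall(rx, multiline)

# With DOTALL, . also matches the newline and both items are found.
with_dotall = re.findall(rx, multiline, re.DOTALL)

print(without_dotall)  # ['2.yada yada']
print(with_dotall)     # ['1.blah\nblah', '2.yada yada']
```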
Details
\d+ - 1+ digits
\. - a dot
.*? - any 0+ chars other than line break chars (if re.DOTALL is used, even including line break chars), as few as possible, up to (but excluding) the first occurrence of...
(?=\d+\.|$) - a positive lookahead matching either of the two alternatives: 1+ digits and then a dot, or end of string.

Python demo:
import re
rx = r"\d+\..*?(?=\d+\.|$)"
s = "1.blah blah2.yada yada3.yadddaaa"
print(re.findall(rx, s))
# => ['1.blah blah', '2.yada yada', '3.yadddaaa']
See the Python demo.
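If the one-line pattern is hard to read, the same regex can be written with re.VERBOSE so that each part carries its own comment. This is just an alternative layout of the pattern above, not a different approach; the name rx_verbose is made up for the example:

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE.
# In verbose mode, whitespace and #-comments inside the pattern are ignored.
rx_verbose = re.compile(
    r"""
    \d+             # 1+ digits
    \.              # a literal dot
    .*?             # any 0+ chars except line breaks, as few as possible
    (?=\d+\.|$)     # stop right before the next "digits + dot" or at the end
    """,
    re.VERBOSE,
)

s = "1.blah blah2.yada yada3.yadddaaa"
items = rx_verbose.findall(s)
print(items)  # ['1.blah blah', '2.yada yada', '3.yadddaaa']
```

Combine the flags as re.VERBOSE | re.DOTALL if the items may span several lines.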
Upvotes: 1