Reputation: 367
I want to find patterns in a string, as follows:
a = "3. ablkdna 08. 15. adbvnksd 4."
The expected matches are:
match = "3. "
match = "4. "
I want to exclude runs that match the pattern
([0-9]+\.[\s]*){2,}
and find only the items that occur singly, not 08. and 15.
How should I implement this?
Upvotes: 0
Views: 62
Reputation: 22012
The following regex works for the two given examples:
import re
p = re.compile(r'(?<!\d\.\s)(?<!\d)\d+\.(?!\s*\d+\.)')
a = "3. ablkdna 08. 15. adbvnksd 4."
m = re.findall(p, a)
print(m)
# prints ['3.', '4.']
a = "3. (abc), adfb 8. 1. adfg 4. asdfasd"
m = re.findall(p, a)
print(m)
# prints ['3.', '4.']
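To show how the lookarounds cooperate, here is a sketch of the same pattern written with re.VERBOSE so that each part can carry a comment. The layout and comments are mine; the pattern itself is unchanged.

```python
import re

# Same pattern as above, split over several lines with re.VERBOSE.
p = re.compile(r'''
    (?<!\d\.\s)     # not preceded by: digit, dot, one whitespace (previous item of a run)
    (?<!\d)         # not starting in the middle of a number
    \d+\.           # the item itself: digits followed by a dot
    (?!\s*\d+\.)    # not followed by another item (optional whitespace, digits, dot)
''', re.VERBOSE)

a = "3. ablkdna 08. 15. adbvnksd 4."
print(p.findall(a))
# prints ['3.', '4.']
```

The lookahead drops the first item of a run and the first lookbehind drops the second, so only isolated items remain.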
Admittedly, the regex above is not complete, and there are many inputs for which it yields false positives.
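As one illustration of such a case (the input strings here are made up for the demo): Python's re module requires lookbehinds to be fixed-width, so (?<!\d\.\s) can only account for exactly one whitespace character. With two spaces, or none, between the items of a run, the second item slips through.

```python
import re

p = re.compile(r'(?<!\d\.\s)(?<!\d)\d+\.(?!\s*\d+\.)')

# Two spaces between the items: the lookbehind sees ".  " instead of "8. ".
print(p.findall("x 8.  1. y"))
# prints ['1.']

# No space at all: the lookbehind again does not see "digit, dot, whitespace".
print(p.findall("8.1."))
# prints ['1.']
```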
To write a complete regex that excludes an arbitrary pattern, we would need the absent operator (?~exp), which was introduced in Ruby 2.4.1 and is not available in Python as of now.
As an alternative, how about a two-step solution that first removes the runs of two or more items and then finds what is left:
m = re.findall(r'\d+\.\s*', re.sub(r'(\d+\.\s*){2,}', '', a))
This may not be elegant.
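For reference, a minimal runnable sketch of the two-step idea, wrapped in a helper. The name single_items is just an illustrative choice, not part of any library.

```python
import re

def single_items(text):
    # Step 1: delete every run of two or more consecutive "digits, dot" items.
    without_runs = re.sub(r'(\d+\.\s*){2,}', '', text)
    # Step 2: any numbered items still present were not part of a run.
    return re.findall(r'\d+\.\s*', without_runs)

print(single_items("3. ablkdna 08. 15. adbvnksd 4."))
# prints ['3. ', '4.']
print(single_items("3. (abc), adfb 8. 1. adfg 4. asdfasd"))
# prints ['3. ', '4. ']
```

Unlike the lookaround version, the matches here keep their trailing whitespace, and the amount of whitespace inside a run does not matter because \s* is not inside a lookbehind.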
Upvotes: 1