Reputation: 187
I have a string s = '10000', and using only Python's re.findall I need to count how many times the pattern 0\d0 occurs in s, including overlapping occurrences.
For example, for s = '10000' it should return 2.
Explanation: the first occurrence is the 000 at indices 1-3 (1[000]0), and the second is the 000 at indices 2-4 (10[000]).
I only need the number of occurrences; I'm not interested in the matched text itself.
I've tried the following regex statements:
re.findall(r'(0\d0)', s) #output: ['000']
re.findall(r'(0\d0)*', s) #output: ['', '', '000', '', '', '']
Finally, how can I make this regex generic, so that it matches any digit, then any digit (possibly the same one), then the first digit again?
Upvotes: 0
Views: 169
Reputation: 22837
As I mentioned in my comment, you can use the following pattern:
(?=(0\d0))
How it works:
(?=...) is a positive lookahead ensuring that what follows matches. It doesn't consume characters, so the engine can test for a match at every position in the string (a regex would otherwise resume matching after the consumed characters).
(0\d0) is a capture group matching 0, then any digit, then 0.
Your code becomes:
re.findall(r'(?=(0\d0))', s)
The result is:
['000', '000']
The Python documentation for re.findall states the following:
If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.
This means that the returned items are the contents of capture group 1 rather than the full match, as many would expect (the full match here is an empty string, because the lookahead is zero-width).
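To make the difference concrete, here is a minimal sketch comparing the plain pattern with the lookahead version on the string from the question. The variable names are illustrative, and re.finditer appears only to show where each match starts; the count itself needs nothing beyond re.findall and len.

```python
import re

s = '10000'

# Without the lookahead, the first match consumes indices 1-3,
# so the overlapping match starting at index 2 is never found.
plain = re.findall(r'0\d0', s)

# With the lookahead, each match is zero-width, so every start index is tested.
overlapping = re.findall(r'(?=(0\d0))', s)
starts = [m.start() for m in re.finditer(r'(?=(0\d0))', s)]

print(plain)             # ['000']
print(overlapping)       # ['000', '000']
print(starts)            # [1, 2]
print(len(overlapping))  # 2
```

Taking len() of the re.findall result gives the number of occurrences the question asks for.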
For the generic case (any digit, then any digit, then the first digit again), you can use the following pattern:
(\d)\d\1
How this works:
(\d) captures any digit into capture group 1.
\d matches any digit.
\1 is a backreference that matches the same text as most recently matched by capture group 1.
Your code becomes:
x = re.findall(r'(?=((\d)\d\2))', s)
print([n[0] for n in x])
Note: Wrapping the pattern in the lookahead's capture group makes (\d) the second capture group, so the backreference must change to \2 to match correctly. With two capture groups, re.findall returns tuples, as the documentation states, so a list comprehension that takes the first element of each tuple gives the expected result.
The result is:
['000', '000']
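If only the count matters, one possible simplification is to keep a single capture group inside the lookahead, so that re.findall returns plain strings instead of tuples. A minimal sketch, assuming a hypothetical helper named count_sandwiched (not part of the question or of the re module):

```python
import re

def count_sandwiched(s):
    """Count overlapping occurrences of: a digit, any digit, the first digit again."""
    # Group 1 sits inside the lookahead, so \1 refers to the first digit
    # and every position in s is tested without consuming characters.
    return len(re.findall(r'(?=(\d)\d\1)', s))

print(count_sandwiched('10000'))  # 2
print(count_sandwiched('12131'))  # 2
print(count_sandwiched('12345'))  # 0
```

Here re.findall returns the first digit of each occurrence rather than the full three-character text, which is enough when only the number of occurrences is needed.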
Upvotes: 2