Reputation: 33
I want to match a pattern in a string consisting only of digits, such as '2324235235980980'. The pattern is described below:
The pattern is '2-6-8-7-4'. It starts at 2 and transitions to 6. From 6 it can either self-loop or transition to 8. From 8 it can go back to 6 (so it may go back and forth between 6 and 8), self-loop, or transition to 7. The same applies to 7: it can self-loop or go back to 8, so a path like 7-8-6-8-7 can occur. Finally, 7 can reach 4, and once it reaches 4 the pattern is complete. If any other digit appears along the way, the pattern has to start from 2 again to be counted. I use
import re
re.findall(r'(2((6+8+)+)7)', test_string)
the output includes '2666686888668887', but I don't know the syntax for extending the pattern so that it ends at 4. Does anyone have an idea? Thanks a lot!
Upvotes: 3
Views: 222
Reputation: 18950
I think this is easier to achieve than it first appears:
2-followed-by-6-followed-by-6|8-followed-by-6|8|7-followed-by-4.
The only not-so-obvious part is making the pattern lazy.
Here is an even better pattern:
2-followed-by-(not7)6|8|(not6)7-followed-by-4.
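For comparison, here is a minimal sketch of one way to write the transitions from the question directly as a regex. It is not the pattern described above. It assumes the allowed moves are 2 to 6, 6 to 6 or 8, 8 to 8, 6 or 7, and 7 to 7, 8 or 4. That is my reading of the question, and the names `PATTERN` and `test_string` are only illustrative.

```python
import re

# Assumed transitions (my reading of the question):
#   2 -> 6;  6 -> 6 | 8;  8 -> 8 | 6 | 7;  7 -> 7 | 8 | 4 (done)
# After the first run of 8s, the walk may detour through 6s or 7s,
# but every detour has to come back to 8 before the final 7+4.
PATTERN = re.compile(r'26+8+(?:(?:6+|7+)8+)*7+4')

test_string = "2666686888668887748926874"  # sample input used elsewhere in this thread
matches = [m.group(0) for m in PATTERN.finditer(test_string)]
print(matches)  # ['266668688866888774', '26874']
```

Under these assumptions no lazy quantifier is needed, because every step is restricted to the digits allowed from the current state.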
Upvotes: 1
Reputation: 1
I'm not sure I understand what you need, but maybe this will work for you:
import re
string = "2666686888668887748926874"
index = [(m.start(0), m.end(0)) for m in re.finditer(r'2(6+8+)+7+\1?4', string)]
print(index)
Prints: [(0, 18), (20, 25)].
This is a list of tuples with the start and end index of every occurrence.
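One caveat, as far as I can tell: `\1?` only re-matches the exact text most recently captured by group 1, so it does not cover a general return from 7 to 8, such as 7-8-6-8-7. The sketch below compares that regex with an explicit transition-table scan. The table reflects my reading of the question, and `TRANSITIONS` and `find_paths` are hypothetical names, not anything from a library.

```python
import re

# Hypothetical transition table, read off the question's description:
# each key maps to the digits that may follow it.
TRANSITIONS = {'2': '6', '6': '68', '8': '867', '7': '784'}

def find_paths(text):
    """Return (start, end) spans of walks from 2 to 4 that follow TRANSITIONS."""
    spans = []
    start = None  # index of the 2 that opened the current walk, if any
    for i, ch in enumerate(text):
        if start is not None and ch in TRANSITIONS.get(text[i - 1], ''):
            if ch == '4':  # reached the end state: record the span and reset
                spans.append((start, i + 1))
                start = None
        elif ch == '2':
            start = i  # a 2 always begins a fresh walk
        else:
            start = None  # any other invalid step abandons the walk
    return spans

sample = "268786874"  # contains the 7-8-6-8-7 detour
regex_spans = [m.span() for m in re.finditer(r'2(6+8+)+7+\1?4', sample)]
print(regex_spans)         # []
print(find_paths(sample))  # [(0, 9)]
```

On the string from this answer, `find_paths("2666686888668887748926874")` gives the same spans as the regex, `[(0, 18), (20, 25)]`. The two only differ once the walk returns from 7 to 8.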
Upvotes: 0