Reputation: 11
I have a text example like
0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87
I want to detect the consecutive sequences of tokens that start with 0s.
So, the expected output should be 0s11 0s12 0s33 and 0sgfh 0s1 0s22 0s87.
I tried using the regex
(0s\w+)
but that detects each of 0s11, 0s12, 0s33, etc. individually.
Any idea on how to modify the pattern?
Upvotes: 0
Views: 60
Reputation: 163207
To get those 2 matches where there are at least 2 consecutive parts:
\b0s\w+(?:\s+0s\w+)+
Explanation
\b
A word boundary to prevent a partial word match
0s\w+
Match 0s and 1+ word chars
(?:\s+0s\w+)+
Repeat 1 or more times: whitespace chars followed by 0s and 1+ word chars
If you also want to match a single occurrence:
\b0s\w+(?:\s+0s\w+)*
Note that \w+ matches 1 or more word characters, so it would not match 0s on its own.
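As a quick check, here is a minimal sketch of both patterns in Python using re.findall(). The variable names and the second test string are made up for illustration. Because the group is non-capturing, findall() returns each whole run rather than only the last repeated part.

```python
import re

text = "0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87"

# Runs of two or more consecutive tokens that start with 0s
runs = re.findall(r"\b0s\w+(?:\s+0s\w+)+", text)
print(runs)  # ['0s11 0s12 0s33', '0sgfh 0s1 0s22 0s87']

# Variant with * instead of +: a lone 0s token also counts as a match
# (the input string here is a hypothetical example, not from the question)
runs_or_single = re.findall(r"\b0s\w+(?:\s+0s\w+)*", "x 0s5 y 0s1 0s2")
print(runs_or_single)  # ['0s5', '0s1 0s2']
```

If you need the individual tokens of a run afterwards, calling split() on each match should be enough.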
Upvotes: 1
Reputation: 5963
Should be doable with re.findall(). Your pattern was correct! :)
import re
testString = "0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87"
# Raw string, and \w+ (not \w) so that each whole token is matched
print(re.findall(r'0s\w+', testString))
['0s11', '0s12', '0s33', '0sgfh', '0s1', '0s22', '0s87']
Hope this helps!
Upvotes: 0