Reputation: 515
I am trying to use re to find every sequence of digits that is followed by certain keywords.
string = " 12390 total income stated in red followed by 567 total income stated in blue."
pattern = re.match(r"\s*\d{1,2}\s* total income", string)
I tried this pattern, but it does not work. In the end I want to get these results: "12390 total income" and "567 total income".
Upvotes: 0
Views: 300
Reputation: 5302
If the number of spaces between the number and total income varies (none, one, two, and so on), match them with \s* inside a non-capturing group.
Suppose the string is:
string = '12390total income stated in red followed by 567 total income stated in blue.'
Then try the following:
myresult = re.findall(r"\d+(?:\s*?total income)", string)
This extracts:
['12390total income', '567 total income']
Then use replace to remove any extra spaces.
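A minimal sketch of that clean-up step, under some assumptions: the sample string below is hypothetical (it has three spaces before the second "total income"), and re.sub is swapped in for str.replace because plain replace cannot insert a space that is missing, as in "12390total". The names matches and normalized are illustrative only.

```python
import re

# Hypothetical input: no space after 12390, three spaces after 567.
string = "12390total income stated in red followed by 567   total income stated in blue."

# Accept any amount of whitespace (including none) after the digits.
matches = re.findall(r"\d+\s*total income", string)

# Rewrite each match so exactly one space follows the number.
normalized = [re.sub(r"(\d+)\s*", r"\1 ", m, count=1) for m in matches]

print(normalized)  # ['12390 total income', '567 total income']
```

Capturing the digits and writing them back with a single space handles both the missing-space and the extra-space case in one substitution.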
Upvotes: 0
Reputation: 174844
You need to use re.findall and change the pattern \d{1,2} to \d+ (one or more digit characters), since \d{1,2} matches a minimum of 1 and a maximum of 2 digits only.
result = re.findall(r"\d+ total income",string)
Note that match only tries to match at the beginning of the string, whereas findall scans the whole string and returns every non-overlapping match.
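To make the difference concrete, here is a small sketch run against the string from the question. The variable names first, partial and found are illustrative only.

```python
import re

string = " 12390 total income stated in red followed by 567 total income stated in blue."

# re.match is anchored at position 0, and \d{1,2} cannot cover all five
# digits of 12390, so the original attempt finds nothing.
first = re.match(r"\s*\d{1,2}\s* total income", string)
print(first)  # None

# Switching to findall alone is not enough: \d{1,2} only keeps the last
# two digits before " total income".
partial = re.findall(r"\d{1,2} total income", string)
print(partial)  # ['90 total income', '67 total income']

# \d+ takes the whole run of digits, and findall scans the entire string.
found = re.findall(r"\d+ total income", string)
print(found)  # ['12390 total income', '567 total income']
```

Both changes are needed: findall to search past the start of the string, and \d+ to take the full number.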
Upvotes: 3