Reputation: 63994
I have the following string:
myst="Cluster 2 0 13aa,>FZGRY:07872:11201...*1 13aa,>FZGRY:08793:13012...at100.00%2 13aa,>FZGRY:04065:08067...at100.00%"
What I want to do is extract the content bounded by > and ... into a list,
yielding:
['FZGRY:07872:11201','FZGRY:08793:13012', 'FZGRY:04065:08067']
But why doesn't this line do the job?
import re
mem = re.findall(">(.*)\.\.\.",myst)
mem
What's the right way to do it?
Upvotes: 0
Views: 39
Reputation: 26667
You can use lookarounds to do this.
>>> re.findall(r'(?<=>)[^.]+(?=[.]{3})', myst)
['FZGRY:07872:11201', 'FZGRY:08793:13012', 'FZGRY:04065:08067']
Regex
(?<=>)
Positive lookbehind. Checks that the match is preceded by >
[^.]+
Matches one or more characters other than . (the + means one or more).
(?=[.]{3})
Positive lookahead. Checks that the match is followed by ...
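For reference, here is the lookaround pattern as a self-contained script, a minimal sketch that only assumes the myst string from the question (the names pattern and ids are just illustrative):

```python
import re

# The string from the question, split across two literals for readability.
myst = ("Cluster 2 0 13aa,>FZGRY:07872:11201...*1 13aa,"
        ">FZGRY:08793:13012...at100.00%2 13aa,>FZGRY:04065:08067...at100.00%")

# (?<=>)      the match must be preceded by '>'
# [^.]+       one or more characters that are not '.'
# (?=[.]{3})  the match must be followed by three dots
pattern = re.compile(r"(?<=>)[^.]+(?=[.]{3})")

# With no capture groups, findall returns the whole matches.
ids = pattern.findall(myst)
print(ids)
```

Because [^.]+ cannot cross a dot, this version would stop matching an ID that itself contains a . character; that does not happen in the sample string, but it is worth keeping in mind for other inputs.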
What is wrong with your regex?
>(.*)\.\.\.
Here the .* is greedy and will try to match as much as possible, so a single match runs from the first > all the way to the last ... in the string. Add a ? after the * to make it non-greedy.
>>> re.findall(r">(.*?)\.\.\.", myst)
['FZGRY:07872:11201', 'FZGRY:08793:13012', 'FZGRY:04065:08067']
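To see the difference side by side, here is a small sketch that runs both the greedy and the non-greedy pattern on the string from the question (the variable names greedy and lazy are just illustrative):

```python
import re

# The string from the question, split across two literals for readability.
myst = ("Cluster 2 0 13aa,>FZGRY:07872:11201...*1 13aa,"
        ">FZGRY:08793:13012...at100.00%2 13aa,>FZGRY:04065:08067...at100.00%")

# Greedy: .* grabs as much as it can, then backtracks to the LAST "...".
greedy = re.findall(r">(.*)\.\.\.", myst)

# Non-greedy: .*? stops at the FIRST "..." after each '>'.
lazy = re.findall(r">(.*?)\.\.\.", myst)

print(len(greedy))  # a single long match
print(lazy)         # one entry per ID
```

The greedy version returns a single element that starts at the first ID and ends at the last one, with everything in between included, which is why the original line did not give three separate IDs.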
Upvotes: 3