Reputation: 559
I'm new to Python and still learning about regular expressions, so this question may sound trivial to regex experts, but here goes. I suppose my question is a generalization of this question about finding a string between two strings. What if this pattern (initial_substring + substring_to_find + end_substring) is repeated many times in a long string? For example:
import re
test = 'someth1 var="this" someth2 var="that" '
result = re.search('var=(.*) ', test)
print(result.group(1))
>>> "this" someth2 var="that"
Instead, I'd like to get a list like ["this","that"].
How can I do it?
Upvotes: 6
Views: 17024
Reputation: 25779
Use re.findall():
result = re.findall(r'var="(.*?)"', test)
print(result) # ['this', 'that']
If a quoted value in the test string can itself span multiple lines, use the re.DOTALL flag so that . also matches newline characters:
re.findall(r'var="(.*?)"', test, re.DOTALL)
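To make the difference concrete, here is a small sketch comparing the greedy search from the question with the non-greedy findall, and showing what re.DOTALL changes. The multiline string is a made-up input for illustration, not something from the question.

```python
import re

test = 'someth1 var="this" someth2 var="that" '

# Greedy: .* runs to the last space in the string, so one long match comes back.
greedy = re.search(r'var=(.*) ', test).group(1)

# Non-greedy with explicit quotes: each quoted value is matched separately.
values = re.findall(r'var="(.*?)"', test)

# Hypothetical input in which one quoted value contains a newline.
multiline = 'a var="first\nline" b var="second"'

# Without the flag, . cannot cross the newline, so the first value is skipped.
without_flag = re.findall(r'var="(.*?)"', multiline)

# With re.DOTALL, . also matches the newline and both values are found.
with_flag = re.findall(r'var="(.*?)"', multiline, re.DOTALL)

print(greedy)
print(values)
print(without_flag)
print(with_flag)
```

Note that findall scans the whole string either way; the flag only matters when the text between the quotes contains a newline.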
Upvotes: 10
Reputation: 6360
The problem with your current regex is that the capture group (.*) is greedy: after the first var= in your string, it consumes as much as it can, up to the last space. If you instead restrict the capture group to word and whitespace characters, as in var="([\w\s]+)", the match can no longer run past the closing quote. Combined with re.findall(), which returns every match rather than only the first, that line of Python becomes:
result = re.findall(r'var="([\w\s]+)"', test)
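As a rough sketch of how this restricted pattern behaves, the snippet below runs it on the string from the question and on a made-up sample containing punctuation. The negated character class [^"]* at the end is my own suggested variant, not part of the answer above; it accepts anything except a double quote.

```python
import re

test = 'someth1 var="this" someth2 var="that" '
result = re.findall(r'var="([\w\s]+)"', test)

# Hypothetical input: the second value contains a hyphen.
sample = 'var="two words" var="a-b"'

# [\w\s]+ only allows word characters and whitespace, so "a-b" is not matched.
restricted = re.findall(r'var="([\w\s]+)"', sample)

# Assumed alternative: allow any character except the closing quote.
negated = re.findall(r'var="([^"]*)"', sample)

print(result)
print(restricted)
print(negated)
```

Which pattern fits depends on what the values may contain: the restricted class silently skips values with punctuation, while the negated class also accepts empty values.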
Upvotes: 1