Nonancourt

Reputation: 559

Python: Find a string between two strings, repeatedly

I'm new to Python and still learning regular expressions, so this question may seem trivial to regex experts, but here goes. My question is a generalization of this question about finding a string between two strings. What if the pattern (initial_substring + substring_to_find + end_substring) is repeated many times in a long string? For example:

import re

test = 'someth1 var="this" someth2 var="that" '
result = re.search('var=(.*) ', test)
print(result.group(1))
>>> "this" someth2 var="that"

Instead, I'd like to get a list like ["this","that"]. How can I do it?

Upvotes: 6

Views: 17024

Answers (2)

zwer

Reputation: 25779

Use re.findall():

result = re.findall(r'var="(.*?)"', test)
print(result)  # ['this', 'that']

If a value between the quotes can span multiple lines, add the re.DOTALL flag, because by default . does not match a newline.

re.findall(r'var="(.*?)"', test, re.DOTALL)
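To illustrate why the flag matters, here is a minimal sketch. The string `multiline` is made up for this example and does not come from the question; it simply puts a newline inside the second value.

```python
import re

# Hypothetical input (an assumption): the second value contains a newline.
multiline = 'someth1 var="this" someth2 var="that\nother" '

# Without re.DOTALL, "." does not match "\n", so the second value is skipped.
without_flag = re.findall(r'var="(.*?)"', multiline)

# With re.DOTALL, "." also matches "\n", so both values are found.
with_flag = re.findall(r'var="(.*?)"', multiline, re.DOTALL)

print(without_flag)  # ['this']
print(with_flag)     # ['this', 'that\nother']
```

If the values never contain newlines, the flag changes nothing and can be left out.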

Upvotes: 10

m_callens

Reputation: 6360

The problem with your current regex is that the capture group (.*) is greedy: it matches as many characters as it can. After the first var= in your string, the group captures everything up to the last space.

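If you instead narrow the expression to var="([\w\s]+)", which accepts only word characters and whitespace between the quotes, the match can no longer run past the closing quote. Combined with re.findall() to collect every occurrence, that line of Python becomes: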

result = re.findall(r'var="([\w\s]+)"', test)
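As a rough comparison, the sketch below runs the question's greedy pattern and the character-class pattern side by side. The negated class `[^"]*` and the string `punctuated` are assumptions added for illustration, not part of this answer; they show one possible option when values contain punctuation that `[\w\s]` rejects.

```python
import re

test = 'someth1 var="this" someth2 var="that" '

# Greedy pattern from the question: ".*" runs to the last space in the string.
greedy = re.search(r'var=(.*) ', test).group(1)

# Character class from this answer: only word characters and whitespace.
by_class = re.findall(r'var="([\w\s]+)"', test)

# Assumption (not in the answer): values with punctuation, made up for the demo.
punctuated = 'var="a-b" var="c.d"'

# [\w\s] rejects "-" and ".", so nothing matches here.
missed = re.findall(r'var="([\w\s]+)"', punctuated)

# A negated class accepts any character except the closing quote.
by_negation = re.findall(r'var="([^"]*)"', punctuated)

print(greedy)       # "this" someth2 var="that"
print(by_class)     # ['this', 'that']
print(missed)       # []
print(by_negation)  # ['a-b', 'c.d']
```

Which class to use depends on what characters the values may contain; the stricter `[\w\s]+` also refuses empty values, while `[^"]*` allows them.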

Upvotes: 1
