Reputation: 4100
I have a string like:
text = "Why do Humans need to eat food? Humans eat food to survive."
I want to capture everything between Human and food, but only the first time it occurs.
Expected Output
Humans need to eat food
My Regex:
p =r'(\bHumans?\b.*?\bFoods?\b)'
Python Code:
re.findall(p, text, re.I|re.M|re.DOTALL)
The code correctly captures the string between Human and Food, but it doesn't stop at the first match.
Research:
I have read that to make it non-greedy I need to add ?, but I can't figure out where to put it. None of the permutations and combinations I tried stopped it at the first match.
Update
I am writing a lot of regexes to capture various other entities like this and parsing them in one shot, so I can't change my re.findall logic.
Upvotes: 1
Views: 75
Reputation: 6590
Try this:
>>> import re
>>> text = "Why do Humans need to eat food? Humans eat food to survive."
>>> re.search(r'Humans.*?food', text).group() # you want the all powerful non-greedy '?' :)
'Humans need to eat food'
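To make the role of the ? concrete, here is a small sketch comparing the greedy and non-greedy forms on the question's text. It also shows why the non-greedy quantifier alone does not help with findall: it limits how far a single match extends, not how many matches are returned. The variable names are only illustrative.

```python
import re

text = "Why do Humans need to eat food? Humans eat food to survive."

# Greedy: .* runs as far as possible, up to the LAST "food" in the text.
greedy = re.search(r'Humans.*food', text).group()

# Non-greedy: .*? stops at the FIRST "food" after the start of the match.
lazy = re.search(r'Humans.*?food', text).group()

# findall returns every non-overlapping match, so the non-greedy
# pattern still produces one result per "Humans ... food" span.
all_lazy = re.findall(r'Humans.*?food', text)

print(greedy)    # Humans need to eat food? Humans eat food
print(lazy)      # Humans need to eat food
print(all_lazy)  # ['Humans need to eat food', 'Humans eat food']
```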
Upvotes: 1
Reputation: 18357
For finding only the first match, Toto's answer is best, but since you said you need to use findall only, you can just append .* to the end of your regex. It matches the remaining text, so no further matches are found.
(\bHumans?\b.*?\bFoods?\b).*
The trailing .* eats the remaining part of your text, so there won't be any further matches.
Sample Python code:
import re
text = "Why do Humans need to eat food? Humans eat food to survive."
p =r'(\bHumans?\b.*?\bFoods?\b).*'
print(re.findall(p, text, re.I|re.M|re.DOTALL))
Prints:
['Humans need to eat food']
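Since the question mentions running many regexes in one shot through findall, the same trick could be applied to each pattern in a loop. This is only a sketch: the dictionary patterns, the helper first_only and the second pattern are hypothetical, and it assumes every pattern is wrapped in a single capturing group and is used with re.DOTALL, as in the question.

```python
import re

# Flags taken from the question.
FLAGS = re.I | re.M | re.DOTALL

# Hypothetical entity patterns; only the first one comes from the question.
patterns = {
    "human_food": r'(\bHumans?\b.*?\bFoods?\b)',
    "eat_survive": r'(\beat\b.*?\bsurvive\b)',
}

def first_only(pattern):
    """Append .* so a single match consumes the rest of the text."""
    return pattern + r'.*'

text = "Why do Humans need to eat food? Humans eat food to survive."

# findall is still used for every pattern; each list has at most one item.
results = {
    name: re.findall(first_only(p), text, FLAGS)
    for name, p in patterns.items()
}

print(results["human_food"])   # ['Humans need to eat food']
print(results["eat_survive"])  # ['eat food? Humans eat food to survive']
```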
Upvotes: 3
Reputation: 91385
Use search instead of findall:
import re
text = "Why do Humans need to eat food? Humans eat food to survive."
p =r'(\bHumans?\b.*?\bFoods?\b)'
res = re.search(p, text, re.I|re.M|re.DOTALL)
print(res.groups())
Output:
('Humans need to eat food',)
Or add .* at the end of the regex:
import re
text = "Why do Humans need to eat food? Humans eat food to survive."
p =r'(\bHumans?\b.*?\bFoods?\b).*'
# here ________________________^^
res = re.findall(p, text, re.I|re.M|re.DOTALL)
print(res)
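One caveat worth checking: the trailing .* only swallows the whole remainder because re.DOTALL lets . match newlines. The sketch below uses a made-up two-line input (not from the question) to show the difference.

```python
import re

# Hypothetical two-line input, used only to illustrate the flag's effect.
text = "Humans need food.\nHumans eat food to survive."
p = r'(\bHumans?\b.*?\bFoods?\b).*'

# Without DOTALL, .* stops at the end of the first line,
# so findall finds another match on the second line.
without_dotall = re.findall(p, text, re.I | re.M)

# With DOTALL, .* crosses the newline and consumes everything that is left.
with_dotall = re.findall(p, text, re.I | re.M | re.DOTALL)

print(without_dotall)  # ['Humans need food', 'Humans eat food']
print(with_dotall)     # ['Humans need food']
```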
Upvotes: 5