Reputation: 97
Here is a sentence: "a building is 100 m tall and 20 m wide". I want to extract the height, which is 100, so I use:
import re
question = input(" ")  # read the sentence from the user
height = re.findall(r'(\d+) m tall', question)
However, sometimes the sentence says "100 m high" instead of "100 m tall", and in that case my program no longer extracts the number I want. Is there a way to make it work whether the sentence contains "tall" or "high"?
Upvotes: 1
Views: 114
Reputation: 547
As per your requirement, the regular expression should match either of the terms 'tall' or 'high',
i.e., (?:tall|high)
where (?:...) is a non-capturing group: it groups the alternatives without adding them to the result,
and | means 'or'.
So, the solution can be:
>>> re.findall(r'(\d+) m (?:tall|high)', question)
['100']
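If you only need the first height as a number rather than a list of strings, the same pattern can be wrapped in a small helper. This is a minimal sketch: the name `extract_height` and the choice to return `None` when nothing matches are assumptions for illustration, not part of the question.

```python
import re

# Same pattern as above, compiled once. The (?:...) group is non-capturing,
# so group 1 is the number.
HEIGHT_RE = re.compile(r'(\d+) m (?:tall|high)')

def extract_height(sentence):
    """Return the first height in metres as an int, or None if there is none.

    Hypothetical helper; returning None on no match is an assumption.
    """
    match = HEIGHT_RE.search(sentence)
    return int(match.group(1)) if match else None

print(extract_height("a building is 100 m tall and 20 m wide"))  # 100
print(extract_height("a building is 100 m high and 20 m wide"))  # 100
print(extract_height("a building is 20 m wide"))                 # None
```

`re.search` stops at the first match, which is enough when each sentence mentions one height; `re.findall` is still the better fit if a sentence can contain several.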
Upvotes: 0
Reputation: 9745
>>> import re
>>> re.findall(r'(\d+) m (?:tall|high)', "a building is 100 m tall and 20 m wide")
['100']
>>> re.findall(r'(\d+) m (?:tall|high)', "a building is 100 m high and 20 m wide")
['100']
Upvotes: 1
Reputation: 473873
You can check the "tall or high" condition via |:
(\d+) m (tall|high)
Demo:
>>> re.findall(r'(\d+) m (tall|high)', 'a building is 100 m tall and 20 m wide')
[('100', 'tall')]
>>> re.findall(r'(\d+) m (tall|high)', 'a building is 100 m high and 20 m wide')
[('100', 'high')]
If you don't want the word to be captured, use a non-capturing group:
(\d+) m (?:tall|high)
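To see the two variants side by side, here is a short sketch run on one of the sentences from the demo. The variable names are only illustrative.

```python
import re

sentence = 'a building is 100 m high and 20 m wide'

# Two capturing groups: findall returns one tuple per match.
capturing = re.findall(r'(\d+) m (tall|high)', sentence)

# One capturing group plus a non-capturing group: findall returns plain strings.
non_capturing = re.findall(r'(\d+) m (?:tall|high)', sentence)

# If you keep the capturing version, take the first item of each tuple.
heights_from_tuples = [number for number, _word in capturing]

print(capturing)            # [('100', 'high')]
print(non_capturing)        # ['100']
print(heights_from_tuples)  # ['100']
```

The capturing form is useful if you also want to know which word matched; otherwise the non-capturing form gives the numbers directly.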
Upvotes: 4