Kevin Guo

Reputation: 97

How to extract a number from a sentence in Python

Here is a sentence: "a building is 100 m tall and 20 m wide". I want to extract the number for the height, which is 100, so I use:

import re
question = input("  ")
height = re.findall(r'(\d+) m tall', question)

However, sometimes the sentence says "100 m high" instead of "100 m tall". In that case my program can no longer extract the number I want. Is there a way to improve my program so that it works whether the sentence includes "tall" or "high"?

Upvotes: 1

Views: 114

Answers (3)

Niraj

Reputation: 547

As per your requirement, the regular expression should match either of the terms 'tall' or 'high'.

         i.e.,  (?:tall|high)
        where,  (?: ... ) is a non-capturing group: it groups the alternatives without adding them to the result
                and,     | means 'or'

So the solution could look like this:

>>> re.findall(r'(\d+) m (?:tall|high)', question)
['100']
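
The parentheses around the alternatives matter. As a small illustration (the variable names `sentence`, `ungrouped`, and `grouped` are only for this sketch), compare what happens when `|` is left ungrouped: it then splits the whole pattern into "`(\d+) m tall`" or "`high`", so on a "high" sentence the match succeeds but the number is never captured.

```python
import re

sentence = "a building is 100 m high and 20 m wide"

# Without a group, | splits the entire pattern: "(\d+) m tall" OR "high".
# Here only the bare word "high" matches, so the capture group stays empty.
ungrouped = re.findall(r'(\d+) m tall|high', sentence)

# With a non-capturing group, | applies only to the two words.
grouped = re.findall(r'(\d+) m (?:tall|high)', sentence)

print(ungrouped)  # ['']
print(grouped)    # ['100']
```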

Upvotes: 0

Fomalhaut

Reputation: 9745

>>> import re
>>> re.findall(r'(\d+) m (?:tall|high)', "a building is 100 m tall and 20 m wide")
['100']
>>> re.findall(r'(\d+) m (?:tall|high)', "a building is 100 m high and 20 m wide")
['100']

Upvotes: 1

alecxe

Reputation: 473873

You can check the "tall or high" condition via |:

(\d+) m (tall|high)

Demo:

>>> re.findall(r'(\d+) m (tall|high)', 'a building is 100 m tall and 20 m wide')
[('100', 'tall')]
>>> re.findall(r'(\d+) m (tall|high)', 'a building is 100 m high and 20 m wide')
[('100', 'high')]

If you don't want the word to be captured, use a non-capturing group:

(\d+) m (?:tall|high)
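
If only a single height is expected per sentence, one possible way to package this is a small helper that returns an integer. This is just a sketch: `extract_height` and `HEIGHT_PATTERN` are hypothetical names, and it assumes the height is a whole number followed by " m tall" or " m high".

```python
import re

# Hypothetical names; the pattern itself is the non-capturing version above.
HEIGHT_PATTERN = re.compile(r'(\d+) m (?:tall|high)')

def extract_height(sentence):
    """Return the first height found as an int, or None if there is none."""
    match = HEIGHT_PATTERN.search(sentence)
    return int(match.group(1)) if match else None

print(extract_height("a building is 100 m tall and 20 m wide"))  # 100
print(extract_height("a building is 100 m high and 20 m wide"))  # 100
print(extract_height("a building is 20 m wide"))                 # None
```

Using `re.search` instead of `re.findall` stops at the first match, which avoids having to index into a list when just one value is wanted.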

Upvotes: 4
