user3238319

Reputation: 51

Use regular expressions to parse counted words

Use regular expressions to find strings of the form:

<count> <longword>

e.g. 101 Dalmations.

More specifically, the match must follow these conditions:

For example:

parse_counted_words('5 watermelons, 13 pineapples, and 1 papaya.') should return ('13', 'pineapples')
parse_counted_words('101 dalmations!') should return ('101', 'dalmations')
parse_counted_words('snow white and the 7 dwarves') should return ('7', 'dwarves')
parse_counted_words('goldilocks and the 3 little pigs') should return None, because 'little' has fewer than 7 characters
parse_counted_words('678 1234567 890') should return None, because the word following the count does not consist of alphabetic characters

Here is what I wrote:

def parse_counted_words(s):
    m=re.findall(r'\s*\d+\s\w{7,}',s)
    if len(m)==0:
        return None
    elif len(m)>1:
        return m[1]
    else:
        m[0].split

Upvotes: 0

Views: 60

Answers (1)

Casimir et Hippolyte

Reputation: 89584

You can use this:

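import re

s = '5 watermelons, 13 pineapples, and 1 papaya.'

def parse_counted_words(s):
    # \b rather than (?<=\s), so a count at the very start of the string still matches.
    # [A-Za-z] rather than \w, so a run of digits is not accepted as the word.
    m = re.findall(r'\b\d+\s[A-Za-z]{7,}', s)
    if len(m) == 0:
        return None
    else:
        # Return the last match as a (count, word) tuple.
        return tuple(m[-1].split())

print(parse_counted_words(s))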
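
For what it's worth, here is a minimal sketch that checks a pattern against all five examples from the question. It is not the only way to do this. The pattern assumes that "alphabetic" means ASCII letters only and that the count and the word are separated by exactly one whitespace character. The names COUNTED_WORD and examples are made up for illustration.

```python
import re

# Assumed pattern: digits, one whitespace character, then 7+ ASCII letters.
# The two capture groups make findall return (count, word) tuples.
COUNTED_WORD = re.compile(r'\b(\d+)\s([A-Za-z]{7,})')

def parse_counted_words(s):
    matches = COUNTED_WORD.findall(s)
    # The first example in the question expects the last match, not the first.
    return matches[-1] if matches else None

# Examples copied from the question, paired with their expected results.
examples = [
    ('5 watermelons, 13 pineapples, and 1 papaya.', ('13', 'pineapples')),
    ('101 dalmations!', ('101', 'dalmations')),
    ('snow white and the 7 dwarves', ('7', 'dwarves')),
    ('goldilocks and the 3 little pigs', None),
    ('678 1234567 890', None),
]

for text, expected in examples:
    assert parse_counted_words(text) == expected, text
```

Using capture groups avoids the split step, because findall already returns each match as a tuple of strings.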

Upvotes: 1
