Reputation: 51
Use regular expressions to find strings of the form:
<count> <longword>
e.g. 101 Dalmations.
More specifically, the match must follow these conditions:
(word1) is a count consisting of a natural number (a series of one or more digits, with leading and trailing whitespace)
(word2) is a long word consisting of at least 7 alphabetic characters
The function should return the last (word1, word2) pair found in the string, if one exists, or None if no such strings were found.
For example:
parse_counted_words('5 watermelons, 13 pineapples, and 1 papaya.') should return ('13', 'pineapples')
parse_counted_words('101 dalmations!') should return ('101', 'dalmations')
parse_counted_words('snow white and the 7 dwarves') should return ('7', 'dwarves')
parse_counted_words('goldilocks and the 3 little pigs') should return None, because 'little' has fewer than 7 characters
parse_counted_words('678 1234567 890') should return None, because the word following the count does not consist of alphabetic characters
Here is what I wrote:
def parse_counted_words(s):
    m=re.findall(r'\s*\d+\s\w{7,}',s)
    if len(m)==0:
        return None
    elif len(m)>1:
        return m[1]
    else:
        m[0].split
Upvotes: 0
Views: 60
Reputation: 89584
You can use this:
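import re

s = r'5 watermelons, 13 pineapples, and 1 papaya.'

def parse_counted_words(s):
    # (?<!\S) lets the count sit at the start of the string or after whitespace;
    # [A-Za-z]{7,}\b keeps digits and underscores out of the long word.
    m = re.findall(r'(?<!\S)\d+\s+[A-Za-z]{7,}\b', s)
    if len(m) == 0:
        return None
    else:
        # the last match is the one to return, as a (count, word) tuple
        return tuple(m[-1].split())

print(parse_counted_words(s))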
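As a variant of the function above, capturing the count and the word in separate groups makes re.findall return tuples directly, so no split is needed. This is only a sketch under a few assumptions that the question does not state outright: the count may sit at the very start of the string (as in '101 dalmations!'), the gap between count and word may be more than one whitespace character, and a "long word" means 7 or more ASCII letters ending at a word boundary. The name COUNTED_WORD is made up for this example.

```python
import re

# Assumed pattern: a count at the start of the string or after whitespace,
# then whitespace, then 7+ ASCII letters ending at a word boundary.
COUNTED_WORD = re.compile(r'(?<!\S)(\d+)\s+([A-Za-z]{7,})\b')

def parse_counted_words(s):
    """Return the last (count, word) pair found in s, or None if there is none."""
    pairs = COUNTED_WORD.findall(s)  # list of (count, word) tuples
    return pairs[-1] if pairs else None

# The five examples from the question
examples = [
    '5 watermelons, 13 pineapples, and 1 papaya.',
    '101 dalmations!',
    'snow white and the 7 dwarves',
    'goldilocks and the 3 little pigs',
    '678 1234567 890',
]
for text in examples:
    print(repr(text), '->', parse_counted_words(text))
```

With two groups in the pattern, findall yields one tuple per match, so taking the last element gives the last pair in the string. If non-ASCII letters should count as alphabetic, the character class would need to change.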
Upvotes: 1