Reputation: 1
I have this string
tw =('BenSasse, well I did teach her the bend-and-snap https://twitter.com/bethanyshondark/status/903301101855928322 QT @bethanyshondark Is Reese channeling @BenSasse https://acculturated.com/reese-witherspoons-daughter-something-many-celebrity-children-lack-work-ethic/ , Twitter for Android')
I need to create a list with all the words that have more than 3 vowels. Please help!
Upvotes: 0
Views: 67
Reputation: 747
I would suggest you start with creating a list of all vowels:
vowels = ['a','e','i','o','u']
Well, a list of letters (Char) is really the same as a string, so I would just do the following:
vowels = "aeiou"
After that I would attempt to split your string into words. Let's try tw.split()
like Joran Beasley suggested. It returns:
['BenSasse,', 'well', 'I', 'did', 'teach', 'her', 'the', 'bend-and-snap', 'https://twitter.com/bethanyshondark/status/903301101855928322', 'QT', '@bethanyshondark', 'Is', 'Reese', 'channeling', '@BenSasse', 'https://acculturated.com/reese-witherspoons-daughter-something-many-celebrity-children-lack-work-ethic/', ',', 'Twitter', 'for', 'Android']
Are you fine with this being your "words"? Notice that each link is a "word". I'm going to assume that this is fine.
Ok, so if we access each word with a for-loop, we can then access each letter with an inner for-loop. But before we start, we need to keep track of all the accepted words with 3 or more vowels, so make a new list: final_list = list()
. Now:
for word in tw.split():
counter=0 # Let's keep track of how many vowels we have in a word
for letter in word:
if letter in vowels:
counter = counter+1
if counter >= 3:
final_list.append(word) # Add the word if 3 or more vowels.
If you now do a print: print(final_list)
you should get:
['BenSasse,', 'bend-and-snap', 'https://twitter.com/bethanyshondark/status/903301101855928322', '@bethanyshondark', 'Reese', 'channeling', '@BenSasse', 'https://acculturated.com/reese-witherspoons-daughter-something-many-celebrity-children-lack-work-ethic/']
Upvotes: 1
Reputation: 106881
You can use re.findall
with the following regex:
import re
re.findall(r'(?:[a-z-]*[aeiou]){3,}[a-z-]*', tw, flags=re.IGNORECASE)
This returns:
['BenSasse', 'bend-and-snap', 'bethanyshondark', 'bethanyshondark', 'Reese', 'channeling', 'BenSasse', 'acculturated', 'reese-witherspoons-daughter-something-many-celebrity-children-lack-work-ethic', 'Android']
Upvotes: 1