Brando

Reputation: 25

Trying to write only the words with two or more vowels to a text file

import re
twovowels=re.compile(r".*[aeiou].*[aeiou].*", re.I)
nonword=re.compile(r"\W+", re.U)
text_file = open("twoVoweledWordList.txt", "w")
file = open("FirstMondayArticle.html","r")
for line in file:
    for word in nonword.split(line):
        if twovowels.match(word): print word
        text_file.write('\n' + word)
text_file.close()

file.close()

This is my Python code. I am trying to write only the words that have two or more vowels to a text file. When I run it, everything ends up in the text file, including words and numbers that have no vowels, but the Python shell shows only the words that have two or more vowels. How do I fix that?

Upvotes: 2

Views: 100

Answers (3)

Padraic Cunningham

Reputation: 180471

You can remove the vowels with str.translate and compare lengths. If removing the vowels shortens the word by more than one character, the word has at least two vowels:

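with open("FirstMondayArticle.html") as f, open("twoVoweledWordList.txt", "w") as out:
    for line in f:  # iterate over the opened input file
        for word in line.split():
            # Python 2 str.translate(None, chars) deletes chars; the length drop is the vowel count
            if len(word) - len(word.lower().translate(None, "aeiou")) > 1:
                out.write("{}\n".format(word.rstrip()))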
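
On Python 3, str.translate takes a single table argument, so the translate(None, "aeiou") form does not work there. A minimal sketch of the same length comparison, assuming Python 3 (has_two_vowels and VOWEL_DELETER are made-up names for illustration):

```python
# Python 3 sketch; has_two_vowels and VOWEL_DELETER are hypothetical names.
VOWEL_DELETER = str.maketrans("", "", "aeiou")  # table that deletes the five vowels


def has_two_vowels(word):
    """Return True if deleting vowels shortens the word by at least two characters."""
    lowered = word.lower()
    return len(lowered) - len(lowered.translate(VOWEL_DELETER)) > 1


print([w for w in ["sky", "tree", "cat", "Audio", "42"] if has_two_vowels(w)])
# prints ['tree', 'Audio']
```

Both lengths are taken from the lowercased word so that the comparison is not thrown off by characters whose lowercase form has a different length.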

In your own code, the word is always written because text_file.write('\n' + word) is outside the if block. This is a good lesson in why you should not put multiple statements on one line. Your code is equivalent to:

if twovowels.match(word):
    print(word)
text_file.write('\n' + word)  # <- outside the if
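
The effect is easy to reproduce in isolation. A minimal sketch, with a made-up word list and two lists (matched, written) standing in for the shell output and the text file:

```python
import re

two_vowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)

matched = []  # stands in for what print shows in the shell
written = []  # stands in for what ends up in the text file

for word in ["sky", "tree", "42", "audio"]:
    if two_vowels.match(word): matched.append(word)  # governed by the if
    written.append(word)                             # runs for every word

print(matched)  # ['tree', 'audio']
print(written)  # ['sky', 'tree', '42', 'audio']
```

Only matched is filtered; written receives every word, which is what happens to the text file in the question.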

Here is your code with the write moved inside the if block, the names changed to snake_case, spaces added around the assignments, and with used so that both files are closed for you:

import re
with open("FirstMondayArticle.html") as f, open("twoVoweledWordList.txt", "w") as out:
    two_vowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)
    non_word = re.compile(r"\W+", re.U)
    for line in f:
        for word in non_word.split(line):
            if two_vowels.match(word):
                print(word)
                out.write("{}\n".format(word.rstrip()))  
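
As an aside, the leading and trailing .* are only there because match anchors at the start of the string; search with a shorter pattern should give the same answer for single words. A small sketch comparing the two, assuming the words contain no newlines (two_vowels_search is a made-up name, and the word list is illustrative):

```python
import re

two_vowels_match = re.compile(r".*[aeiou].*[aeiou].*", re.I)
two_vowels_search = re.compile(r"[aeiou].*[aeiou]", re.I)  # hypothetical shorter variant

sample_words = ["sky", "tree", "42", "Audio", "a", ""]
# For each word, record whether the two approaches agree.
agreement = [
    bool(two_vowels_match.match(w)) == bool(two_vowels_search.search(w))
    for w in sample_words
]
print(all(agreement))
```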

Upvotes: 1

twalberg

Reputation: 62439

I would suggest an alternative, simpler method that does not use re:

def twovowels(word):
    count = 0
    for char in word.lower():
        if char in "aeiou":
            count = count + 1
            if count > 1:
                return True
    return False

with open("FirstMondayArticle.html") as file, \
        open("twoVoweledWordList.txt", "w") as text_file:
    for line in file:
        for word in line.split():
            if twovowels(word):
                print word
                text_file.write(word + "\n")

Upvotes: 0

Wiktor Stribiżew

Reputation: 627082

Because the write call is outside the if block. This is what the code should look like:

for line in file:
    for word in nonword.split(line):
        if twovowels.match(word):
            print word
            text_file.write('\n' + word)
text_file.close()

file.close()

Here is a sample program on Tutorialspoint showing the code above is correct.

Upvotes: 0
