Reputation: 9
I've been having trouble with this for loop, which is supposed to go through every word in the book and save the cleaned-up words in a list.
The error I get is: 'int' object is not iterable
def create_word_dict():
    word_list = open("mobydick.txt", "r")
    all_list = word_list.read()
    all_list = all_list.split()
    word_list.close()
    for index in len(all_list):
        all_list[index] = parseString(all_list[index])
    return all_list
# Removes punctuation marks from a string
def parseString(st):
    s = ""
    for ch in st:
        if ch.isalpha() or ch.isspace():
            s += ch
        else:
            s += ""
    return s  # return was outside code block
Upvotes: 0
Views: 2353
Reputation: 56654
You can speed things up a lot by using some of Python's built-in methods:
from string import ascii_lowercase as lower, ascii_uppercase as upper
from string import digits, punctuation

# create a translation table which
# makes the string lowercase and
# replaces all digits and punctuation with spaces
TRANS = str.maketrans(
    lower + upper + digits + punctuation,
    lower + lower + " " * len(digits + punctuation)
)

def get_word_list(filename):
    with open(filename) as inf:
        return inf.read().translate(TRANS).split()

words = get_word_list("mobydick.txt")
For the sake of comparison, on my machine this loads the words from the Gutenberg version of Moby Dick (220231 words) in 0.11 seconds.
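To make the effect of the translation table easier to see, here is a small sketch that applies it to a short string instead of the whole book. The sample sentence and the `sample`, `sample_words` and `contraction` names are just illustrations, not part of the original code. Note that, because punctuation becomes a space, a contraction is split into two words.

```python
from string import ascii_lowercase as lower, ascii_uppercase as upper
from string import digits, punctuation

# Same table as above: lowercase every letter and
# turn every digit and punctuation mark into a space.
TRANS = str.maketrans(
    lower + upper + digits + punctuation,
    lower + lower + " " * len(digits + punctuation),
)

# Hypothetical sample text, used only for illustration.
sample = "Call me Ishmael. Some years ago--never mind how long"
sample_words = sample.translate(TRANS).split()

# The apostrophe is punctuation too, so it becomes a word boundary.
contraction = "Don't".translate(TRANS).split()

print(sample_words)
print(contraction)
```

If you would rather keep contractions together, you would have to leave the apostrophe out of the characters being replaced.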
Upvotes: 0
Reputation: 113988
I think you want:
for index in range(len(all_list)):
    all_list[index] = parseString(all_list[index])
The loop
for i in 5:
is an error in Python, since an int cannot be iterated, whereas
for i in range(5):
is valid, since a range can be iterated.
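A quick way to convince yourself of this is the sketch below. The three-word list and the `message` and `indices` names are made up for the demonstration. It catches the exception so you can look at the message, then runs the same loop over a range.

```python
# Hypothetical stand-in for the list of words from the book.
words = ["Call", "me", "Ishmael."]

# len(words) is just the int 3, so trying to loop over it fails.
try:
    for index in len(words):
        pass
except TypeError as exc:
    message = str(exc)

# range(len(words)) yields 0, 1, 2, which can be looped over.
indices = [index for index in range(len(words))]

print(message)
print(indices)
```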
However, it's probably better to just iterate over the objects directly:
new_list = []
for word in all_list:
    new_list.append(parseString(word))
Or, even better, just use a list comprehension:
new_list = [parseString(word) for word in all_list]
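Putting that together, one possible version of the whole thing could look like the sketch below. It assumes the same behaviour as the parseString in the question (keep letters and whitespace, drop everything else). The name `create_word_list` and the `filename` parameter are my own choices, not from the question.

```python
def parseString(st):
    # Keep letters and whitespace, drop every other character.
    return "".join(ch for ch in st if ch.isalpha() or ch.isspace())


def create_word_list(filename):
    # The with block closes the file even if reading fails.
    with open(filename, "r") as book:
        words = book.read().split()
    # Clean each word directly instead of indexing into the list.
    return [parseString(word) for word in words]
```

You would call it as create_word_list("mobydick.txt"). Keep in mind that a token such as "ago--never" comes out as "agonever", and a token made only of punctuation comes out as an empty string, so you may want to filter those afterwards.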
Upvotes: 2