Reputation: 101
Read all the lines from the file and split the lines into words using the split() method. Further, remove punctuation from the ends of words using the strip("""!"#$%&'()*,-./:;?@[]_""") method call.
I am a beginner in Python and am trying to solve some basic problems. I used the split and strip methods as the problem asks, but I am getting wrong frequencies for some words. Please review my code.
Python Code:
def word_frequencies(filename="alice.txt"):
    with open(filename) as f:
        string=f.read()
    words=string.split()
    l=[]
    for word in words:
        temp=word.strip("""!"#$%&'()*,-./:;?@[]""")
        if temp:
            l.append(temp)
    string2=[]
    for i in l:
        if i not in string2:
            string2.append(i)
    for j in string2:
        print(f"{j}\t{l.count(j)}")
Output:
The 64
Project 83
Gutenberg 27
EBook 3
of 303
. . . and so on.
But the actual output is:
The 64
Project 83
Gutenberg 26
EBook 3
of 303
. . . and so on
Upvotes: 0
Views: 85
Reputation: 126
Use re.findall to split the text into words:
from re import findall
words = findall(r"\b\w+\b", text)
Here, text is the string returned by your f.read() call.
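To see what this pattern actually matches, here is a small sketch on a made-up sample string (the sample text is an assumption, not taken from alice.txt). It also runs the split()/strip() approach from the question, using the strip string from the task statement, so the two tokenizations can be compared side by side.

```python
from re import findall

# Hypothetical sample text, not taken from alice.txt.
text = "Project Gutenberg-tm isn't _Alice_ (really)."

# \b\w+\b matches maximal runs of letters, digits and underscores.
regex_words = findall(r"\b\w+\b", text)

# The approach from the question: split on whitespace, then strip punctuation
# from both ends of each token and drop tokens that end up empty.
strip_words = [w.strip("""!"#$%&'()*,-./:;?@[]_""") for w in text.split()]
strip_words = [w for w in strip_words if w]

print(regex_words)
print(strip_words)
```

Because \w+ stops at hyphens and apostrophes and treats the underscore as a word character, the two lists differ, so counts from this approach will not necessarily match counts produced with split() and strip().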
Then count them:
from collections import Counter
c = Counter(words)
Check word count:
for word in ("The", "Project", "Gutenberg", "EBook"):
    print(word, c[word])
Prints:
The 64
Project 83
Gutenberg 83
EBook 3
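If the exercise requires split() and strip() as stated, Counter can still do the counting. Below is a minimal sketch, assuming the strip string from the task statement, which ends in an underscore, unlike the one in the question's code. The helper name count_words and the sample text are made up for illustration.

```python
from collections import Counter

# Strip string copied from the task statement; note the trailing underscore.
PUNCTUATION = """!"#$%&'()*,-./:;?@[]_"""


def count_words(text):
    """Count words after a whitespace split and a punctuation strip (hypothetical helper)."""
    stripped = (word.strip(PUNCTUATION) for word in text.split())
    # Skip tokens that were nothing but punctuation.
    return Counter(word for word in stripped if word)


# Made-up sample, not taken from alice.txt.
sample = "The Project Gutenberg EBook of _Gutenberg_ -- the end."
counts = count_words(sample)
for word, n in counts.items():
    print(f"{word}\t{n}")
```

Counter tallies everything in a single pass, whereas calling l.count(j) for every distinct word rescans the whole list each time. With the question's strip string (no underscore), a token such as _Gutenberg_ would be counted separately from Gutenberg; whether that accounts for the difference in the question depends on the contents of alice.txt.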
Upvotes: 1