Reputation: 47
I am working on an assignment and I have hit a wall. The assignment requires me to count the frequency of words in a text file. My code counts the words and puts them into a dictionary, but I cannot merge words that differ only in case. For example, I need the output to show {'a':16...} but it outputs {'A':2...'a':14} instead. Here is my code. Any help would be much appreciated.
file=open("phrases.txt","r")
wordcount={}
for word in file.read().split():
    if word not in wordcount:
        wordcount[word]=1
    else:
        wordcount[word]+=1
print(wordcount)
Upvotes: 1
Views: 276
Reputation: 2673
You can use the Counter class from the standard library's collections module for this, as an alternative to looping through the list yourself.
Example:
from collections import Counter
with open("phrases.txt", "r") as file:  # the with block closes the file automatically
    data = file.read().lower().split()  # lower() converts everything to lowercase before splitting
wordcount = dict(Counter(data))
print(wordcount)
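To see the effect without needing a file, here is a small sketch on an in-memory string. The sample text and the variable names `sample` and `counts` are made up for illustration; only `Counter` and its `most_common` method come from the standard library.

```python
from collections import Counter

# Hypothetical sample text standing in for the contents of phrases.txt
sample = "A cat and a dog and A bird"

# Lowercase first so 'A' and 'a' land on the same key, then split on whitespace
counts = Counter(sample.lower().split())

print(dict(counts))           # {'a': 3, 'cat': 1, 'and': 2, 'dog': 1, 'bird': 1}
print(counts.most_common(2))  # [('a', 3), ('and', 2)]
```

A Counter is already a dict subclass, so the dict() conversion is only needed if you want a plain dict for printing.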
Upvotes: 2
Reputation: 1044
You can convert the words to lowercase, and then count them. So, your code changes to something like this.
file=open("phrases.txt","r")
wordcount={}
for word in file.read().split():
    newWord = word.lower()
    if newWord not in wordcount:
        wordcount[newWord]=1
    else:
        wordcount[newWord]+=1
print(wordcount)
Basically, the keys stored in the dict are now the lowercase versions of each word.
Do note that you will lose the original capitalization, which matters if you later do anything case-sensitive with the words.
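If the original spellings do matter, one possible approach is to group by the lowercase key but keep a tally per spelling. This is only a sketch: the function name `count_case_insensitive` and the sample string are hypothetical, and it works on a string rather than reading phrases.txt.

```python
from collections import Counter, defaultdict

def count_case_insensitive(text):
    """Map each lowercase word to a Counter of the spellings actually seen."""
    forms = defaultdict(Counter)
    for word in text.split():
        forms[word.lower()][word] += 1
    return forms

# Hypothetical sample text
forms = count_case_insensitive("A a a The the")

# Case-insensitive totals, like the lowercase dict above
totals = {key: sum(spellings.values()) for key, spellings in forms.items()}
print(totals)            # {'a': 3, 'the': 2}

# The original capitalization is still available per word
print(dict(forms["a"]))  # {'A': 1, 'a': 2}
```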
Upvotes: 0
Reputation: 1830
Lowercase all the words before counting them:
for word in file.read().lower().split():
Upvotes: 0
Reputation: 71580
It seems the issue in the question is uppercase versus lowercase, so why not:
file=open("phrases.txt","r")
wordcount={}
for word in file.read().split():
    if word.lower() not in wordcount:
        wordcount[word.lower()]=1
    else:
        wordcount[word.lower()]+=1
print(wordcount)
Or:
file=open("phrases.txt","r")
words=[i.lower() for i in file.read().split()]  # read once; a second read() would return ''
wordcount={}.fromkeys(words,0)  # start every count at 0
for word in words:
    wordcount[word]+=1
print(wordcount)
Upvotes: 1