gbenyu

Reputation: 47

Word frequency counter in file

I am working on an assignment and I have hit a wall. The assignment requires me to count the frequency of words in a text file. I got my code to count the words and put them into a dictionary, but I cannot merge words that differ only in case. For example, I need the output to show {'a':16...} but it outputs {'A':2...'a':14} instead. Here is my code. Any help would be much appreciated.

file=open("phrases.txt","r")
wordcount={}
for word in file.read().split():
    if word not in wordcount:
        wordcount[word]=1
    else:
        wordcount[word]+=1
print(wordcount) 

Upvotes: 1

Views: 276

Answers (4)

LonelyCpp

Reputation: 2673

You can use the Counter class from the standard library's collections module as an alternative to looping through the list yourself.

Example:

from collections import Counter

with open("phrases.txt", "r") as file:  # the with block closes the file automatically
    data = file.read().lower().split()  # lower() converts everything to lowercase before splitting
wordcount = dict(Counter(data))
print(wordcount)
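
As a rough sketch of the same idea without needing the file, the snippet below runs Counter on an in-memory string. The sample text is made up for illustration and is not the asker's phrases.txt. It also shows most_common, which Counter provides for listing the most frequent words.

```python
from collections import Counter

# Hypothetical sample text standing in for the contents of phrases.txt
text = "A cat and a dog and A bird"

# Lowercase first so that "A" and "a" are counted as the same word
wordcount = Counter(text.lower().split())

print(dict(wordcount))
print(wordcount.most_common(2))  # the two most frequent words with their counts
```

Keeping the result as a Counter rather than converting it to a dict right away leaves helpers such as most_common available.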

Upvotes: 2

MaJoR

Reputation: 1044

You can convert the words to lowercase, and then count them. So, your code changes to something like this.

file=open("phrases.txt","r")
wordcount={}
for word in file.read().split():
    newWord = word.lower()
    if newWord not in wordcount:
        wordcount[newWord]=1
    else:
        wordcount[newWord]+=1
print(wordcount) 

Basically, the dict keys are now the lowercase versions of each word.

Do note that you lose the original capitalization this way, which matters if you later need case-sensitive operations.
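
If the original spellings matter, one possible workaround (a sketch, not part of the answer above) is to keep a second mapping next to the counts. The sample text and the names counts and spellings are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample text standing in for the file contents
text = "Apple apple APPLE pie Pie"

counts = defaultdict(int)     # lowercase word -> total count
spellings = defaultdict(set)  # lowercase word -> original spellings seen

for word in text.split():
    key = word.lower()
    counts[key] += 1          # count case-insensitively
    spellings[key].add(word)  # remember how the word was actually written

print(dict(counts))
print({key: sorted(forms) for key, forms in spellings.items()})
```

The counts stay case-insensitive, while spellings still records every capitalization that appeared in the text.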

Upvotes: 0

AmilaMGunawardana

Reputation: 1830

Lowercase all the words before comparing them, for example: for word in file.read().lower().split():

Upvotes: 0

U13-Forward

Reputation: 71580

It seems the question is about an uppercase versus lowercase issue, so why not:

file=open("phrases.txt","r")
wordcount={}
for word in file.read().split():
    if word.lower() not in wordcount:
        wordcount[word.lower()]=1
    else:
        wordcount[word.lower()]+=1
print(wordcount) 

Or:

file=open("phrases.txt","r")
words=[i.lower() for i in file.read().split()]  # read once; a second read() would return an empty string
wordcount={}.fromkeys(words,0)  # start every count at 0
for word in words:
    wordcount[word]+=1
print(wordcount)

Upvotes: 1
