user3016928

Reputation: 11

trying to create a dictionary from a text file but

So, I have a text file (a paragraph), and I need to read it and create a dictionary in which each distinct word from the file is a key and the corresponding value is an integer giving the frequency of that word in the file. An example of what the dictionary should look like:

{'and':2, 'all':1, 'be':1, 'is':3} etc.

So far I have this:

def create_word_frequency_dictionary():
    filename = 'dictionary.txt'
    infile = open(filename, 'r')
    line = infile.readline()

    my_dictionary = {}
    frequency = 0

    while line != '':
        row = line.lower()
        word_list = row.split()
        print(word_list)
        print(word_list[0])
        words = word_list[0]
        my_dictionary[words] = frequency+1
        line = infile.readline()

    infile.close()

    print(my_dictionary)

create_word_frequency_dictionary()

Any help would be appreciated, thanks.

Upvotes: 0

Views: 118

Answers (3)

vaultah

Reputation: 46513

The documentation describes the collections module as "High-performance container datatypes". Consider using collections.Counter instead of reinventing the wheel.

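from collections import Counter

filename = 'dictionary.txt'
# The with-statement closes the file even if reading fails
with open(filename, 'r') as infile:
    text = infile.read()
# Lowercase first so 'The' and 'the' are counted together, as in the question
print(Counter(text.lower().split()))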
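A minimal sketch of how the resulting Counter behaves, run on an in-memory string so it does not depend on dictionary.txt. The count_words helper and the sample sentence are assumptions made for illustration, not part of the original code; pass it the result of infile.read() to count a real file.

```python
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its frequency."""
    # split() with no arguments splits on any run of whitespace
    return Counter(text.lower().split())


# Hypothetical sample text standing in for the file contents
sample = "It is what it is and all is well"
counts = count_words(sample)

print(counts["is"])           # frequency of a single word
print(counts.most_common(1))  # the most frequent word with its count
print(counts["missing"])      # a Counter reports 0 for words it never saw
```

A Counter is a dict subclass, so it can be used anywhere the question expects a plain dictionary, and dict(counts) converts it if needed.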

Update: Okay, I fixed your code and now it works, but Counter is still a better option:

def create_word_frequency_dictionary():
    filename = 'dictionary.txt'
    my_dictionary = {}

    # Read the file line by line; the with-statement closes it afterwards
    with open(filename, 'r') as infile:
        for line in infile:
            # Lowercase so that 'The' and 'the' count as the same word
            for word in line.lower().split():
                if word in my_dictionary:
                    my_dictionary[word] = my_dictionary[word] + 1
                else:
                    my_dictionary[word] = 1

    print(my_dictionary)

create_word_frequency_dictionary()

Upvotes: 3

Ashwinee K Jha

Reputation: 9307

If you are not using a version of Python that has Counter (it was added in Python 2.7), collections.defaultdict does the job:

>>> import collections
>>> words = ["a", "b", "a", "c"]
>>> word_frequency = collections.defaultdict(int)
>>> for w in words:
...   word_frequency[w] += 1
... 
>>> print word_frequency
defaultdict(<type 'int'>, {'a': 2, 'c': 1, 'b': 1})

Upvotes: 1

Mehraban

Reputation: 3324

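Replace my_dictionary[words] = frequency+1 with my_dictionary[words] = my_dictionary.get(words, 0) + 1. A plain my_dictionary[words] + 1 raises a KeyError the first time a word is seen, because the key does not exist yet. Note that the loop still looks only at word_list[0], so only the first word of each line is counted.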
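A minimal sketch of the counting loop written with dict.get, which supplies 0 for a word that is not in the dictionary yet, and with an inner loop that visits every word on a line rather than only the first. The word_frequencies name and the sample lines are assumptions for illustration; an open file object can be passed in place of the list.

```python
def word_frequencies(lines):
    """Count every word in an iterable of lines, case-insensitively."""
    frequencies = {}
    for line in lines:
        for word in line.lower().split():
            # get() returns 0 for a word not seen yet, so no KeyError is raised
            frequencies[word] = frequencies.get(word, 0) + 1
    return frequencies


# Hypothetical lines standing in for the contents of the text file
sample_lines = ["Be all and end all\n", "and be done\n"]
result = word_frequencies(sample_lines)
print(result)
```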

Upvotes: 0
