Joe Dingle

Reputation: 123

Python function to take text file and create dictionary with keys as words and values as frequencies

First off, I apologize in advance if anything in my question is hard to understand; I am a beginner in Python and it is late.

I am trying to figure out why I keep receiving errors in this function. It should take a text file, build a dictionary mapping each word to its frequency, and print the word with the highest frequency in the file.

Here is my code:

def poet(filename):
    word_frequency = {}
    with open(filename,'r') as f:
        for line in f:
            for word in line.split():
                word = word.replace('.',"")
                word = word.replace(',',"")
                word = word.replace(';',"")
                if word in word_frequency:
                    word_frequency[word] += 1;
                else:
                    word_frequency[word] = 1;
most_freq_word = max(word_frequency, key=word_frequency)
print("The word " + most_freq_word + " is in text ")
str(word_frequency[most_freq_word]) + " times"
print(word_frequency)


poet('Poem.txt')

And here is the error I'm receiving:

Traceback (most recent call last):
  File "C:/Users/Noah/Desktop/Python/3.py", line 20, in <module>
    str(word_frequency[most_freq_word]) + " times"
NameError: name 'word_frequency' is not defined

Also, if anything is unclear, please comment and I will respond right away. Thank you in advance.

Edit:

Thank you for the responses. I have implemented this in my code, but I am now receiving this error:

Traceback (most recent call last):
  File "C:/Users/Noah/Desktop/Python/3.py", line 20, in <module>
    poet('FrostPoem.txt')
  File "C:/Users/Noah/Desktop/Python/3.py", line 14, in poet
    most_freq_word = max(word_frequency, key=word_frequency)
TypeError: 'dict' object is not callable

The new code is:

def poet(filename):
    word_frequency = {}
    with open(filename,'r') as f:
        for line in f:
            for word in line.split():
                word = word.replace('.',"")
                word = word.replace(',',"")
                word = word.replace(';',"")
                if word in word_frequency:
                    word_frequency[word] += 1;
                else:
                    word_frequency[word] = 1;

    most_freq_word = max(word_frequency, key=word_frequency)
    print("The word " + most_freq_word + " is in text " + \
    str(word_frequency[most_freq_word]) + " times")
    print(word_frequency)


poet('Poem.txt')

Upvotes: 2

Views: 2449

Answers (5)

Kenly

Reputation: 26768

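When Python evaluates str(word_frequency[most_freq_word]) + " times", it expects word_frequency to already be defined in the current scope. In your case word_frequency is defined only inside the poet function.

Check whether there is an indentation problem: the lines that use word_frequency must be indented so that they sit inside the function.

As for the TypeError, max needs a function for its key argument, not the dictionary itself.
To solve it, use key=word_frequency.get, which looks up the count for each word.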
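
As a minimal sketch of key=word_frequency.get, assuming a small hand-built dictionary (the counts below are made up for illustration and are not taken from the poem):

```python
# Hypothetical counts, used only to illustrate the call.
word_frequency = {"the": 3, "road": 2, "wood": 1}

# max() iterates over the dictionary's keys; key=word_frequency.get makes it
# compare those keys by their counts rather than alphabetically.
most_freq_word = max(word_frequency, key=word_frequency.get)
print("The word " + most_freq_word + " is in text " +
      str(word_frequency[most_freq_word]) + " times")
```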

Upvotes: 0

El'endia Starman

Reputation: 2244

Ah-ha, there's your problem: several of your lines should be inside the function, like so:

def poet(filename):
    word_frequency = {}
    with open(filename,'r') as f:
        for line in f:
            for word in line.split():
                word = word.replace('.',"")
                word = word.replace(',',"")
                word = word.replace(';',"")
                if word in word_frequency:
                    word_frequency[word] += 1;
                else:
                    word_frequency[word] = 1;

    # key must be callable: .get returns the count for each word
    most_freq_word = max(word_frequency, key=word_frequency.get)
    print("The word " + most_freq_word + " is in text " + \
    str(word_frequency[most_freq_word]) + " times")
    print(word_frequency)


poet('Poem.txt')

Now, you might want this function to be more reusable, like if you didn't want to print immediately but wanted to do something further with word_frequency. In that case, you would need a return statement and your code might look like this:

def poet(filename):
    word_frequency = {}
    with open(filename,'r') as f:
        for line in f:
            for word in line.split():
                word = word.replace('.',"")
                word = word.replace(',',"")
                word = word.replace(';',"")
                if word in word_frequency:
                    word_frequency[word] += 1;
                else:
                    word_frequency[word] = 1;

    return word_frequency

word_freq = poet('Poem.txt')
# key must be callable: .get returns the count for each word
most_freq_word = max(word_freq, key=word_freq.get)
print("The word " + most_freq_word + " is in text " + \
str(word_freq[most_freq_word]) + " times")
print(word_freq)

In response to your edit, replace this line

    most_freq_word = max(word_frequency, key=word_frequency)

with this line

    most_freq_word = max(word_frequency, key=lambda x:word_frequency[x])

This picks the key whose value (its count) is largest. The key argument must be callable, and a dictionary is not, which is what caused the TypeError.
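
To see why the original line fails, here is a small sketch with made-up counts. max() calls whatever is passed as key on each item, so passing the dictionary itself raises the error, while a lambda works:

```python
# Illustrative data only, not the real word counts from the file.
word_frequency = {"two": 2, "roads": 1, "diverged": 1}

try:
    # The dict is passed as key, so max() tries to call it like a function.
    max(word_frequency, key=word_frequency)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# A lambda is callable, so max() can use it to look up each word's count.
most_freq_word = max(word_frequency, key=lambda x: word_frequency[x])
print(most_freq_word)
```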

Upvotes: 1

rmarques

Reputation: 91

word_frequency is only defined within the scope of the poet function. To access it outside the function, return it from poet and assign the result:

# assumes poet() ends with: return word_frequency
word_frequency = poet('Poem.txt')
# key must be callable: .get returns the count for each word
most_freq_word = max(word_frequency, key=word_frequency.get)
print("The word " + most_freq_word + " is in text " +
      str(word_frequency[most_freq_word]) + " times")
print(word_frequency)

There are also better solutions for your problem. Take a look at collections.Counter; the example in its documentation counts the most common words in a text file, which is almost exactly what you want.
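
A minimal sketch of the Counter approach, assuming the words are already in a string (the sample text is made up and stands in for the contents of the file):

```python
from collections import Counter

# Illustrative text standing in for the contents of Poem.txt.
text = "the woods are lovely the woods are deep the end"

# Counter builds the word -> frequency mapping in one step.
word_frequency = Counter(text.split())

# most_common(1) returns a list holding one (word, count) pair.
most_freq_word, count = word_frequency.most_common(1)[0]
print("The word " + most_freq_word + " is in text " + str(count) + " times")
```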

Upvotes: 0

Netwave

Reputation: 42796

You can use Counter as follows:

from collections import Counter

def poet(filename):
    with open(filename, "r") as f:
        counter = Counter(f.read().split())
    return counter

If you want to strip ',' or ';', for example, strip them from the text before splitting, or map over the list of words to remove them.
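
One way the stripping might look, as a sketch. count_words is a hypothetical helper that takes a string rather than a filename, and str.strip only removes punctuation at the ends of each word, unlike the replace calls in the question, which remove it everywhere:

```python
from collections import Counter

def count_words(text):
    # Hypothetical helper: strip '.', ',' and ';' from both ends of each word.
    stripped = (word.strip(".,;") for word in text.split())
    # Skip tokens that were nothing but punctuation.
    return Counter(word for word in stripped if word)

counter = count_words("Whose woods these are, I think; I know.")
print(counter)
```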

Upvotes: 0

k4ppa

Reputation: 4667

You define word_frequency inside the function poet(), so its scope is local, but you use the dictionary outside the function, and this causes the error.

def poet(filename):
    word_frequency = {}
    with open(filename,'r') as f:
        for line in f:
            for word in line.split():
                word = word.replace('.',"")
                word = word.replace(',',"")
                word = word.replace(';',"")
                if word in word_frequency:
                    word_frequency[word] += 1
                else:
                    word_frequency[word] = 1
    # key must be callable: .get returns the count for each word
    most_freq_word = max(word_frequency, key=word_frequency.get)
    print("The word " + most_freq_word + " is in text " +
          str(word_frequency[most_freq_word]) + " times")
    print(word_frequency)

poet('Poem.txt')

Put all the instructions inside the function and it should work.
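
The scope rule can be seen in a small sketch. poet_counts is a hypothetical stand-in that takes a list of words instead of a filename, so it runs without Poem.txt:

```python
def poet_counts(words):
    # word_frequency is local: it exists only while the function runs.
    word_frequency = {}
    for word in words:
        word_frequency[word] = word_frequency.get(word, 0) + 1
    return word_frequency

try:
    word_frequency  # never defined at module level
except NameError as exc:
    error_message = str(exc)
    print("NameError:", error_message)

# The returned dictionary is available under whatever name it is assigned to.
counts = poet_counts(["a", "b", "a"])
print(counts)
```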

Upvotes: 0
