Eddie14

Reputation: 19

Frequency Distribution of Bigrams

I have done the following:

import nltk


words = nltk.corpus.brown.words()
freq = nltk.FreqDist(words)

This lets me look up the frequency of individual words in the Brown corpus, for example:

freq["the"]
62713

Now I want to find the frequency distribution of specific bigrams, so I tried:

bigrams = nltk.bigrams(words)
freqbig = nltk.FreqDist(bigrams)

But whichever bigram I look up, I always get 0. For example:

freqbig["the man"]
0

What am I doing wrong?

Upvotes: 1

Views: 720

Answers (1)

nikeros

Reputation: 3379

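nltk.bigrams yields each bigram as a tuple of two tokens, so the keys of freqbig are tuples, not strings. The string "the man" never appears as a key, which is why the lookup returns 0. Pass a tuple instead: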

freqbig[("the", "man")]

OUTPUT

128
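To see why the string lookup returns 0, here is a minimal sketch that runs without downloading the corpus. It is not the NLTK code itself: collections.Counter stands in for nltk.FreqDist (which subclasses Counter), zip(words, words[1:]) stands in for nltk.bigrams, and the token list is a made-up toy example.

```python
from collections import Counter

# Toy token list standing in for nltk.corpus.brown.words() (assumption for illustration).
words = ["the", "man", "saw", "the", "man", "and", "the", "dog"]

# Pair each token with its successor; this produces 2-tuples, as nltk.bigrams does.
bigrams = list(zip(words, words[1:]))
freqbig = Counter(bigrams)

print(freqbig[("the", "man")])  # tuple key matches the stored bigrams
print(freqbig["the man"])       # string key was never stored, so the count is 0
```

A missing key does not raise an error here; the counter simply reports 0, which is why the mistake is easy to overlook.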

If you want to pass strings, you could create an auxiliary function which takes care of it:

def get_frequency(my_string):
    return freqbig[tuple(my_string.split(" "))]
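A slightly more defensive variant of that helper might look like the sketch below. The name bigram_frequency and its parameters are hypothetical, and the toy Counter only stands in for the real freqbig. It takes the frequency table as an argument instead of relying on a global, uses split() with no argument so repeated spaces are tolerated, and rejects input that is not exactly two words.

```python
from collections import Counter


def bigram_frequency(freqdist, phrase):
    """Return the count of a space-separated two-word phrase in a bigram table."""
    # split() with no argument splits on any run of whitespace.
    tokens = tuple(phrase.split())
    if len(tokens) != 2:
        raise ValueError(f"expected exactly two words, got {len(tokens)}")
    return freqdist[tokens]


# Toy counts standing in for the Brown corpus bigram table (made-up numbers).
toy_freqbig = Counter({("the", "man"): 3, ("the", "dog"): 1})

print(bigram_frequency(toy_freqbig, "the  man"))  # extra space is tolerated
print(bigram_frequency(toy_freqbig, "a cat"))     # unseen bigram gives 0
```

Note that the lookup is case-sensitive, so "The man" and "the man" are counted as different bigrams unless the tokens are lowercased before counting.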

Upvotes: 1
