Reputation: 131
I am having problems interpreting the results of the mi.plugin() (or mi.empirical()) function from the entropy package. As far as I understand, MI = 0 tells you that the two variables you are comparing are completely independent, and as MI increases, the association between the two variables is increasingly non-random.
Why, then, do I get a value of 0 when running the following in R (using the {entropy} package):
mi.plugin( rbind( c(1, 2, 3), c(1, 2, 3) ) )
when I'm comparing two vectors that are exactly the same?
I assume my confusion is based on a theoretical misunderstanding on my part; can someone tell me where I've gone wrong?
Thanks in advance.
Upvotes: 13
Views: 18980
Reputation: 3438
The mi.plugin function works on the joint frequency matrix of the two random variables. The joint frequency matrix gives the number of times each pair of outcomes X = x and Y = y was observed together. This is the source of your 0: mi.plugin does not see rbind(c(1, 2, 3), c(1, 2, 3)) as two samples, but as a 2x3 joint count table, and since its two rows are proportional, the implied joint distribution factorizes into the product of its marginals, which is precisely independence. In your example, X should have 3 possible outcomes (x = 1, x = 2, x = 3) and Y should also have 3 possible outcomes (y = 1, y = 2, y = 3). Let's go through your example and calculate the joint frequency matrix:
> X=c(1, 2, 3)
> Y=c(1, 2, 3)
> # count how many observations fall in each cell (row = Y value, column = X value)
> freqs=matrix(sapply(seq(max(X)*max(Y)), function(x) length(which(((X-1)*max(Y)+Y)==x))),ncol=max(X))
> freqs
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1
This matrix shows the number of occurrences of X = x and Y = y. For example, there was one observation for which X = 1 and Y = 1, and no observations for which X = 2 and Y = 1. You can now pass this matrix to the mi.plugin function:
> mi.plugin(freqs)
[1] 1.098612
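As a side note, base R's table() builds the same joint count matrix with much less work; the following is just a sketch of that shortcut (assuming the entropy package is loaded), which also checks the result against log(3), the value you'd expect here since Y is fully determined by X and X is uniform over 3 outcomes:
> library(entropy)
> X = c(1, 2, 3)
> Y = c(1, 2, 3)
> freqs = table(X, Y)  # same 3x3 joint count matrix as above
> mi.plugin(freqs)
[1] 1.098612
> log(3)  # MI = H(X) = log(3) nats when Y = X and X is uniform
[1] 1.098612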
Upvotes: 4
Reputation: 171
Use mutinformation(x, y) from the infotheo package, which takes the two vectors directly rather than a joint frequency matrix:
> library(infotheo)
> mutinformation(c(1, 2, 3), c(1, 2, 3))
[1] 1.098612
> mutinformation(1:5, 1:5)
[1] 1.609438
In both cases the vectors are identical, so the mutual information equals the entropy of either vector (log(3) and log(5) nats, respectively), and the normalized mutual information will be 1.
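As far as I know infotheo does not ship a dedicated normalized-MI function, but you can compute it yourself; a minimal sketch, assuming the common symmetric normalization by the geometric mean of the two entropies:
> # NMI = I(X;Y) / sqrt(H(X) * H(Y)); identical vectors give exactly 1
> x = c(1, 2, 3)
> y = c(1, 2, 3)
> mutinformation(x, y) / sqrt(entropy(x) * entropy(y))
[1] 1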
Upvotes: 11