DA_PA

Reputation: 231

Entropy and Mutual Information in R

I want to calculate conditional mutual information in R, and I used the infotheo package.

I used two ways to calculate I(X;Y1,Y2|Z). The first is to call condinformation() directly:

condinformation(X$industry, cbind(X$ethnicity, X$education), S = X$gender, method = "emp")
[1] -1.523344

Since conditional mutual information can be decomposed into two conditional entropies, I(X;Y1,Y2|Z) = H(X|Z) - H(X|Z,Y1,Y2), the second way is to compute that difference:

hhh <- condentropy(X$industry, Y = X$gender, method = "emp")
hhh1 <- condentropy(X$industry, Y = cbind(X$gender, X$ethnicity, X$education), method = "emp")
hhh - hhh1
[1] 0.1483363

I am wondering why these two approaches give different results.
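
In case it helps, here is a self-contained sketch to check whether the same discrepancy shows up on simulated data (the variables below are made-up stand-ins for my real columns):

library(infotheo)

set.seed(1)
n <- 500
# Made-up discrete stand-ins for the real columns
industry <- sample(1:5, n, replace = TRUE)
ethnicity <- sample(1:4, n, replace = TRUE)
education <- sample(1:3, n, replace = TRUE)
gender <- sample(1:2, n, replace = TRUE)

# Way 1: condinformation() directly
condinformation(industry, cbind(ethnicity, education), S = gender, method = "emp")

# Way 2: difference of conditional entropies
condentropy(industry, Y = gender, method = "emp") -
  condentropy(industry, Y = cbind(gender, ethnicity, education), method = "emp")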

Upvotes: 0

Views: 1545

Answers (1)

jan-glx

Reputation: 9506

The two approaches are different estimators and thus give different results, just as the following two estimators of the variance of the sum of two independent random variables a and b give different results:

> a <- rnorm(100)
> b <- rnorm(100)
> var(a+b)-(var(a)+var(b))
[1] 0.5219229
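
The gap is exactly twice the sample covariance: because var() and cov() both use the n - 1 denominator, var(a+b) = var(a) + var(b) + 2*cov(a, b) holds as an exact identity, so the two estimators agree only when the sample covariance of a and b happens to be zero:

> all.equal(var(a+b), var(a) + var(b) + 2*cov(a, b))
[1] TRUE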

I am not sure which estimator is better in your case, but I would guess the first one. You could run some simulations from your model to get a better idea.
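
For example, here is a minimal simulation sketch (since I don't know your model, it uses a toy model in which all variables are independent, so the true I(X;Y1,Y2|Z) is exactly 0 and whatever the estimators return is pure bias and noise):

library(infotheo)

set.seed(42)
one_rep <- function(n = 200) {
  # Independent discrete variables, so the true conditional MI is 0
  X  <- sample(1:3, n, replace = TRUE)
  Y1 <- sample(1:3, n, replace = TRUE)
  Y2 <- sample(1:3, n, replace = TRUE)
  Z  <- sample(1:2, n, replace = TRUE)
  c(direct = condinformation(X, cbind(Y1, Y2), S = Z, method = "emp"),
    decomposed = condentropy(X, Y = Z, method = "emp") -
      condentropy(X, Y = cbind(Z, Y1, Y2), method = "emp"))
}

# Mean estimate over 200 replications; the one closer to 0 is less biased here
rowMeans(replicate(200, one_rep()))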

Upvotes: 2
