user10626935

Finding conditional mutual information from 3 discrete variables

I am trying to find the conditional mutual information between three discrete random variables using the pyitlib package for Python, with the help of the formula:

I(X;Y|Z)=H(X|Z)+H(Y|Z)-H(X,Y|Z)

The expected conditional mutual information value is 0.011.

My 1st code:

import numpy as np
from pyitlib import discrete_random_variable as drv

X=[0,1,1,0,1,0,1,0,0,1,0,0]
Y=[0,1,1,0,0,0,1,0,0,1,1,0]
Z=[1,0,0,1,1,0,0,1,1,0,0,1]

a=drv.entropy_conditional(X,Z)
##print(a)
b=drv.entropy_conditional(Y,Z)
##print(b)
c=drv.entropy_conditional(X,Y,Z)
##print(c)

p=a+b-c
print(p)

The answer I am getting here is 0.4632245116328402.

My 2nd code:

import numpy as np
from pyitlib import discrete_random_variable as drv

X=[0,1,1,0,1,0,1,0,0,1,0,0]
Y=[0,1,1,0,0,0,1,0,0,1,1,0]
Z=[1,0,0,1,1,0,0,1,1,0,0,1]

a=drv.information_mutual_conditional(X,Y,Z)
print(a)

The answer I am getting here is 0.1583445441575102.

While the expected result is 0.011.

Can anybody help? I am in big trouble right now. Any kind of help will be appreciated. Thanks in advance.

Upvotes: 1

Views: 3989

Answers (3)

Robster

Reputation: 1

It's already been a long time, but... I also think that the result 0.158344 is correct.

The definition of conditional mutual information is spelled out below (the 10-reputation limit keeps me from posting the link 😂).
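In discrete form, and with base-2 logarithms to match pyitlib's defaults, that definition reads

I(X;Y|Z) = sum_{x,y,z} p(x,y,z) * log2[ p(z)*p(x,y,z) / ( p(x,z)*p(y,z) ) ]

which is exactly the sum that the function below accumulates from empirical frequencies.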

Since the probability distribution p is unknown, we can use empirical frequencies instead and implement conditional mutual information as follows:

import numpy as np
import pandas as pd

def conditional_mutual_information(data, X:set, Y:set, Z:set, delta = 1):
        X = list(X); Y = list(Y); Z = list(Z)
        cmi = 0

        P_Z = data.groupby(Z).size()
        P_Z = P_Z/P_Z.sum()

        P_XZ = data.groupby(X + Z).size()
        P_XZ = P_XZ/P_XZ.sum()

        P_YZ = data.groupby(Y + Z).size()
        P_YZ = P_YZ/P_YZ.sum()

        P_XYZ = data.groupby(X + Y + Z).size()
        P_XYZ = P_XYZ/P_XYZ.sum()

        for ind in P_XYZ.index:
            x_ind = ind[:len(X)]
            y_ind = ind[len(X):len(X + Y)]
            z_ind = ind[len(X + Y):]

            xz_ind = x_ind + z_ind
            yz_ind = y_ind + z_ind
            xyz_ind = ind

            z_ind =  pd.MultiIndex.from_tuples([z_ind], names = Z) if len(Z) != 1 else pd.Index(z_ind, name = Z[0])
            xz_ind = pd.MultiIndex.from_tuples([xz_ind], names = X + Z)
            yz_ind = pd.MultiIndex.from_tuples([yz_ind], names = Y + Z)
            xyz_ind = pd.MultiIndex.from_tuples([xyz_ind], names = X + Y + Z)

            cmi += delta * P_XYZ[xyz_ind].item() * np.log2(P_Z[z_ind].item() * P_XYZ[xyz_ind].item() / (P_XZ[xz_ind].item() * P_YZ[yz_ind].item()))

        return cmi

And applying it to your example...

data = pd.DataFrame()
data['X'] = [0,1,1,0,1,0,1,0,0,1,0,0]
data['Y'] = [0,1,1,0,0,0,1,0,0,1,1,0]
data['Z'] = [1,0,0,1,1,0,0,1,1,0,0,1]
conditional_mutual_information(data, {'X'}, {'Y'}, {'Z'})
[out] >> 0.1583445441575104

The result is the same as that of the second code.

If you want to see how it works, see the definition of conditional mutual information given above.

Upvotes: 0

gaoshuai wang

Reputation: 16

I think the library function entropy_conditional(x,y,z) has some errors. I also tested it on my own samples and the same problem happened; however, entropy_conditional with two variables is fine. So I coded my own entropy_conditional(x,y,z) as entropy(x,y,z), and the result is correct. The code may not be beautiful.

import math

import numpy as np
from pyitlib import discrete_random_variable as drv

def gen_dict(x):
    dict_z = {}
    for key in x:
        dict_z[key] = dict_z.get(key, 0) + 1
    return dict_z

def entropy(x,y,z):   
    x = np.array([x,y,z]).T
    x = x[x[:,-1].argsort()] # sorted by the last column    
    w = x[:,-3]
    y = x[:,-2]
    z = x[:,-1]
    
    # dict_w = gen_dict(w)
    # dict_y = gen_dict(y)
    dict_z = gen_dict(z)
    # iterate the z values in sorted order so the group counts line up with the rows sorted by z
    list_z = [dict_z[i] for i in sorted(set(z))]
    p_z = np.array(list_z)/sum(list_z)
    pos = 0
    ent = 0
    for i in range(len(list_z)):   
        w = x[pos:pos+list_z[i],-3]
        y = x[pos:pos+list_z[i],-2]
        z = x[pos:pos+list_z[i],-1]
        pos += list_z[i]
        list_wy = np.zeros((len(set(w)),len(set(y))), dtype = float , order ="C")
        list_w = list(set(w))
        list_y = list(set(y))
        
        for j in range(len(w)):
            pos_w = list_w.index(w[j])
            pos_y = list_y.index(y[j])
            list_wy[pos_w,pos_y] += 1
            #print(pos_w)
            #print(pos_y)
        list_p = list_wy.flatten()
        list_p = np.array([k for k in list_p if k>0]/sum(list_p))
        ent_t = 0
        for j in list_p:
            ent_t += -j * math.log2(j)
        #print(ent_t)
        ent += p_z[i]* ent_t
    return ent
    

X=[0,1,1,0,1,0,1,0,0,1,0,0]
Y=[0,1,1,0,0,0,1,0,0,1,1,0]
Z=[1,0,0,1,1,0,0,1,1,0,0,1]  

a=drv.entropy_conditional(X,Z)
##print(a)
b=drv.entropy_conditional(Y,Z)         
c = entropy(X, Y, Z)
p=a+b-c
print(p)
0.15834454415751043

Upvotes: 0

user92403

Reputation: 19

Based on the definitions of conditional entropy, calculating in bits (i.e. base 2), I obtain H(X|Z) = 0.784159, H(Y|Z) = 0.325011, and H(X,Y|Z) = 0.950826. Based on the definition of conditional mutual information you provide above, I obtain I(X;Y|Z) = H(X|Z) + H(Y|Z) - H(X,Y|Z) = 0.158344. Noting that pyitlib uses base 2 by default, drv.information_mutual_conditional(X,Y,Z) appears to be computing the correct result.

Note that your use of drv.entropy_conditional(X,Y,Z) in your first example to compute conditional entropy is incorrect; you can, however, use drv.entropy_conditional(XY,Z), where XY is a 1D array representing the joint observations of X and Y, for example XY = [2*xy[0] + xy[1] for xy in zip(X,Y)].
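
Putting that together, a minimal sketch of your corrected first snippet (assuming the joint symbols are encoded as 2*x + y, as suggested above) would be:

from pyitlib import discrete_random_variable as drv

X = [0,1,1,0,1,0,1,0,0,1,0,0]
Y = [0,1,1,0,0,0,1,0,0,1,1,0]
Z = [1,0,0,1,1,0,0,1,1,0,0,1]

# Encode each (x, y) pair as one symbol so H(X,Y|Z) can be computed
# with the two-argument form of entropy_conditional.
XY = [2*xy[0] + xy[1] for xy in zip(X, Y)]

a = drv.entropy_conditional(X, Z)    # H(X|Z)
b = drv.entropy_conditional(Y, Z)    # H(Y|Z)
c = drv.entropy_conditional(XY, Z)   # H(X,Y|Z)

print(a + b - c)                                     # ~0.158344
print(drv.information_mutual_conditional(X, Y, Z))   # should agree

Both printed values should match the 0.158344 figure above.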

Upvotes: 0
