Reputation: 39457
I have these two arrays/matrices which represent the joint distribution of two discrete random variables X and Y. I represented them in this format because I wanted to use the numpy.cov function, and that seems to be the format cov requires.
https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.cov.html
joint_distibution_X_Y = [
[0.01, 0.02, 0.03, 0.04,
0.01, 0.02, 0.03, 0.04,
0.01, 0.02, 0.03, 0.04,
0.01, 0.02, 0.03, 0.04],
[0.002, 0.002, 0.002, 0.002,
0.004, 0.004, 0.004, 0.004,
0.006, 0.006, 0.006, 0.006,
0.008, 0.008, 0.008, 0.008],
]
join_probability_X_Y = [
0.01, 0.02, 0.04, 0.04,
0.03, 0.24, 0.15, 0.06,
0.04, 0.10, 0.08, 0.08,
0.02, 0.04, 0.03, 0.02
]
How do I calculate the marginal distribution of X (and also of Y) from the joint distribution of X and Y given above? Is there a library method I can call?
As a result, I would like to get something like:
X_values = [0.002, 0.004, 0.006, 0.008]
X_weights = [0.110, 0.480, 0.300, 0.110]
I want to avoid coding the calculation of the marginal distribution myself.
I assume there's already some Python library method for that.
What is it and how can I call it given the data I have?
Upvotes: 1
Views: 11346
Reputation: 61910
You could use margins from scipy.stats.contingency, after arranging the joint probabilities as a 2-D array:
import numpy as np
from scipy.stats.contingency import margins
join_probability_X_Y = np.array([
[0.01, 0.02, 0.04, 0.04],
[0.03, 0.24, 0.15, 0.06],
[0.04, 0.10, 0.08, 0.08],
[0.02, 0.04, 0.03, 0.02]
])
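# margins returns one array per axis: x has shape (4, 1) (row sums), y has shape (1, 4) (column sums)
x, y = margins(join_probability_X_Y)
# transpose x so the marginal of X prints as a single row
print(x.T)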
Output
[[0.11 0.48 0.3 0.11]]
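If you start from the flat 16-element list in the question, and you also want the marginal of Y, something along these lines should work. This is only a sketch: it assumes the flat list is in row-major order with X indexing the rows (which matches the X_weights you expect), and the names flat, joint, x_weights and y_weights are just illustrative. Plain numpy sums along each axis should give the same numbers as margins.

```python
import numpy as np
from scipy.stats.contingency import margins

# Flat joint probabilities from the question.
# Assumption: row-major order, X indexes rows and Y indexes columns.
flat = [
    0.01, 0.02, 0.04, 0.04,
    0.03, 0.24, 0.15, 0.06,
    0.04, 0.10, 0.08, 0.08,
    0.02, 0.04, 0.03, 0.02,
]
joint = np.array(flat).reshape(4, 4)

# margins keeps the input's number of dimensions, so flatten each result to 1-D.
x_margin, y_margin = margins(joint)
x_weights = x_margin.ravel()  # marginal of X: sum over Y (along each row)
y_weights = y_margin.ravel()  # marginal of Y: sum over X (along each column)

# The same marginals with plain numpy, without scipy.
assert np.allclose(x_weights, joint.sum(axis=1))
assert np.allclose(y_weights, joint.sum(axis=0))

print(x_weights)  # approximately 0.11, 0.48, 0.30, 0.11
print(y_weights)  # approximately 0.10, 0.40, 0.30, 0.20
```

If your layout is the other way round (Y along the rows), just swap the two results.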
Upvotes: 6