Reputation: 9
Create a matrix like a transition matrix
How can I create a random matrix in Python whose values sum to 1 in each column?
Upvotes: 0
Views: 1065
Reputation: 20080
You could use a KNOWN distribution whose samples sum to one by construction, e.g. the Dirichlet distribution.
After that, the code is basically a one-liner (Python 3.8, Windows 10 x64):
import numpy as np
N = 3
# set alphas array (concentration parameters), 1s by default
a = np.ones(N)
# draw N samples (rows), each summing to 1, then transpose so the columns sum to 1
mtx = np.random.dirichlet(a, N).transpose()
print(mtx)
and it will print something like:
[[0.56634637 0.04568052 0.79105779]
[0.42542107 0.81892862 0.02465906]
[0.00823256 0.13539087 0.18428315]]
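A quick sanity check, as a sketch: the same construction with a seeded generator, followed by a check that every column sums to 1. The use of `np.random.default_rng` and the seed value are my assumptions for repeatability; they are not part of the snippet above.

```python
import numpy as np

N = 3
rng = np.random.default_rng(42)  # seeded generator: an assumption, for repeatable runs

# Each Dirichlet sample (a row here) sums to 1;
# transposing moves that property onto the columns.
mtx = rng.dirichlet(np.ones(N), size=N).T

col_sums = mtx.sum(axis=0)
print(col_sums)

# Up to floating-point rounding, every column should sum to 1.
assert np.allclose(col_sums, 1.0)
```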
UPDATE
For the "sample something and normalize" approach, the problem is that the resulting values come from an unknown distribution. For the Dirichlet there are expressions for the mean, std. dev., PDF, CDF, you name it.
Even in the simple case where Xi is sampled from U(0,1), what is the distribution of Xi/Sum(i, Xi)?
Is there anything to say about its mean? Std. dev.? PDF? Other statistical properties?
You could sample from an exponential distribution and normalize the sum to 1, but the question becomes even more acute: if Xi is Exp(1), what is the distribution of Xi/Sum(i, Xi)? PDF? Mean? Std. dev.?
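One way to probe these questions empirically is to simulate. The sketch below estimates the mean and std. dev. of the first component for three constructions: a direct Dirichlet draw with all alphas equal to 1, normalized U(0,1) draws, and normalized Exp(1) draws. The sample count, the seed, and the helper name `normalize_rows` are assumptions made for illustration. For what it's worth, normalizing i.i.d. Exp(1) draws is a standard construction of the Dirichlet with all alphas equal to 1, so those two rows should agree up to sampling noise, while the normalized uniform case is a different distribution.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for repeatability only
N = 3                            # vector length, as in the answer above
SAMPLES = 100_000                # number of simulated vectors (an assumption)


def normalize_rows(x):
    """Divide each row by its own sum so that every row sums to 1."""
    return x / x.sum(axis=1, keepdims=True)


draws = {
    "Dirichlet(1,...,1)": rng.dirichlet(np.ones(N), size=SAMPLES),
    "U(0,1) normalized": normalize_rows(rng.random((SAMPLES, N))),
    "Exp(1) normalized": normalize_rows(rng.exponential(1.0, size=(SAMPLES, N))),
}

# Empirical mean and std. dev. of the first component under each construction.
stats = {}
for name, x in draws.items():
    first = x[:, 0]
    stats[name] = (first.mean(), first.std())
    print(f"{name:>20}: mean={stats[name][0]:.4f}  std={stats[name][1]:.4f}")
```

By symmetry the mean is 1/N in all three cases; the spread is where the constructions differ.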
Upvotes: 1
Reputation: 46
(EDIT: added output)
I suggest completing this in two steps:
1. Create a random matrix
2. Normalize each column
1. Create random matrix
Let's say you want a 3 by 3 random transition matrix:
M = np.random.rand(3, 3)
Each of M's entries will have a random value between 0 and 1.
2. Normalize M's columns
Dividing each column by its column sum achieves what you want. This can be done in several ways, but I prefer to create an array r whose elements are the column sums of M:
r = M.sum(axis=0)
Then, divide M by r:
transition_matrix = M / r
Example output
>>> import numpy as np
>>> M = np.random.rand(3, 3)
>>> r = M.sum(axis=0)
>>> transition_matrix = M / r
>>> M
array([[0.74145687, 0.68389986, 0.37008102],
[0.81869654, 0.0394523 , 0.94880781],
[0.93057194, 0.48279246, 0.15581823]])
>>> r
array([2.49072535, 1.20614462, 1.47470706])
>>> transition_matrix
array([[0.29768713, 0.56701315, 0.25095223],
[0.32869804, 0.03270943, 0.64338731],
[0.37361483, 0.40027743, 0.10566046]])
>>> transition_matrix.sum(axis=0)
array([1., 1., 1.])
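The two steps can also be wrapped in a small helper, sketched below. The function name `random_column_stochastic` and the optional `rng` argument are hypothetical, introduced only for illustration. The division works because NumPy broadcasts the length-n array of column sums across the rows, so column j is divided by r[j]. If you ever need rows that sum to 1 instead, summing over axis=1 with keepdims=True gives the matching shape.

```python
import numpy as np


def random_column_stochastic(n, rng=None):
    """Return an n x n matrix of non-negative entries whose columns each sum to 1.

    Hypothetical helper: the name and the `rng` parameter are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    M = rng.random((n, n))   # step 1: entries uniform on [0, 1)
    r = M.sum(axis=0)        # step 2: one sum per column, shape (n,)
    return M / r             # broadcasting divides column j by r[j]


rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
T = random_column_stochastic(4, rng)
print(T.sum(axis=0))  # column sums

# Row-stochastic variant: keepdims=True keeps shape (n, 1),
# so each row is divided by its own sum.
M2 = rng.random((4, 4))
R = M2 / M2.sum(axis=1, keepdims=True)
print(R.sum(axis=1))  # row sums
```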
Upvotes: 3