Reputation: 415
Is there a quick way to go from the following three lists to a covariance matrix in Python (as a NumPy array)?
Fac2 Fac1 VarCovar
a a 1.4
a b 0.7
a c 0.3
b a 0.7
b b 1.8
b c 6.3
c a 0.3
c b 6.3
c c 2.4
Upvotes: 3
Views: 1251
Reputation: 176810
You can create the 3x3 matrix easily using pandas. Create a DataFrame df from the lists above, then use pivot_table to spread the third column into a table indexed by the first two.
For example, if you have the following dictionary d of lists:
d = {'Fac1': ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c'],
     'Fac2': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
     'VarCovar': [1.4, 0.7, 0.3, 0.7, 1.8, 6.3, 0.3, 6.3, 2.4]}
Create the DataFrame like this:
df = pd.DataFrame(d)
And then:
>>> df.pivot_table(index='Fac1', columns='Fac2', values='VarCovar')
Fac2    a    b    c
Fac1
a     1.4  0.7  0.3
b     0.7  1.8  6.3
c     0.3  6.3  2.4
Using the values attribute on the end returns a NumPy array from the table:
>>> df.pivot_table(index='Fac1', columns='Fac2', values='VarCovar').values
array([[1.4, 0.7, 0.3],
       [0.7, 1.8, 6.3],
       [0.3, 6.3, 2.4]])
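Put together as a script, the steps above might look like the following. This is a minimal sketch, assuming a reasonably recent pandas in which pivot_table takes index= and columns= and DataFrame has a to_numpy() method; the names table, cov and labels are just illustrative.

```python
import numpy as np
import pandas as pd

# The three lists from the question, keyed by column name.
d = {
    "Fac1": ["a", "b", "c", "a", "b", "c", "a", "b", "c"],
    "Fac2": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "VarCovar": [1.4, 0.7, 0.3, 0.7, 1.8, 6.3, 0.3, 6.3, 2.4],
}
df = pd.DataFrame(d)

# One row per Fac1 label, one column per Fac2 label.
# pivot_table averages duplicates by default; each pair appears once here,
# so the values pass through unchanged.
table = df.pivot_table(index="Fac1", columns="Fac2", values="VarCovar")

cov = table.to_numpy()      # plain 2-D NumPy array
labels = list(table.index)  # row/column order of the matrix

print(labels)
print(cov)
```

Keeping labels around is optional, but it records which factor each row and column of the array corresponds to once the pandas index is gone.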
If you don't have all pairs (for example, only one of each symmetric pair is listed), you can proceed in the same way and fill in each missing value from its transposed index pair:
>>> d = {'Fac1': ['a', 'b', 'c', 'b', 'c', 'c'],
...      'Fac2': ['a', 'a', 'a', 'b', 'b', 'c'],
...      'VarCovar': [1.4, 0.7, 0.3, 1.8, 6.3, 2.4]}
>>> df = pd.DataFrame(d)
>>> table = df.pivot_table(index='Fac1', columns='Fac2', values='VarCovar')
>>> table.combine_first(table.T)
Fac2    a    b    c
Fac1
a     1.4  0.7  0.3
b     0.7  1.8  6.3
c     0.3  6.3  2.4
(I took the idea of using combine_first from DSM's answer here.)
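If this comes up often, the partial-pairs case could be wrapped in a small helper. The sketch below is one possible way to do it, not a definitive implementation: cov_from_pairs is a hypothetical name, and the reindex step is an added assumption so that a label appearing in only one of the two factor lists still gets both a row and a column.

```python
import numpy as np
import pandas as pd

def cov_from_pairs(fac1, fac2, values):
    """Build a symmetric matrix from (possibly one-sided) label pairs.

    Hypothetical helper: returns the matrix as a NumPy array together
    with the label order used for its rows and columns.
    """
    df = pd.DataFrame({"Fac1": fac1, "Fac2": fac2, "VarCovar": values})
    table = df.pivot_table(index="Fac1", columns="Fac2", values="VarCovar")

    # Make the table square over every label seen on either axis.
    labels = table.index.union(table.columns)
    table = table.reindex(index=labels, columns=labels)

    # Fill each missing (i, j) entry from (j, i).
    full = table.combine_first(table.T)
    return full.to_numpy(), list(labels)

# Only the lower triangle is supplied here.
cov, labels = cov_from_pairs(
    ["a", "b", "c", "b", "c", "c"],
    ["a", "a", "a", "b", "b", "c"],
    [1.4, 0.7, 0.3, 1.8, 6.3, 2.4],
)
print(labels)
print(cov)
```

Any pair that is missing in both orientations stays NaN in the result, so it is worth checking np.isnan(cov).any() before using the matrix.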
Upvotes: 5