LondonRob

Reputation: 78723

Diagonalising a Pandas series

I'm doing some matrix algebra using the very lovely pandas library in Python. I'm really enjoying using the Series and DataFrame objects because of the ability to name rows and columns.

But is there a neat way to diagonalise a Series while maintaining row/column names?

Consider this minimal working example:

>>> import pandas as pd
>>> from numpy.random import randn
>>> s = pd.Series(randn(5), index=['a', 'b', 'c', 'd', 'e'])
>>> s
a    0.137477
b   -0.606762
c    0.085030
d   -0.571760
e   -0.475104
dtype: float64

Now, I can do:

>>> import numpy as np
>>> np.diag(s)
array([[ 0.13747693,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        , -0.60676226,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.08502993,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        , -0.57176048,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        , -0.47510435]])

But I'd love to find a way of producing a DataFrame that looks like:

          a         b        c        d         e
0  0.137477  0.000000  0.00000  0.00000  0.000000
1  0.000000 -0.606762  0.00000  0.00000  0.000000
2  0.000000  0.000000  0.08503  0.00000  0.000000
3  0.000000  0.000000  0.00000 -0.57176  0.000000
4  0.000000  0.000000  0.00000  0.00000 -0.475104

or, better still, with the names on the rows as well:

          a         b        c        d         e
a  0.137477  0.000000  0.00000  0.00000  0.000000
b  0.000000 -0.606762  0.00000  0.00000  0.000000
c  0.000000  0.000000  0.08503  0.00000  0.000000
d  0.000000  0.000000  0.00000 -0.57176  0.000000
e  0.000000  0.000000  0.00000  0.00000 -0.475104

This would be great because then, with S being that diagonal DataFrame, I could do matrix operations like:

>>> S.dot(s)
a    0.018900
b    0.368160
c    0.007230
d    0.326910
e    0.225724
dtype: float64

and retain the names.

Many thanks in advance, as always. Rob

Upvotes: 5

Views: 4302

Answers (1)

Jeff

Reputation: 128948

How about this (note that my s holds different random values from yours):

In [107]: pd.DataFrame(np.diag(s),index=s.index,columns=s.index)
Out[107]: 
          a         b         c         d         e
a  0.630529  0.000000  0.000000  0.000000  0.000000
b  0.000000  0.360884  0.000000  0.000000  0.000000
c  0.000000  0.000000  0.345719  0.000000  0.000000
d  0.000000  0.000000  0.000000  0.796625  0.000000
e  0.000000  0.000000  0.000000  0.000000 -0.176848
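
For anyone who wants to check this by hand, here is a minimal self-contained sketch of the same construction followed by the dot product from the question. The values in `s` are made up for illustration (they are not the random numbers shown above), and the names `S` and `result` are just placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical values chosen so the result is easy to verify by hand.
s = pd.Series([0.5, -2.0, 3.0], index=["a", "b", "c"])

# np.diag returns a plain 2-D array with s on the diagonal and no labels;
# to_numpy() makes the conversion from Series to array explicit.
# Passing s.index for both index and columns puts the names back.
S = pd.DataFrame(np.diag(s.to_numpy()), index=s.index, columns=s.index)

# DataFrame.dot matches S's column labels against s's index labels,
# and the resulting Series carries S's row labels.
result = S.dot(s)
print(result)
```

Since `S` is diagonal, each entry of `result` is just the square of the matching entry of `s`, and it keeps the labels a, b, c in their original order.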

Upvotes: 7
