Fill the diagonal of Pandas DataFrame with elements from Pandas Series

Given a pandas Series with an index:

import pandas as pd

s = pd.Series(data=[1,2,3],index=['a','b','c'])

How can a Series be used to fill the diagonal entries of an empty DataFrame in pandas version >= 0.23.0?

The resulting DataFrame would look like:

  a b c
a 1 0 0
b 0 2 0
c 0 0 3

There is a similar prior question about filling the diagonal with a single value; my question is about filling the diagonal with varying values from a Series.

Thank you in advance for your consideration and response.

Upvotes: 11

Views: 12106

Answers (2)

Engineero

Reputation: 12908

I'm not sure how to do this directly in pandas, but it is easy enough if you don't mind using numpy.diag() to build the diagonal matrix from your Series and then passing that to the DataFrame constructor:

import numpy as np

diag_data = np.diag(s)  # the Series can be passed directly; s.as_matrix() is not needed
df = pd.DataFrame(diag_data, index=s.index, columns=s.index)

   a  b  c
a  1  0  0
b  0  2  0
c  0  0  3

In one line:

df = pd.DataFrame(np.diag(s),
                  index=s.index,
                  columns=s.index)

Timing comparison with a Series made from a random array of 10000 elements:

s = pd.Series(np.random.rand(10000), index=np.arange(10000))

df = pd.DataFrame(np.diag(s), ...)
173 ms ± 2.91 ms per loop (mean ± std. dev. of 7 runs, 20 loops each)

df = pd.DataFrame(0, ...)
np.fill_diagonal(df.values, s)
212 ms ± 909 µs per loop (mean ± std. dev. of 7 runs, 20 loops each)

mat = np.zeros(...)
np.fill_diagonal(mat, s)
df = pd.DataFrame(mat, ...)
175 ms ± 3.72 ms per loop (mean ± std. dev. of 7 runs, 20 loops each)

It looks like the first and third options take essentially the same time, while the middle option is the slowest.
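To reproduce a comparison like this outside IPython, here is a minimal sketch using the standard timeit module. The helper names build_with_diag and build_with_fill are made up for illustration, and the size n = 1000 is an assumption chosen to keep the run short; it is not the setup that produced the numbers above, so the timings will differ.

```python
import timeit

import numpy as np
import pandas as pd

n = 1000  # assumed size, smaller than the 10000 used in the timings above
s = pd.Series(np.random.rand(n), index=np.arange(n))


def build_with_diag(s):
    # Hypothetical helper: build the diagonal matrix with np.diag.
    return pd.DataFrame(np.diag(s.to_numpy()), index=s.index, columns=s.index)


def build_with_fill(s):
    # Hypothetical helper: fill a zero matrix, then wrap it in a DataFrame.
    mat = np.zeros((len(s), len(s)), dtype=s.dtype)
    np.fill_diagonal(mat, s.to_numpy())
    return pd.DataFrame(mat, index=s.index, columns=s.index)


for fn in (build_with_diag, build_with_fill):
    # Average over a few calls; results vary by machine and library version.
    seconds = timeit.timeit(lambda: fn(s), number=5) / 5
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```

The variant that writes through df.values is left out here because it only works when df.values returns a writable view of the DataFrame's data.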

Upvotes: 6

jezrael

Reputation: 862581

First create a DataFrame filled with zeros, then set its diagonal with numpy.fill_diagonal:

import numpy as np
import pandas as pd

s = pd.Series(data=[1,2,3],index=['a','b','c'])

df = pd.DataFrame(0, index=s.index, columns=s.index, dtype=s.dtype)

np.fill_diagonal(df.values, s)
print(df)
   a  b  c
a  1  0  0
b  0  2  0
c  0  0  3
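One caveat: writing through df.values only changes the DataFrame when df.values is a writable view of the underlying data. With a single-dtype frame on older pandas that is the case, but with Copy-on-Write enabled the array is read-only and the call raises an error. A rough sketch that guards against both outcomes, assuming the same s as above (the fallback logic is my own illustration, not part of the original approach):

```python
import numpy as np
import pandas as pd

s = pd.Series(data=[1, 2, 3], index=['a', 'b', 'c'])
df = pd.DataFrame(0, index=s.index, columns=s.index, dtype=s.dtype)

try:
    # Works in place only if df.values is a writable view of the data.
    np.fill_diagonal(df.values, s.to_numpy())
except ValueError:
    # Read-only array (e.g. Copy-on-Write): handled by the fallback below.
    pass

# If the diagonal was not written, rebuild the frame from a filled copy.
if not np.array_equal(np.diag(df.to_numpy()), s.to_numpy()):
    arr = df.to_numpy(copy=True)
    np.fill_diagonal(arr, s.to_numpy())
    df = pd.DataFrame(arr, index=df.index, columns=df.columns)

print(df)
```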

Another solution is to create a zero-filled 2D array, fill its diagonal with the Series values, and finally pass it to the DataFrame constructor:

arr = np.zeros((len(s), len(s)), dtype=s.dtype)
np.fill_diagonal(arr, s)

print(arr)
[[1 0 0]
 [0 2 0]
 [0 0 3]]

df = pd.DataFrame(arr, index=s.index, columns=s.index)
print(df)
   a  b  c
a  1  0  0
b  0  2  0
c  0  0  3
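If you need this in several places, the array-first steps can be wrapped in a small function. This is only a sketch: diag_frame is a hypothetical name, and it assumes the Series has a plain NumPy dtype (int, float, ...) so that np.zeros accepts s.dtype.

```python
import numpy as np
import pandas as pd


def diag_frame(s):
    """Return a square DataFrame with s on the diagonal and zeros elsewhere.

    Hypothetical helper; assumes s has a NumPy dtype such as int64 or float64.
    """
    arr = np.zeros((len(s), len(s)), dtype=s.dtype)
    np.fill_diagonal(arr, s.to_numpy())
    # The Series index labels both the rows and the columns.
    return pd.DataFrame(arr, index=s.index, columns=s.index)


s = pd.Series(data=[1, 2, 3], index=['a', 'b', 'c'])
df = diag_frame(s)
print(df)
```

Because the array is filled before the DataFrame is built, this does not depend on whether df.values is writable.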

Upvotes: 17
