John Zwinck

Reputation: 249133

Pandas DataFrame from MultiIndex and NumPy structured array (recarray)

First I create a two-level MultiIndex:

import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_product([('X','Y'), ('a','b')])

I can use it like this:

pd.DataFrame(np.zeros((3,4)), columns=ind)

Which gives:

     X         Y     
     a    b    a    b
0  0.0  0.0  0.0  0.0
1  0.0  0.0  0.0  0.0
2  0.0  0.0  0.0  0.0

Now I'm trying to do the same with a structured array:

dtype = [('Xa','f8'), ('Xb','i4'), ('Ya','f8'), ('Yb','i4')]
pd.DataFrame(np.zeros(3, dtype), columns=ind)

But that gives:

Empty DataFrame
Columns: [(X, a), (X, b), (Y, a), (Y, b)]
Index: []

I expected something like the previous result, with three rows.

Perhaps more generally, what I want to do is to generate a Pandas DataFrame with MultiIndex columns where the columns have distinct types (as in the example, a is float but b is int).

Upvotes: 2

Views: 497

Answers (2)

piRSquared

Reputation: 294238

pd.DataFrame(np.zeros(3, dtype), columns=ind)

Empty DataFrame
Columns: [(X, a), (X, b), (Y, a), (Y, b)]
Index: []

is just the textual representation of the DataFrame, and

Columns: [(X, a), (X, b), (Y, a), (Y, b)]

is the text representation of its column index.

If you instead run:

df = pd.DataFrame(np.zeros(3, dtype), columns=ind)

print(type(df.columns))

<class 'pandas.indexes.multi.MultiIndex'>

You can see that it is indeed a pd.MultiIndex.

With that out of the way, what I don't understand is why passing the MultiIndex as columns to the DataFrame constructor removes the values.

A workaround is to build the DataFrame first and assign the columns afterwards:

df = pd.DataFrame(np.zeros(3, dtype))

df.columns = ind

print(df)

     X       Y   
     a  b    a  b
0  0.0  0  0.0  0
1  0.0  0  0.0  0
2  0.0  0  0.0  0
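To check that this workaround keeps the per-column types the question asks for, here is a small sketch that reuses the question's `ind` and `dtype` and inspects the result. It only confirms the dtypes after the columns are reassigned; it does not explain why the constructor call comes back empty.

```python
import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_product([('X', 'Y'), ('a', 'b')])
dtype = [('Xa', 'f8'), ('Xb', 'i4'), ('Ya', 'f8'), ('Yb', 'i4')]

# Build from the structured array first; columns come from the field names.
df = pd.DataFrame(np.zeros(3, dtype))

# Then replace the flat field names with the two-level MultiIndex.
df.columns = ind

# Each column should keep the dtype of its source field.
print(df.dtypes)
```

The 'a' columns should report a float dtype and the 'b' columns an integer dtype, matching the fields of the structured array.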

Upvotes: 1

Andy Hayden

Reputation: 375445

This looks like a bug, and it is worth reporting as an issue on GitHub.

A workaround is to set the columns manually after construction:

In [11]: df1 = pd.DataFrame(np.zeros(3, dtype))

In [12]: df1.columns = ind

In [13]: df1
Out[13]:
     X       Y
     a  b    a  b
0  0.0  0  0.0  0
1  0.0  0  0.0  0
2  0.0  0  0.0  0

Upvotes: 2
