Marco Di Gennaro
Marco Di Gennaro

Reputation: 415

Python: concatenate pandas multiindex

I need to generate a pd.DataFrame with columns being composed by a list and a Multiindex object, and I need to do it before filling the final dataframe with data.

Say the columns are ['one', 'two'] and the multiindex obtained from from_product:

import pandas as pd

col_21 = ['day', 'month']
col_22 = ['a', 'b']

mult_2 =  pd.MultiIndex.from_product([ col_21, col_22 ])

I would like to get a list of columns which looks like this:

'one' | 'two' | ('day','a') | ('day','b') | ('month','a') | ('month','b')

One possible solution would be to use two different and separate Multiindex, one with a dummy column, both generate by from_product


col_11 = ['one', 'two']
col_12 = ['']
col_21 = ['day', 'month']
col_22 = ['a', 'b']

mult_1 =  pd.MultiIndex.from_product([ col_11, col_12 ])
mult_2 =  pd.MultiIndex.from_product([ col_21, col_22 ])

How could I get to this?

(one, '') | (two, '') | ('day','a') | ('day','b') | ('month','a') |  ('month','b')

I have tried several trivial solutions, but each gave me a different error or a wrong result

mult_1+mult_2 #TypeError: cannot perform __add__ with this index type: MultiIndex
pd.merge #TypeError: Can only merge Series or DataFrame objects, a <class 'list'> was passed
pd.MultiIndex.from_arrays([ mult_1, mult_2 ]) #NotImplementedError: isna is not defined for MultiIndex

Thank you for your advices

Upvotes: 2

Views: 2966

Answers (1)

jezrael
jezrael

Reputation: 862521

If possible, the best is create MultiIndex in index by columns one, two and then MultiIndex in columns by pairs - so not mixed non multiindex with multindex values:

col_21 = ['day', 'month']
col_22 = ['a', 'b']
mult_2 =  pd.MultiIndex.from_product([ col_21, col_22 ])

one = range(5)
two = list('ABCDE')
mult_3 =  pd.MultiIndex.from_arrays([ one, two], names=['one','two'])

df = pd.DataFrame(0, columns=mult_2, index=mult_3)
print (df)
        day    month   
          a  b     a  b
one two                
0   A     0  0     0  0
1   B     0  0     0  0
2   C     0  0     0  0
3   D     0  0     0  0
4   E     0  0     0  0

Use Index.append:

print (mult_1.append(mult_2))

MultiIndex([(  'one',  ''),
            (  'two',  ''),
            (  'day', 'a'),
            (  'day', 'b'),
            ('month', 'a'),
            ('month', 'b')],
           )

Or Index.union with sort=False:

print (mult_1.union(mult_2, sort=False))

MultiIndex([(  'one',  ''),
            (  'two',  ''),
            (  'day', 'a'),
            (  'day', 'b'),
            ('month', 'a'),
            ('month', 'b')],
           )

Upvotes: 1

Related Questions