Reputation:

Pandas: add a column to a multiindex column dataframe

I would like to add a column to the second level of a multiindex column dataframe.

In [151]: df
Out[151]: 
first        bar                 baz           
second       one       two       one       two 
A       0.487880 -0.487661 -1.030176  0.100813 
B       0.267913  1.918923  0.132791  0.178503
C       1.550526 -0.312235 -1.177689 -0.081596

The usual trick of direct assignment does not work:

In [152]: df['bar']['three'] = [0, 1, 2]

In [153]: df
Out[153]: 
first        bar                 baz           
second       one       two       one       two 
A       0.487880 -0.487661 -1.030176  0.100813
B       0.267913  1.918923  0.132791  0.178503
C       1.550526 -0.312235 -1.177689 -0.081596

How can I add the third row to under "bar"?

Upvotes: 112

Answers (4)

Ynjxsjmh

Reputation: 30032

If you want to add multiple columns to a multiindex column dataframe, you can try

All same value for columns

df[[("foo", "bar1"), ("foo", "bar2")]] = 2

        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    2    2
1  0.267913  1.918923  0.132791  0.178503    2    2
2  1.550526 -0.312235 -1.177689 -0.081596    2    2

Same value for each column

df[[("foo", "bar1"), ("foo", "bar2")]] = [2, 3]

        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    2    3
1  0.267913  1.918923  0.132791  0.178503    2    3
2  1.550526 -0.312235 -1.177689 -0.081596    2    3

Same value for each row

df[[("foo", "bar1"), ("foo", "bar2")]] = np.array([range(len(df)) for _ in range(2)]).T

        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    0    0
1  0.267913  1.918923  0.132791  0.178503    1    1
2  1.550526 -0.312235 -1.177689 -0.081596    2    2

Different value for each cell

df[[("foo", "bar1"), ("foo", "bar2")]] = [[1, 2],
                                          [3, 4],
                                          [5, 6]] # shape is (3, 2) where 3 is index length and 2 is new added column length

        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    1    2
1  0.267913  1.918923  0.132791  0.178503    3    4
2  1.550526 -0.312235 -1.177689 -0.081596    5    6

Another usecase is that we have a single index dataframe, and we want to concat it to the multi index dataframe

        bar                 baz
       one       two       one       two     concat to      bar1  bar2
0  0.487880 -0.487661 -1.030176  0.100813   <---------  0     1     2
1  0.267913  1.918923  0.132791  0.178503               1     3     4
2  1.550526 -0.312235 -1.177689 -0.081596               2     5     6

Generate a list of tuples for columns

df[[("foo", col) for col in single_index_df.columns]] = single_index_df

        bar                 baz            foo
        one       two       one       two bar1 bar2
0  0.487880 -0.487661 -1.030176  0.100813    1    2
1  0.267913  1.918923  0.132791  0.178503    3    4
2  1.550526 -0.312235 -1.177689 -0.081596    5    6

Create a new multi index columns dataframe from the single index dataframe as Option 2 of MaxU - stop genocide of UA

df = df.join(pd.DataFrame(single_index_df.values,
                          columns=pd.MultiIndex.from_product([['foo'], single_index_df.columns]),
                          index=single_index_df.index))

Create a multi index dataframe from single index dataframe with pd.concat({'foo': single_index_df}, axis=1)

df = pd.concat([df, pd.concat({'foo': single_index_df}, axis=1)], axis=1)
# or
df = df.join(pd.concat({'foo': single_index_df}, axis=1))

Upvotes: 5

MaxU - stand with Ukraine

Reputation: 210842

If we want to add a multi-level column:

Source DF:

In [221]: df
Out[221]:
first        bar                 baz
second       one       two       one       two
A      -1.089798  2.053026  0.470218  1.440740
B       0.488875  0.428836  1.413451 -0.683677
C      -0.243064 -0.069446 -0.911166  0.478370

Option 1: adding result of division: bar / baz as a new foo column

In [222]: df = df.join(
     ...:     df[['bar']].div(df['baz']).rename(columns={'bar':'foo'}))

In [223]: df
Out[223]:
first        bar                 baz                 foo
second       one       two       one       two       one       two
A      -1.089798  2.053026  0.470218  1.440740 -2.317647  1.424980
B       0.488875  0.428836  1.413451 -0.683677  0.345873 -0.627250
C      -0.243064 -0.069446 -0.911166  0.478370  0.266761 -0.145172

Option 2: adding multi-level column with three "sub-columns":

In [235]: df = df.join(pd.DataFrame(
     ...:     np.random.rand(3,3),
     ...:     columns=pd.MultiIndex.from_product([['new'], ['one','two','three']]),
     ...:     index=df.index))

In [236]: df
Out[236]:
first        bar                 baz                 new
second       one       two       one       two       one       two     three
A      -1.089798  2.053026  0.470218  1.440740  0.274291  0.636257  0.091048
B       0.488875  0.428836  1.413451 -0.683677  0.668157  0.456931  0.227568
C      -0.243064 -0.069446 -0.911166  0.478370  0.333824  0.363060  0.949672

Upvotes: 28

dopexxx

Reputation: 2636

If you want to insert (instead of append at the end of the DF) do this:

df.insert(0, ('bar', 'three'), [0, 1, 2])

The second item has to be hashable, so a list will not work.

Upvotes: 1

spencerlyon2

Reputation: 9676

It's actually pretty simple (FWIW, I originally thought to do it your way):

df['bar', 'three'] = [0, 1, 2]
df = df.sort_index(axis=1)
print(df)

        bar                        baz          
        one       two  three       one       two
A -0.212901  0.503615      0 -1.660945  0.446778
B -0.803926 -0.417570      1 -0.336827  0.989343
C  3.400885 -0.214245      2  0.895745  1.011671

Upvotes: 127

Pandas: add a column to a multiindex column dataframe

Answers (4)

Related Questions