Temp_coder
Temp_coder

Reputation: 287

Reduce multi-level columns into single level by deleting certain levels in a dataframe

I have a dataframe like below. data =

SAMPLE   MATERIAL   MAT_WEIGHT
SAM1     MAT1       15
SAM1     MAT3       35
SAM1     MAT1       45
SAM2     MAT2       55
SAM2     MAT2       05

The real dataframe is much longer.

I applied pivot table like below.

table = (pd.pivot_table(data, index = ['SAMPLE'], 
columns = ['MATERIAL'], values = ['MAT_WEIGHT'], 
aggfunc = {'MAT_WEIGHT':np.sum})).fillna(0)

and got the table like below. table =

       MAT_WEIGHT

MATERIAL    MAT1    MAT2    MAT3    MAT4                                                                            

SAMPLE

SAM1        100     50      80     30
SAM2        80      50      120    25
SAM3        78      58      65     25

I want to reduce the table to a single level column from multiple levels. The final dataframe that I want looks like below.

table_final =

SAMPLE     MAT1     MAT2    MAT3    MAT4                                                                            
SAM1        100     50      80     30
SAM2        80      50      120    25
SAM3        78      58      65     25

How to do this? (The numerical values given here are for visualization purpose only and not exact.)

Upvotes: 1

Views: 513

Answers (2)

Sunday Ikpe
Sunday Ikpe

Reputation: 974

This would solve it:

table_final = table.droplevel(0, axis=1)

With your final table looking like:

           MAT1     MAT2    MAT3    MAT4    
SAMPLE                                                                                
SAM1        100     50      80     30
SAM2        80      50      120    25
SAM3        78      58      65     25

Upvotes: 0

jezrael
jezrael

Reputation: 862731

Avoid using column names under lists to avoid MultiIndex from DataFrame.pivot_table, also add fill_value parameter:

table = pd.pivot_table(data, 
                       index ='SAMPLE', 
                       columns = 'MATERIAL', 
                       values = 'MAT_WEIGHT', 
                       aggfunc = np.sum,
                       fill_value=0)

print (table)
MATERIAL  MAT1  MAT2  MAT3
SAMPLE                    
SAM1        60     0    35
SAM2         0    60     0

Last for index to column and remove column name is possible use DataFrame.reset_index and DataFrame.rename_axis:

table = table.reset_index().rename_axis(None, axis=1)

Upvotes: 3

Related Questions