MultiIndex DataFrame to pivot table with a new column

Question

I have data with a MultiIndex and want to convert it to a pivot table that summarizes the values into columns. The data are:

import random
import pandas as pd
arrays = [[2,2,3,3,3,4,4,4,4,5,5,7,7],
      [1,2,1,2,3,1,2,3,4,1,3,1,4]]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names = ['first','second'])
data = pd.Series(random.sample(range(1,100),13), index = index)
data

first  second
2      1         28
       2         20
3      1          7
       2          6
       3         86
4      1         10
       2         30
       3          8
       4         44
5      1         74
       3         65
7      1         12
       4         72
dtype: int64

I want to convert it to the following, where each cell is the sum of the matching values:

      second==1    second > 1
first
2      28          20
3      7           92
4      10          82
5      74          65
7      12          72

Is there an elegant way of doing this?

Thanks!

su79eu7k · Accepted Answer

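Relabel the second index level, then restructure the data with groupby. get_level_values() returns the per-row values of a level, which you can map to the new labels. Note that set_levels() expects one label per unique level value, not one per row, so passing it the mapped per-row values mislabels rows (or raises an error in recent pandas). Rebuilding the index with MultiIndex.from_arrays() avoids that.

# Map each row's 'second' value to one of two labels.
labels = data.index.get_level_values(1).map(lambda x: 'second = 1' if x == 1 else 'second > 1')
# Rebuild the index with the relabeled level, then sum per (first, label) and pivot.
data.index = pd.MultiIndex.from_arrays([data.index.get_level_values(0), labels], names=data.index.names)
print(data.groupby(level=[0, 1]).sum().unstack(fill_value=0))

With the values shown in the question, this prints:

second  second = 1  second > 1
first
2               28          20
3                7          92
4               10          82
5               74          65
7               12          72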
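
For a reproducible check, here is a minimal sketch of the same idea written with pivot_table. It is not the answerer's code. It assumes the values printed in the question are hardcoded in place of random.sample, and the names `value`, `bucket`, `df`, and `result` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Values copied from the question's printed output, hardcoded so the
# result is reproducible (the original used random.sample).
arrays = [[2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 7, 7],
          [1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 3, 1, 4]]
values = [28, 20, 7, 6, 86, 10, 30, 8, 44, 74, 65, 12, 72]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
data = pd.Series(values, index=index, name='value')

# Flatten the index into ordinary columns, then bucket `second`
# into two labels (hypothetical column name: 'bucket').
df = data.reset_index()
df['bucket'] = np.where(df['second'] == 1, 'second == 1', 'second > 1')

# Sum the values per (first, bucket); missing combinations become 0.
result = df.pivot_table(index='first', columns='bucket', values='value',
                        aggfunc='sum', fill_value=0)
print(result)
```

Working on flat columns rather than relabeling the index in place leaves the original `data` untouched, which makes it easier to try other bucketing rules.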
