Peaceful

Reputation: 5450

Use groupby keys as indexes of pandas dataframe

I have the following pandas dataframe df:

                    % Renewable  Energy Supply
Country                                       
China                 19.754910   1.271910e+11
United States         11.570980   9.083800e+10
Japan                 10.232820   1.898400e+10
United Kingdom        10.600470   7.920000e+09
Russian Federation    17.288680   3.070900e+10
Canada                61.945430   1.043100e+10
Germany               17.901530   1.326100e+10
India                 14.969080   3.319500e+10
France                17.020280   1.059700e+10
South Korea            2.279353   1.100700e+10
Italy                 33.667230   6.530000e+09
Spain                 37.968590   4.923000e+09
Iran                   5.707721   9.172000e+09
Australia             11.810810   5.386000e+09
Brazil                69.648030   1.214900e+10

I am grouping this dataframe by the continent each country belongs to and by the bins that pd.cut produces for the % Renewable column:

out, bins = pd.cut(Top15['% Renewable'].values, bins = 5, retbins = True)
grp = Top15.groupby(by = [ContinentDict, out])

where,

ContinentDict  = {'China':'Asia', 
              'United States':'North America', 
              'Japan':'Asia', 
              'United Kingdom':'Europe', 
              'Russian Federation':'Europe', 
              'Canada':'North America', 
              'Germany':'Europe', 
              'India':'Asia',
              'France':'Europe', 
              'South Korea':'Asia', 
              'Italy':'Europe', 
              'Spain':'Europe', 
              'Iran':'Asia',
              'Australia':'Australia', 
              'Brazil':'South America'}   

Now I want to create a new dataframe with the same columns as df plus an additional 'Country' column. Its index should be hierarchical, built from the groupby keys ('Continent', 'out'). After hours of trial and error, I see no way to do this. Any ideas?

Upvotes: 3

Views: 2717

Answers (1)

akuiper

Reputation: 214957

You can build a MultiIndex from the continent labels and the pd.cut bins and assign it back to your data frame:

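import pandas as pd

# Bin '% Renewable' into 5 equal-width intervals
out, bins = pd.cut(Top15['% Renewable'].values, bins = 5, retbins = True)
# Look up each country's continent while 'Country' is still the index
con = Top15.index.to_series().map(ContinentDict).values

# Move 'Country' into a regular column, then index by (continent, bin)
Top15.reset_index(inplace=True)
Top15.index = pd.MultiIndex.from_arrays([con, out])
Top15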
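
If you also want the index levels to carry the names 'Continent' and 'out', one possible variant is to add both as columns and then call set_index. This is only a sketch: it uses a five-row subset of the question's data so that it runs on its own (so the bin edges differ from those of the full table), and the names `continent` and `result` are chosen here for illustration.

```python
import pandas as pd

# Five rows copied from the question's table; 'Country' is the index.
Top15 = pd.DataFrame(
    {
        "% Renewable": [19.754910, 11.570980, 10.232820, 61.945430, 69.648030],
        "Energy Supply": [1.271910e11, 9.083800e10, 1.898400e10,
                          1.043100e10, 1.214900e10],
    },
    index=pd.Index(
        ["China", "United States", "Japan", "Canada", "Brazil"], name="Country"
    ),
)

# Matching subset of the question's ContinentDict.
ContinentDict = {
    "China": "Asia",
    "United States": "North America",
    "Japan": "Asia",
    "Canada": "North America",
    "Brazil": "South America",
}

# Bin '% Renewable' into 5 equal-width intervals (one interval per row).
out = pd.cut(Top15["% Renewable"], bins=5)

# Map each country (the current index) to its continent.
continent = Top15.index.to_series().map(ContinentDict)

# Turn 'Country' into a column, attach both keys, and use them as a
# two-level index named ('Continent', 'out').
result = (
    Top15.reset_index()
    .assign(Continent=continent.values, out=out.values)
    .set_index(["Continent", "out"])
)
print(result)
```

Because the keys are added with .values, they are matched to rows by position, so this assumes the row order is unchanged between computing the keys and calling assign.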


Upvotes: 1
