Reputation: 809
I have the following dictionary:
ContinentDict = {'China':'Asia',
'United States':'North America',
'Japan':'Asia',
'United Kingdom':'Europe',
'Russian Federation':'Europe',
'Canada':'North America',
'Germany':'Europe',
'India':'Asia',
'France':'Europe',
'South Korea':'Asia',
'Italy':'Europe',
'Spain':'Europe',
'Iran':'Asia',
'Australia':'Australia',
'Brazil':'South America'}
I have binned the countries in this dictionary (keys) into continents (values).
from collections import defaultdict
dictionary = defaultdict(list)
for key, value in ContinentDict.items():
    dictionary[value].append(key)
This has given me:
dictionary
defaultdict(<class 'list'>, {'Asia': ['China', 'Japan', 'India', 'South Korea', 'Iran'], 'North America': ['United States', 'Canada'], 'Europe': ['United Kingdom', 'Russian Federation', 'Germany', 'France', 'Italy', 'Spain'], 'Australia': ['Australia'], 'South America': ['Brazil']})
I also have the Pandas series Reducedset['estimate']:
Country
China 1.36765e+09
United States 3.17615e+08
Japan 1.27409e+08
United Kingdom 6.3871e+07
Russian Federation 1.435e+08
Canada 3.52399e+07
Germany 8.03697e+07
India 1.27673e+09
France 6.38373e+07
South Korea 4.98054e+07
Italy 5.99083e+07
Spain 4.64434e+07
Iran 7.70756e+07
Australia 2.3316e+07
Brazil 2.05915e+08
Name: estimate, dtype: object
I would like to create a hierarchical index from this dictionary, with the continent as the top of the hierarchy followed by the country.
I have tried the following:
totuple = dictionary.items()
index = pd.MultiIndex.from_tuples(index)
hierarchy = pop.reindex(index)
However, this has not worked.
Would anybody be able to give me a helping hand?
Upvotes: 1
Views: 170
Reputation: 862511
Create a list of (continent, country) tuples and pass it to MultiIndex.from_tuples:
t = [(k, x) for k, v in dictionary.items() for x in v]
index = pd.MultiIndex.from_tuples(t)
print (index)
MultiIndex([( 'Asia', 'China'),
( 'Asia', 'Japan'),
( 'Asia', 'India'),
( 'Asia', 'South Korea'),
( 'Asia', 'Iran'),
('North America', 'United States'),
('North America', 'Canada'),
( 'Europe', 'United Kingdom'),
( 'Europe', 'Russian Federation'),
( 'Europe', 'Germany'),
( 'Europe', 'France'),
( 'Europe', 'Italy'),
( 'Europe', 'Spain'),
( 'Australia', 'Australia'),
('South America', 'Brazil')],
)
And then:
Reducedset = Reducedset.reindex(index, level=1)
print (Reducedset)
                                      estimate
Asia          China               1.367650e+09
              Japan               1.274090e+08
              India               1.276730e+09
              South Korea         4.980540e+07
              Iran                7.707560e+07
North America United States       3.176150e+08
              Canada              3.523990e+07
Europe        United Kingdom      6.387100e+07
              Russian Federation  1.435000e+08
              Germany             8.036970e+07
              France              6.383730e+07
              Italy               5.990830e+07
              Spain               4.644340e+07
Australia     Australia           2.331600e+07
South America Brazil              2.059150e+08
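For reference, the attempt in the question fails because dictionary.items() yields one (continent, list of countries) pair per continent rather than one (continent, country) pair per row, and because, in the snippet as posted, index is passed to from_tuples before it is defined. Below is a minimal, self-contained sketch of the approach above on a four-country subset of the data. The names continent_of, by_continent, pairs and hierarchy, and the level name 'Continent', are my own choices for illustration and are not from the question.

```python
from collections import defaultdict

import pandas as pd

# Four-country subset of the question's data, to keep the sketch short.
continent_of = {
    'China': 'Asia',
    'Japan': 'Asia',
    'Canada': 'North America',
    'Brazil': 'South America',
}
df = pd.DataFrame({'estimate': {
    'China': 1367650000.0,
    'Japan': 127409000.0,
    'Canada': 35239900.0,
    'Brazil': 205915000.0,
}})

# Bin countries by continent, as in the question.
by_continent = defaultdict(list)
for country, continent in continent_of.items():
    by_continent[continent].append(country)

# .items() gives (continent, [countries]); flatten it to one
# (continent, country) tuple per row before building the index.
pairs = [
    (continent, country)
    for continent, countries in by_continent.items()
    for country in countries
]
index = pd.MultiIndex.from_tuples(pairs, names=['Continent', 'Country'])

# level=1 matches the existing country labels against the second
# level of the new index, so the values are carried over.
hierarchy = df.reindex(index, level=1)
print(hierarchy)
```

Without level=1, reindex would look for the full (continent, country) tuples in the old country-only index, find none, and fill the column with NaN.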
Another idea is to use map with the original dictionary, which avoids building the defaultdict:
ContinentDict = {'China':'Asia',
'United States':'North America',
'Japan':'Asia',
'United Kingdom':'Europe',
'Russian Federation':'Europe',
'Canada':'North America',
'Germany':'Europe',
'India':'Asia',
'France':'Europe',
'South Korea':'Asia',
'Italy':'Europe',
'Spain':'Europe',
'Iran':'Asia',
'Australia':'Australia',
'Brazil':'South America'}
d = {'estimate': {'China': 1367650000.0, 'United States': 317615000.0, 'Japan': 127409000.0, 'United Kingdom': 63871000.0, 'Russian Federation': 143500000.0, 'Canada': 35239900.0, 'Germany': 80369700.0, 'India': 1276730000.0, 'France': 63837300.0, 'South Korea': 49805400.0, 'Italy': 59908300.0, 'Spain': 46443400.0, 'Iran': 77075600.0, 'Australia': 23316000.0, 'Brazil': 205915000.0}}
Reducedset = pd.DataFrame(d)
idx = Reducedset.index.map(ContinentDict)
Reducedset.index = [idx, Reducedset.index]
Reducedset = Reducedset.sort_index()
print (Reducedset)
                                      estimate
Asia          China               1.367650e+09
              India               1.276730e+09
              Iran                7.707560e+07
              Japan               1.274090e+08
              South Korea         4.980540e+07
Australia     Australia           2.331600e+07
Europe        France              6.383730e+07
              Germany             8.036970e+07
              Italy               5.990830e+07
              Russian Federation  1.435000e+08
              Spain               4.644340e+07
              United Kingdom      6.387100e+07
North America Canada              3.523990e+07
              United States       3.176150e+08
South America Brazil              2.059150e+08
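A small variation on the map approach, in case named levels are useful: the sketch below builds the index with MultiIndex.from_arrays so that each level gets a name. It uses a four-country subset of the data. The level name 'Continent' and the variable names continent_of, df and asia are assumptions of mine ('Country' is the index name shown in the question's series).

```python
import pandas as pd

# Four-country subset of the question's data, to keep the sketch short.
continent_of = {
    'China': 'Asia',
    'Japan': 'Asia',
    'Canada': 'North America',
    'Brazil': 'South America',
}
df = pd.DataFrame({'estimate': {
    'China': 1367650000.0,
    'Japan': 127409000.0,
    'Canada': 35239900.0,
    'Brazil': 205915000.0,
}})

# Map each country label to its continent, then pair the two arrays
# into a two-level index with explicit level names.
df.index = pd.MultiIndex.from_arrays(
    [df.index.map(continent_of), df.index],
    names=['Continent', 'Country'],
)
df = df.sort_index()

# Selecting on the outer level returns that continent's countries.
asia = df.loc['Asia']
print(asia)
```

Sorting the index is optional here, but a sorted MultiIndex makes label-based slicing on the outer level more predictable.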
Upvotes: 1