Reputation: 112
Imagine this is my input data:
data = [("France", "Paris", "Male", "1"),
        ("France", "Paris", "Female", "6"),
        ("France", "Nice", "Male", "2"),
        ("France", "Nice", "Female", "7"),
        ("Germany", "Berlin", "Male", "3"),
        ("Germany", "Berlin", "Female", "8"),
        ("Germany", "Munchen", "Male", "4"),
        ("Germany", "Munchen", "Female", "9"),
        ("Germany", "Koln", "Male", "5"),
        ("Germany", "Koln", "Female", "10")]
I'd like to put it into a dataframe like this:
Country  City     Sex
                  Male  Female
France   Paris    1     6
         Nice     2     7
Germany  Berlin   3     8
         Munchen  4     9
         Koln     5     10
The first part is easy:
import pandas as pd

df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])
df = df.set_index(["country", "city"])
This gives me the output:
                    sex count
country city
France  Paris      Male     1
        Paris    Female     6
        Nice       Male     2
        Nice     Female     7
Germany Berlin     Male     3
        Berlin   Female     8
        Munchen    Male     4
        Munchen  Female     9
        Koln       Male     5
        Koln     Female    10
So the rows are OK, but now I'd like to move the values of the 'sex' column into a column MultiIndex. Is that possible, and if so, how?
Upvotes: 3
Views: 73
Reputation: 30605
Another method is to use pivot in place of unstack (the two do almost the same thing here), i.e.:
df.set_index(['country','city']).pivot(columns='sex')
                 count
sex             Female Male
country city
France  Nice         7    2
        Paris        6    1
Germany Berlin       8    3
        Koln        10    5
        Munchen      9    4
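To check the claim that pivot and unstack give the same result on this data, here is a minimal self-contained sketch. It assumes a reasonably recent pandas; the names via_pivot and via_unstack are only for illustration and are not from the original post.

```python
import pandas as pd

data = [("France", "Paris", "Male", "1"),
        ("France", "Paris", "Female", "6"),
        ("France", "Nice", "Male", "2"),
        ("France", "Nice", "Female", "7"),
        ("Germany", "Berlin", "Male", "3"),
        ("Germany", "Berlin", "Female", "8"),
        ("Germany", "Munchen", "Male", "4"),
        ("Germany", "Munchen", "Female", "9"),
        ("Germany", "Koln", "Male", "5"),
        ("Germany", "Koln", "Female", "10")]
df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])

# pivot: keep country/city as the row index and spread 'sex' across the columns
via_pivot = df.set_index(["country", "city"]).pivot(columns="sex")

# unstack: put 'sex' into the row index first, then move it to the columns
via_unstack = df.set_index(["country", "city", "sex"]).unstack()

# Compare labels and values of the two results
print(via_pivot.equals(via_unstack))
```

Both routes leave 'count' as the top column level with the 'sex' labels below it, so the choice between them is mostly a matter of taste.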
Upvotes: 0
Reputation: 862541
Add the column sex to the list in set_index and call unstack:
df = df.set_index(["country", "city",'sex']).unstack()
# data cleaning: remove the column level names and rename the 'count' level to 'Sex'
df = df.rename_axis((None, None), axis=1).rename(columns={'count': 'Sex'})
print(df)
                   Sex
                Female Male
country city
France  Nice         7    2
        Paris        6    1
Germany Berlin       8    3
        Koln        10    5
        Munchen      9    4
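The output above lists Female before Male because unstack sorts the new column labels, whereas the question shows Male first. A possible follow-up is sketched below. It assumes the counts should be integers rather than strings and that the only labels are 'Male' and 'Female'; the name wide is only for illustration.

```python
import pandas as pd

data = [("France", "Paris", "Male", "1"),
        ("France", "Paris", "Female", "6"),
        ("France", "Nice", "Male", "2"),
        ("France", "Nice", "Female", "7"),
        ("Germany", "Berlin", "Male", "3"),
        ("Germany", "Berlin", "Female", "8"),
        ("Germany", "Munchen", "Male", "4"),
        ("Germany", "Munchen", "Female", "9"),
        ("Germany", "Koln", "Male", "5"),
        ("Germany", "Koln", "Female", "10")]
df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])
df["count"] = df["count"].astype(int)  # the counts arrive as strings

# Unstack only the 'count' Series, so the columns are just the 'sex' labels
wide = df.set_index(["country", "city", "sex"])["count"].unstack()

# Put Male before Female, then add 'Sex' as the top column level
wide = wide[["Male", "Female"]]
wide.columns = pd.MultiIndex.from_product([["Sex"], wide.columns.tolist()])
print(wide)
```

Selecting the count column before unstacking avoids the extra 'count' level, so no rename is needed afterwards.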
Upvotes: 3