skmr

Reputation: 112

Pandas dataframe - multiindex for both rows and columns?

Imagine this is my input data:

    data = [("France",    "Paris",      "Male",   "1"),
            ("France",    "Paris",      "Female", "6"),
            ("France",    "Nice",       "Male",   "2"),
            ("France",    "Nice",       "Female", "7"),
            ("Germany",   "Berlin",     "Male",   "3"),
            ("Germany",   "Berlin",     "Female", "8"),
            ("Germany",   "Munchen",    "Male",   "4"),
            ("Germany",   "Munchen",    "Female", "9"),
            ("Germany",   "Koln",       "Male",   "5"),
            ("Germany",   "Koln",       "Female", "10")]

I'd like to put it into a dataframe like this:

Country City       Sex
                   Male     Female
France  Paris       1         6
        Nice        2         7
Germany Berlin      3         8
        Munchen     4         9
        Koln        5         10

The first part is easy:

import pandas as pd

df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])
df = df.set_index(["country", "city"])

Gives me output:

                   sex  count
country city                 
France  Paris      Male     1
        Paris    Female     6
        Nice       Male     2
        Nice     Female     7
Germany Berlin     Male     3
        Berlin   Female     8
        Munchen    Male     4
        Munchen  Female     9
        Koln       Male     5
        Koln     Female    10

So the rows are fine, but now I'd like to move the values of the 'sex' column into a column MultiIndex. Is that possible, and if so, how?

Upvotes: 3

Views: 73

Answers (2)

Bharath M Shetty

Reputation: 30605

Another method is to use pivot in place of unstack (the two do almost the same thing here), starting from the original df before set_index was applied, i.e.

df.set_index(['country','city']).pivot(columns='sex')
               
                   count     
sex             Female Male
country city               
France  Nice         7    2
        Paris        6    1
Germany Berlin       8    3
        Koln        10    5
        Munchen      9    4
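
To illustrate the "almost the same" remark, here is a minimal sketch comparing the two routes on a shortened copy of the question's data. The names via_unstack, via_pivot and same are only illustrative, and passing a list to index= in pivot assumes a reasonably recent pandas (1.1 or later, as far as I know).

```python
import pandas as pd

# Shortened copy of the question's data, for illustration only
data = [("France",  "Paris",  "Male",   "1"),
        ("France",  "Paris",  "Female", "6"),
        ("Germany", "Berlin", "Male",   "3"),
        ("Germany", "Berlin", "Female", "8")]
df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])

# Route 1: put 'sex' into the row index, then unstack it into the columns
via_unstack = df.set_index(["country", "city", "sex"]).unstack()

# Route 2: let pivot build the same layout in one call
via_pivot = df.pivot(index=["country", "city"], columns="sex")

# Both frames should hold the same values under the same labels
same = via_unstack.equals(via_pivot)
print(same)
```

In both cases the columns end up as a two-level index with 'count' on top and the 'sex' values underneath.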

Upvotes: 0

jezrael

Reputation: 862541

Add the sex column to the list passed to set_index, then call unstack:

df = df.set_index(["country", "city",'sex']).unstack()
# data cleaning - remove the column level names (None, 'sex') and rename the top-level label 'count' to 'Sex'
df = df.rename_axis((None, None),axis=1).rename(columns={'count':'Sex'})
print (df)
                   Sex     
                Female Male
country city               
France  Nice         7    2
        Paris        6    1
Germany Berlin       8    3
        Koln        10    5
        Munchen      9    4
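
Note that unstack sorts the labels, so the result above has Female before Male and Nice before Paris, unlike the layout asked for in the question. If the original order matters, one possible way to restore it is sketched below. The names wide and order are only illustrative, and this is just one approach under the assumption that each (country, city) pair should appear in the order it first occurs in the input.

```python
import pandas as pd

data = [("France",  "Paris",   "Male",   "1"),
        ("France",  "Paris",   "Female", "6"),
        ("France",  "Nice",    "Male",   "2"),
        ("France",  "Nice",    "Female", "7"),
        ("Germany", "Berlin",  "Male",   "3"),
        ("Germany", "Berlin",  "Female", "8"),
        ("Germany", "Munchen", "Male",   "4"),
        ("Germany", "Munchen", "Female", "9"),
        ("Germany", "Koln",    "Male",   "5"),
        ("Germany", "Koln",    "Female", "10")]
df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])

# Same steps as in the answer above
wide = (df.set_index(["country", "city", "sex"])
          .unstack()
          .rename_axis((None, None), axis=1)
          .rename(columns={"count": "Sex"}))

# Column order: select the columns explicitly, Male first
wide = wide[[("Sex", "Male"), ("Sex", "Female")]]

# Row order: (country, city) pairs in order of first appearance in the input
order = pd.MultiIndex.from_frame(df[["country", "city"]].drop_duplicates())
wide = wide.reindex(order)

print(wide)
```

The counts stay strings here, as in the question's data; convert them first (for example with astype(int)) if numeric values are needed.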

Upvotes: 3
