user9399405
user9399405

Reputation:

Group rows pandas

Background: I have the following dataframe:

import pandas as pd
d = {'day': ["t", "m", "m", "w", "t", "m","w"], 
     'month': ["01", "01", "01", "01", "02","02","02"], 
     'count': [1, 1, 1, 1,1,1,1]}
df = pd.DataFrame(data=d)

I group by day and month:

df.groupby(by=['day','month']).count()

Output:

day  month count    
m    01     2
     02     1
t    01     1
     02     1
w    01     1
     02     1

From here, I would like to organize the data to obtain the following output:

Desired Output:

day  month count    
m    01     2
t    01     1
w    01     1
m    02     1
t    02     1
w    02     1           

I tried df.sort_values('month') and df.sort_values('day') but it doesn't quite give me what I am looking for

Question: What line(s) of code do I need to add to get my desired output?

Upvotes: 3

Views: 76

Answers (4)

piRSquared
piRSquared

Reputation: 294258

The groupby automatically sorts the index. Instead, switch the order you are grouping by then swap the levels

df.groupby(by=['month', 'day']).count().swaplevel(0, 1)

           count
day month       
m   01         2
t   01         1
w   01         1
m   02         1
t   02         1
w   02         1

Upvotes: 2

Scott Boston
Scott Boston

Reputation: 153460

Use sort_index with level parameter:

df.groupby(by=['day','month']).count().sort_index(level=1)

Output:

           count
day month       
m   01         2
t   01         1
w   01         1
m   02         1
t   02         1
w   02         1

and reset_index if you wish.

df.groupby(by=['day','month']).count().sort_index(level=1).reset_index()

  day month  count
0   m    01      2
1   t    01      1
2   w    01      1
3   m    02      1
4   t    02      1
5   w    02      1

Upvotes: 2

Aditya
Aditya

Reputation: 2520

import pandas as pd
d = {'day': ["t", "m", "m", "w", "t", "m","w"], 
     'month': ["01", "01", "01", "01", "02","02","02"], 
     'count': [1, 1, 1, 1,1,1,1]}
df = pd.DataFrame(data=d)

df.groupby(['day','month']).count().reset_index()

Output(you can now apply sort()

    day month count
0   m    01    2

1   m    02    1

2   t    01    1

3   t    02    1

4   w    01    1

5   w    02    1

Upvotes: 1

ALollz
ALollz

Reputation: 59549

Here you go. It only happens to get the day ordering correct, but you might want to convert them to actually 0-6 for days of the week if you have more days later.

df.groupby(by=['day','month'], as_index=False).count().sort_values(by=['month', 'day'])

    day month   count
0   m   01  2
2   t   01  1
4   w   01  1
1   m   02  1
3   t   02  1
5   w   02  1

Upvotes: 2

Related Questions