How to group by and transform the count back to the DataFrame

Question

Say I have the following df:

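import pandas as pd

years = []
months = []
ys = [2003, 2003, 2004, 2005]  # 2003 is listed twice on purpose

# Three months (1-3) for every entry in ys
for y in ys:
    for i in range(1, 4):
        years.append(y)
        months.append(i)

df = pd.DataFrame({"year": years, "month": months})
df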

    year    month
0   2003    1
1   2003    2
2   2003    3
3   2003    1
4   2003    2
5   2003    3
6   2004    1
7   2004    2
8   2004    3
9   2005    1
10  2005    2
11  2005    3

Note that 2003 appears twice.

I wish to group by year and month and transform back to the df so the result would be:

    year    month   count
0   2003    1        1
1   2003    2        2
2   2003    3        3
3   2003    1        1
4   2003    2        2
5   2003    3        3
6   2004    1        4
7   2004    2        5
8   2004    3        6
9   2005    1        7
10  2005    2        8
11  2005    3        9

I tried df['count'] = df.groupby(['year','month']).transform('count') but I get the error 'Wrong number of items passed 0, placement implies 1'.

Andrej Kesely · Accepted Answer

Use .ngroup():

df["count"] = df.groupby(["year", "month"]).ngroup() + 1
print(df)

Prints:

    year  month  count
0   2003      1      1
1   2003      2      2
2   2003      3      3
3   2003      1      1
4   2003      2      2
5   2003      3      3
6   2004      1      4
7   2004      2      5
8   2004      3      6
9   2005      1      7
10  2005      2      8
11  2005      3      9
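
As a hedged aside on why the original attempt fails: transform on the whole GroupBy works on the non-key columns, and here both columns are grouping keys, so the result most likely has no columns to assign, which would match the "passed 0" in the error. The sketch below rebuilds the question's frame and contrasts ngroup() with a per-group row count. The column names group_id and group_size are illustrative only and do not come from the question.

```python
import pandas as pd

# Rebuild the frame from the question: 2003 appears twice in ys.
ys = [2003, 2003, 2004, 2005]
df = pd.DataFrame({
    "year": [y for y in ys for _ in range(1, 4)],
    "month": [m for _ in ys for m in range(1, 4)],
})

grouped = df.groupby(["year", "month"])

# ngroup() labels every row with the 0-based number of its group
# (groups are numbered in sorted key order by default), so + 1 makes it 1-based.
df["group_id"] = grouped.ngroup() + 1

# If the number of rows per group is wanted instead, select one column
# before transforming so the result is a single Series aligned to df.
df["group_size"] = grouped["month"].transform("size")

print(df)
```

With this data, group_id matches the count column requested in the question, while group_size is 2 for the duplicated 2003 rows and 1 everywhere else.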
