Tom Kealy

Reputation: 2669

Counting the number of occurrences per year in a groupby

I have a pandas dataframe which looks like:

import pandas as pd
df = pd.DataFrame(data={'id':[1234, 1234, 1234, 1234, 1234], 'year':['2017', '2017', '2018', '2018', '2018'], 'count_to_today':[1, 2, 3, 3, 4]})
df
     id  year  count_to_today
0  1234  2017               1
1  1234  2017               2
2  1234  2018               3
3  1234  2018               3
4  1234  2018               4

I need to count how many times count_to_today changes in each year, per id. That is, I have a running count since the beginning of time, and I want the number of times it increments in each year:

           count_in_year
id   year               
1234 2017              2
     2018              2

I'm a bit confused about how to do this. I know I need to groupby id and year but I can't figure out how to get .count() or .value_counts() to give me the counts per year.

Upvotes: 4

Views: 1984

Answers (3)

Hasitha Amarathunga

Reputation: 2005

Use this structure:

df[['id', 'year']].groupby('year').count()

or, equivalently:

df[['id', 'year']].groupby('year').agg('count')

Both count the non-null id values in each year. I hope this works for you.

Upvotes: 0

cs95

Reputation: 402413

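You can use diff and groupby. diff() gives the change from the previous row, ne(0) flags the rows where the count changed (the NaN on the first row also counts as a change), and summing the flags per id and year gives the number of increments: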

df.count_to_today.diff().ne(0).groupby([df.id, df.year]).sum()

id    year
1234  2017    2.0
      2018    2.0
Name: count_to_today, dtype: float64

To get integer counts as a flat DataFrame, cast the result and reset the index:

(df.count_to_today.diff()
   .ne(0)
   .groupby([df.id, df.year])
   .sum()
   .astype(int)
   .reset_index())

     id  year  count_to_today
0  1234  2017               2
1  1234  2018               2
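
One caveat with the snippet above: diff() runs over the whole column, so with several ids the first row of one id is compared with the last row of the previous id. A minimal sketch of a per-id variant follows. The second id (5678) and its values are hypothetical, added only to show the effect, and the output name count_in_year is taken from the question.

```python
import pandas as pd

# Original data plus a hypothetical second id (5678) for illustration
df = pd.DataFrame({
    'id': [1234, 1234, 1234, 1234, 1234, 5678, 5678],
    'year': ['2017', '2017', '2018', '2018', '2018', '2018', '2018'],
    'count_to_today': [1, 2, 3, 3, 4, 4, 5],
})

# Diff within each id, so the first row of an id is NaN (counted as a change)
# instead of being compared with the last row of the previous id
changed = df.groupby('id')['count_to_today'].diff().ne(0)

# Sum the boolean flags per (id, year) and name the result as in the question
count_in_year = (
    changed.groupby([df['id'], df['year']])
           .sum()
           .astype(int)
           .rename('count_in_year')
)
print(count_in_year)
```

For this data the result is 2 for (1234, 2017), 2 for (1234, 2018) and 2 for (5678, 2018). A global diff() would give 1 for (5678, 2018), because the first 5678 row has the same value as the last 1234 row.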

Upvotes: 2

Shahir Ansari

Reputation: 1848

If you want to count id per year, try:

df[['id', 'year']].groupby('year').count()

or:

df[['id', 'year']].groupby('year').agg('count')

Adjust the column names as needed to get your result.
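
Note that a plain count() counts rows, which is not the same as the number of increments the question asks for. The sketch below compares the two on the question's data. nunique() is swapped in here as an alternative, and it matches the increment count only under the assumption that the running count never decreases and no value carries over from the previous year.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1234, 1234, 1234, 1234, 1234],
    'year': ['2017', '2017', '2018', '2018', '2018'],
    'count_to_today': [1, 2, 3, 3, 4],
})

# Rows per (id, year): counts every row, including the repeated 3 in 2018
rows_per_year = df.groupby(['id', 'year'])['count_to_today'].count()

# Distinct running-count values per (id, year); equals the number of
# increments only under the assumptions stated above
increments_per_year = df.groupby(['id', 'year'])['count_to_today'].nunique()

print(rows_per_year)
print(increments_per_year)
```

Here count() gives 2 and 3 for 2017 and 2018, while nunique() gives 2 and 2, which is the output the question expects.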

Upvotes: 3
