dexis
dexis

Reputation: 1

Pandas total count each day

I have a large dataset (df) with lots of columns and I am trying to get the total number of each day.

    |datetime|id|col3|col4|col...
1   |11-11-2020|7|col3|col4|col...
2   |10-11-2020|5|col3|col4|col...
3   |09-11-2020|5|col3|col4|col...
4   |10-11-2020|4|col3|col4|col...
5   |10-11-2020|4|col3|col4|col...
6   |07-11-2020|4|col3|col4|col...

I want my result to be something like this

    |datetime|id|col3|col4|col...|Count
6   |07-11-2020|4|col3|col4|col...| 1
3              |5|col3|col4|col...| 1
2   |10-11-2020|5|col3|col4|col...| 1
4              |4|col3|col4|col...| 2
1   |11-11-2020|7|col3|col4|col...| 1

I tried to use resample like this df = df.groupby(['id','col3', pd.Grouper(key='datetime', freq='D')]).sum().reset_index() and this is my result. I am still new to programming and Pandas but I have read up on pandas docs and am still unable to do it.

    |datetime|id|col3|col4|col...
6   |07-11-2020|4|col3|1|0.0
3   |07-11-2020|5|col3|1|0.0
2   |10-11-2020|5|col3|1|0.0
4   |10-11-2020|4|col3|2|0.0
1   |11-11-2020|7|col3|1|0.0

Upvotes: 0

Views: 939

Answers (2)

Danillo Matos
Danillo Matos

Reputation: 51

If you want the count values for all columns based only on the date, then:

df.groupby('datetime').count()

And you'll get a DataFrame who has the date time as the index and the column cells representing the number of entries for that given index.

Upvotes: 0

MarcelloDG
MarcelloDG

Reputation: 98

try this:

df = df.groupby(['datetime','id','col3']).count()

Upvotes: 1

Related Questions