Reputation: 81
I'm trying to calculate the "running count" of non-zero values in a pandas DataFrame and add it as a new column on the same DataFrame.
I can use this for a running total of the values:
df['running_total'] = df.groupby('key_id')['payment'].cumsum()
I can use this for the running count:
df['trans_counts'] = df.groupby('key_id')['payment'].cumcount()
However, this count also includes the zero 'payment' entries, so it just gives me a running row count within each key_id group. How can I modify the cumcount() approach so that it does not increment when it sees a zero?
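To make the problem concrete, here is a minimal sketch using made-up sample data (the values and the key_id labels "a" and "b" are assumptions for illustration, not my real data). It shows that cumcount() numbers every row in the group, zero payments included:

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "key_id": ["a", "a", "a", "b", "b"],
    "payment": [10, 0, 5, 0, 7],
})

# cumcount() numbers the rows within each key_id group starting at 0,
# regardless of what the payment value is.
df["trans_counts"] = df.groupby("key_id")["payment"].cumcount()

print(df)
# trans_counts comes out as 0, 1, 2 for "a" and 0, 1 for "b",
# so the zero payments are counted like any other row.
```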
Upvotes: 1
Views: 671
Reputation: 323226
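You can still use cumsum: flag the non-zero payments with ne(0), then take the cumulative sum of that boolean mask within each key_id group.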
df['trans_counts'] = df['payment'].ne(0).groupby(df['key_id']).cumsum()
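A small worked sketch of this approach, assuming made-up sample data (the values and key_id labels are hypothetical). It stores the result in trans_counts, the column name used in the question, and splits the one-liner into two steps so the intermediate mask is visible:

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "key_id": ["a", "a", "a", "b", "b"],
    "payment": [10, 0, 5, 0, 7],
})

# Boolean mask: True where the payment is non-zero.
nonzero = df["payment"].ne(0)

# Summing the mask cumulatively within each key_id group adds 1 for every
# non-zero payment and 0 for every zero, giving a running non-zero count.
df["trans_counts"] = nonzero.groupby(df["key_id"]).cumsum()

print(df)
# Per-row counts: 1, 1, 2 for "a" and 0, 1 for "b".
```

Unlike cumcount(), which starts at 0, this count starts at 1 on the first non-zero payment in each group.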
Upvotes: 2