galah92
galah92

Reputation: 3991

Count values for specific interval for each given timestamp

Say I've got the following DataFrame:

In [46]: timestamp = pd.date_range(start='1/1/2018', end='1/2/2018', freq='T')

In [47]: df = pd.DataFrame(timestamp, columns=['timestamp'])

In [48]: df['user'] = np.random.randint(10, size=len(timestamp))

In [49]: df['clicked'] = np.random.randint(2, size=len(timestamp))

Where each row represent whether a user has clicked in a given timestamp.

I'd like to add a new column, where the value for each row will be the number of times the user of the current row has clicked (i.e has value of 1) in the past hour from the current row timestamp.

How can this be done?

Upvotes: 1

Views: 70

Answers (1)

giser_yugang
giser_yugang

Reputation: 6166

Try:

df = df.set_index('timestamp')
hour_count = df.groupby(['user'])['clicked'].rolling('1H').sum()
df =df.assign(rolling_sum=hour_count.reset_index(level=0, drop=True)).reset_index()

Upvotes: 2

Related Questions