Reputation: 103

Python pandas most common value over rolling date window

I have a pandas dataframe containing the below data, and i would like to add a new column which, for each date, returns the most frequently occurring 'weather_type' over the previous 3 days. Where the result is a tie, i'd like the most recent 'weather_type' to be returned.

d = {'date': ['17/02/2017', '18/02/2017', '19/02/2017', '20/02/2017', 
'21/02/2017', '22/02/2017'], 'precipmm': [1, 0, 3, 2, 7, 8], 'weather_type':
['rain', 'sun', 'rain', 'snow', 'snow', 'rain']}
df = pd.DataFrame(data=d)
df['date']=pd.to_datetime(df['date'], format='%d/%m/%Y')
df['rollingsum_precipmm']=df['precipmm'].rolling(window=3).sum()

I've already managed to create a new column containing the sum of the total 'precipmm' over the last 3 days using the below:

df['rollingsum_precipmm']=df['precipmm'].rolling(window=3).sum()

I suspect the answer revolves around this, but as yet i've been unable to find a solution.

Any help much appreciated as always

Colin

Upvotes: 7

Answers (2)

Ari

Reputation: 11

For the result to be appeared in a new column:

df=df.assign(new_column=df['precipmm'].rolling(window=7).apply(lambda x: mode(x)[0]))

Upvotes: 1

Yilun Zhang

Reputation: 9018

To get rolling mode, you can do:

from scipy.stats import mode
df['precipmm'].rolling(window=7).apply(lambda x: mode(x)[0])

Upvotes: 10

Python pandas most common value over rolling date window

Answers (2)

Related Questions