Reputation: 7713
I have a dataframe like as shown below
data = pd.DataFrame({'day':['1','21','41','61','81','101','121','141','161','181','201','221'],'Sale':[1.08,0.9,0.72,0.58,0.48,0.42,0.37,0.33,0.26,0.24,0.22,0.11]})
I would like to fill the values for day 241
by computing the average of all records till day 221
. Similarly, I would like to compute the value for day 261
by computing the average of all records till day 241
and so on.
For ex: Compute the value for day n
by taking average of all values from day 1 to day n-21
.
I would like to do this upto day 1001
.
I tried the below but it isn't correct
df['day'] = df.iloc[:,1].rolling(window=all).mean()
How to create new rows for each day under day
column?
I expect my output to be like as shown below
Upvotes: 3
Views: 699
Reputation: 35636
It sounds like you're looking for an expanding mean:
import numpy as np
import pandas as pd
df = pd.DataFrame({'day': ['1', '21', '41', '61', '81', '101', '121', '141',
'161', '181', '201', '221'],
'Sale': [1.08, 0.9, 0.72, 0.58, 0.48, 0.42, 0.37, 0.33, 0.26,
0.24, 0.22, 0.11]})
# Generate Some new values
to_add = pd.DataFrame({'day': np.arange(241, 301, 20)})
# Add New Values To End of DataFrame
new_df = pd.concat((df, to_add)).reset_index(drop=True)
# Replace Values Where Sale is NaN with the expanding mean
new_df['Sale'] = np.where(new_df['Sale'].isna(),
new_df['Sale'].expanding().mean(),
new_df['Sale'])
print(new_df)
day Sale
0 1 1.080000
1 21 0.900000
2 41 0.720000
3 61 0.580000
4 81 0.480000
5 101 0.420000
6 121 0.370000
7 141 0.330000
8 161 0.260000
9 181 0.240000
10 201 0.220000
11 221 0.110000
12 241 0.475833
13 261 0.475833
14 281 0.475833
With replacing NaNs with 1 then averaging:
import numpy as np
import pandas as pd
df = pd.DataFrame({'day': ['1', '21', '41', '61', '81', '101', '121', '141',
'161', '181', '201', '221'],
'Sale': [1.08, 0.9, 0.72, 0.58, 0.48, 0.42, 0.37, 0.33, 0.26,
0.24, 0.22, 0.11 ]})
# Generate Some new values
to_add = pd.DataFrame({'day': np.arange(241, 301, 20)})
# Add New Values To End of DataFrame
new_df = pd.concat((df, to_add)).reset_index(drop=True)
# Replace Values Where Sale is NaN with the expanding mean
new_df['Sale'] = np.where(new_df['Sale'].isna(),
new_df['Sale'].fillna(1).shift().expanding().mean(),
new_df['Sale'])
print(new_df)
day Sale
0 1 1.080000
1 21 0.900000
2 41 0.720000
3 61 0.580000
4 81 0.480000
5 101 0.420000
6 121 0.370000
7 141 0.330000
8 161 0.260000
9 181 0.240000
10 201 0.220000
11 221 0.110000
12 241 0.475833
13 261 0.516154
14 281 0.550714
Upvotes: 6