Reputation: 1792
I have the following dataframe:
import pandas as pd
#Create DF
d = {
'Date': ['1/01/2021','2/01/2021','3/01/2021','4/01/2021','5/01/2021','6/01/2021','7/01/2021','8/01/2021','9/01/2021','10/01/2021','11/01/2021','12/01/2021','13/01/2021',
'14/01/2021','15/01/2021','16/01/2021'],
'Name': ['Joe','Joe','Joe','Joe','Joe','Joe','Joe','Joe','Joe','John','John','John','John','John','John','John'],
'Criteria':[30,35,2.5,3,6,15,20,1.5,4,25,50,75,2,4,9,2],
'Sum' : [-1,-1,1.5,-1,5,-1,-1,-1,3,-1,-1,-1,-1,3,-1,-1],
'Sum Divided by Count where Criteria <5' : [0.000,0.000,1.500,0.250,0.000,0.000,0.000,-0.167,0.625,0.000,0.000,0.000,-1.000,1.000,0.000,0.333]
}
df = pd.DataFrame(data=d)
df['Date'] = pd.to_datetime(df.Date,format='%d/%m/%Y')
df
The last column is calculating the sum of the column Sum
where Criteria
is less than 5
divided by the count of rows where Criteria
is less than 5
. And grouped by Name
. The red is hihglighting the rows that match the criteria.
What i am also trying to do is also add a forward fill to all rows but keep the initial rows as zeros.
Is all of the above possible in pandas?
My end result should be:
Any help would be much apreciated!
Upvotes: 2
Views: 139
Reputation: 863166
Repalce values below 5
to missing values in Sum
in DataFrame.assign
and then per groups call Expanding.mean
for divide sum by counts and last if values not match repalce them to 0
in Series.where
:
m = df['Criteria'] < 5
df['new'] = (df.assign(Sum = df['Sum'].where(m))
.groupby('Name')['Sum']
.expanding()
.mean()
.reset_index(level=0, drop=True)
.where(m)
.groupby(df['Name'])
.ffill()
.fillna(0))
print (df)
Date Name Criteria Sum Sum Divided by Count where Criteria <5 \
0 2021-01-01 Joe 30.0 -1.0 0.000
1 2021-01-02 Joe 35.0 -1.0 0.000
2 2021-01-03 Joe 2.5 1.5 1.500
3 2021-01-04 Joe 3.0 -1.0 0.250
4 2021-01-05 Joe 6.0 5.0 0.000
5 2021-01-06 Joe 15.0 -1.0 0.000
6 2021-01-07 Joe 20.0 -1.0 0.000
7 2021-01-08 Joe 1.5 -1.0 -0.167
8 2021-01-09 Joe 4.0 3.0 0.625
9 2021-01-10 John 25.0 -1.0 0.000
10 2021-01-11 John 50.0 -1.0 0.000
11 2021-01-12 John 75.0 -1.0 0.000
12 2021-01-13 John 2.0 -1.0 -1.000
13 2021-01-14 John 4.0 3.0 1.000
14 2021-01-15 John 9.0 -1.0 0.000
15 2021-01-16 John 2.0 -1.0 0.333
new
0 0.000000
1 0.000000
2 1.500000
3 0.250000
4 0.250000
5 0.250000
6 0.250000
7 -0.166667
8 0.625000
9 0.000000
10 0.000000
11 0.000000
12 -1.000000
13 1.000000
14 1.000000
15 0.333333
Upvotes: 1