Reputation: 63
I have a dataset which has data-time and value as columns for each ID. I do some calculations on it but stuck while using recursive functions.
The dataset looks like this,
Date-Time Volume ID Load
10/22/2019 3862 10
10/23/2019 3800 10
10/24/2019 3700 10
10/25/2019 5000 10 Yes
10/26/2019 4900 10
10/27/2019 4800 10
10/22/2019 3862 11
10/23/2019 3800 11
10/24/2019 3700 11
10/25/2019 5000 11 Yes
10/26/2019 4900 11
10/27/2019 4800 11
The output i need for load_date is,
Date-Time Volume ID Load LoadDate
10/22/2019 3862 10 0
10/23/2019 3800 10 0
10/24/2019 3700 10 0
10/25/2019 5000 10 Yes 10/25/2019
10/26/2019 4900 10 10/25/2019
10/27/2019 4800 10 10/25/2019
10/22/2019 3862 11 0
10/23/2019 3800 11 0
10/24/2019 3700 11 0
10/25/2019 5000 11 Yes 10/25/2019
10/26/2019 4900 11 10/25/2019
10/27/2019 4800 11 10/25/2019
Upvotes: 2
Views: 37
Reputation: 23099
IIUC,
We can index at the Yes value and forward fill any dates with some index filtering and .loc
assingment.
idx = df.loc[df['Load'] == 'Yes'].index # get all index values for Yes.
df['LoadDate'] = np.nan # create your col.
df.loc[idx, 'LoadDate'] = df['Date-Time']
Groupby and Create your
df['LoadDate'] = (df.groupby('ID')['LoadDate'].ffill()).fillna(0)
#group by ID and ffill the load date and fill and nan's as 0.
which leaves us with.
print(df)
Date-Time Volume ID Load LoadDate
0 10/22/2019 3862 10 0
1 10/23/2019 3800 10 0
2 10/24/2019 3700 10 0
3 10/25/2019 5000 10 Yes 10/25/2019
4 10/26/2019 4900 10 10/25/2019
5 10/27/2019 4800 10 10/25/2019
6 10/22/2019 3862 11 0
7 10/23/2019 3800 11 0
8 10/24/2019 3700 11 0
9 10/25/2019 5000 11 Yes 10/25/2019
10 10/26/2019 4900 11 10/25/2019
11 10/27/2019 4800 11 10/25/2019
Upvotes: 2