Reputation: 649
I would like to add extra rows to a DataFrame as shown in the following:
Input DF:
id_B ts_B colum_B1 colum_B2 ProgPercentage TimeLeft
0 id1 2017-04-27 01:35:00 A phase1 2 02:12:00
1 id1 2017-04-27 01:35:00 A phase2 3 02:11:00
2 id1 2017-04-27 01:35:00 B phase2 97 00:03:00
3 id1 2017-04-27 01:36:00 B phase3 98 00:02:00
output DF: given a time_delta= 1 minute, I need to add extra rows backward - meaning that colum_ProgPercentage is 0.0 - and forward - meaning that colum_ProgPercentage is 100.0, where all the values are the same except the ts_B where backward would apply minus of "time_delta" to the ts_B and forward would apply plus of "time_delta" to the ts_B.
id_B ts_B colum_B1 colum_B2 ProgPercentage TimeLeft
0 id1 2017-04-27 01:33:00 A phase1 0 02:14:00
1 id1 2017-04-27 01:34:00 A phase1 1 02:13:00
2 id1 2017-04-27 01:35:00 A phase1 2 02:12:00
3 id1 2017-04-27 01:35:00 A phase2 3 02:11:00
4 id1 2017-04-27 01:35:00 B phase2 97 00:03:00
5 id1 2017-04-27 01:36:00 B phase3 98 00:02:00
6 id1 2017-04-27 01:37:00 B phase3 99 00:01:00
6 id1 2017-04-27 01:38:00 B phase3 100 00:00:00
Any help would be very appreciated. Thanks in advance. Best Regards, C
Upvotes: 0
Views: 50
Reputation: 40888
This should work for you:
def custom_fill(df, mode):
if mode == 'down':
newvals = np.arange(df['colum_ProgPercentage '].iloc[0])
newdates = pd.date_range(end=df.ts_B.iloc[0], freq='1Min',
periods=newvals.shape[0]+1, closed='left')
elif mode == 'up':
newvals = np.arange(df['colum_ProgPercentage '].iloc[-1] + 1, 101)
newdates = pd.date_range(start=df.ts_B.iloc[-1], freq='1Min',
periods=newvals.shape[0]+1, closed='right')
newdf = pd.DataFrame({'id_B' : df.id_B.iloc[0], 'ts_B' : newdates,
'colum_B' : df.colum_B.iloc[0],
'colum_ProgPercentage ' : newvals})
return newdf
new1 = custom_fill(df[df.colum_B=='A'], mode='down')
new2 = custom_fill(df[df.colum_B=='B'], mode='up')
res = (pd.concat((df, new1, new2))
.sort_values(['colum_B', 'colum_ProgPercentage '])[df.columns]
.reset_index(drop=True))
print(res)
id_B ts_B colum_B colum_ProgPercentage
0 id1 2017-04-27 01:33:00 A 0
1 id1 2017-04-27 01:34:00 A 1
2 id1 2017-04-27 01:35:00 A 2
3 id1 2017-04-27 01:35:00 A 3
4 id1 2017-04-27 01:35:00 B 97
5 id1 2017-04-27 01:36:00 B 98
6 id1 2017-04-27 01:37:00 B 99
7 id1 2017-04-27 01:38:00 B 100
Note that one of your columns has a space at the end of it.
Upvotes: 1