Carlo Allocca
Carlo Allocca

Reputation: 649

How to add rows in backward and forwards to a Dataframe

I would like to add extra rows to a DataFrame as shown in the following:

Input DF:

  id_B                ts_B colum_B1 colum_B2     ProgPercentage   TimeLeft 
0  id1 2017-04-27 01:35:00       A   phase1         2             02:12:00  
1  id1 2017-04-27 01:35:00       A   phase2         3             02:11:00
2  id1 2017-04-27 01:35:00       B   phase2         97            00:03:00
3  id1 2017-04-27 01:36:00       B   phase3         98            00:02:00

output DF: given a time_delta= 1 minute, I need to add extra rows backward - meaning that colum_ProgPercentage is 0.0 - and forward - meaning that colum_ProgPercentage is 100.0, where all the values are the same except the ts_B where backward would apply minus of "time_delta" to the ts_B and forward would apply plus of "time_delta" to the ts_B.

  id_B                ts_B colum_B1 colum_B2     ProgPercentage   TimeLeft 
0  id1 2017-04-27 01:33:00       A   phase1         0             02:14:00  
1  id1 2017-04-27 01:34:00       A   phase1         1             02:13:00  
2  id1 2017-04-27 01:35:00       A   phase1         2             02:12:00  
3  id1 2017-04-27 01:35:00       A   phase2         3             02:11:00
4  id1 2017-04-27 01:35:00       B   phase2         97            00:03:00
5  id1 2017-04-27 01:36:00       B   phase3         98            00:02:00
6  id1 2017-04-27 01:37:00       B   phase3         99            00:01:00
6  id1 2017-04-27 01:38:00       B   phase3         100           00:00:00

Any help would be very appreciated. Thanks in advance. Best Regards, C

Upvotes: 0

Views: 50

Answers (1)

Brad Solomon
Brad Solomon

Reputation: 40888

This should work for you:

def custom_fill(df, mode):
    if mode == 'down':
        newvals =  np.arange(df['colum_ProgPercentage '].iloc[0])
        newdates = pd.date_range(end=df.ts_B.iloc[0], freq='1Min', 
                                  periods=newvals.shape[0]+1, closed='left')            
    elif mode == 'up':
        newvals = np.arange(df['colum_ProgPercentage '].iloc[-1] + 1, 101)
        newdates = pd.date_range(start=df.ts_B.iloc[-1], freq='1Min', 
                                  periods=newvals.shape[0]+1, closed='right')

    newdf = pd.DataFrame({'id_B' : df.id_B.iloc[0], 'ts_B' : newdates, 
                          'colum_B' : df.colum_B.iloc[0], 
                          'colum_ProgPercentage ' : newvals})    
    return newdf

new1 = custom_fill(df[df.colum_B=='A'], mode='down')
new2 = custom_fill(df[df.colum_B=='B'], mode='up')
res = (pd.concat((df, new1, new2))
           .sort_values(['colum_B', 'colum_ProgPercentage '])[df.columns]
           .reset_index(drop=True))

print(res)
  id_B                ts_B colum_B  colum_ProgPercentage 
0  id1 2017-04-27 01:33:00       A                    0  
1  id1 2017-04-27 01:34:00       A                    1  
2  id1 2017-04-27 01:35:00       A                    2  
3  id1 2017-04-27 01:35:00       A                    3  
4  id1 2017-04-27 01:35:00       B                   97  
5  id1 2017-04-27 01:36:00       B                   98  
6  id1 2017-04-27 01:37:00       B                   99  
7  id1 2017-04-27 01:38:00       B                  100  

Note that one of your columns has a space at the end of it.

Upvotes: 1

Related Questions