Reputation: 9212
I have sample data like this.
date time option_type open high low close volume
6031 9/27/2018 09:17 CE 11500 0.15 0.15 0.15 0.15 1500
6131 9/27/2018 15:19 CE 11500 0.05 0.05 0.05 0.05 1500
6132 9/27/2018 15:22 CE 11500 0.05 0.05 0.05 0.05 75
6133 9/27/2018 15:24 CE 11500 0.05 0.05 0.05 0.05 225
6134 9/27/2018 15:25 CE 11500 0.05 0.05 0.05 0.05 75
6135 9/27/2018 15:26 CE 11500 0.05 0.05 0.05 0.05 600
Some rows are missing, for example 09:15 and 09:16, then 15:20 and 15:21, etc.
I want to populate each missing row with the previous row's values in cases like 15:20/15:21, and with the next available row's values in cases like 09:15/09:16. So the 09:17 values will be used for 09:15/09:16, and the 15:19 values will be used for 15:20/15:21.
Could you please help me with this? Thanks in advance; I appreciate your time and effort.
Upvotes: 0
Views: 111
Reputation: 4110
Step 1: Find the time difference between consecutive rows (assuming the time column holds 'HH:MM' strings):
df['deltaT'] = pd.to_datetime(df['time'], format='%H:%M').diff().dt.seconds.div(60).fillna(0).astype(int)
This adds a new column holding the number of minutes between each row and the previous one (0 for the first row).
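As a quick check of Step 1, here is a minimal sketch using only the times shown in the question. It assumes the time column holds 'HH:MM' strings, which the question does not confirm.

```python
import pandas as pd

# Times copied from the question; storing them as strings is an assumption.
df = pd.DataFrame({"time": ["09:17", "15:19", "15:22", "15:24", "15:25", "15:26"]})

# Parse to datetimes (the default date part is irrelevant for the differences).
parsed = pd.to_datetime(df["time"], format="%H:%M")

# Minutes elapsed since the previous row; the first row has no predecessor, so use 0.
df["deltaT"] = parsed.diff().dt.total_seconds().div(60).fillna(0).astype(int)

print(df["deltaT"].tolist())  # [0, 362, 3, 2, 1, 1]
```

A deltaT greater than 1 marks a gap, so the 3 on the 15:22 row corresponds to the missing 15:20 and 15:21.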
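Step 2: Replicate rows based on the new deltaT column (note that a row with deltaT of 0, such as the first row, is repeated zero times and so is dropped):
df = df.reindex(df.index.repeat(df['deltaT'].astype(int)))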
Step 3: Build the logic to increment the time column:
df['time'] = pd.to_timedelta(df['time']) + pd.to_timedelta(df['deltaT'], unit='m')
I am still struggling with the last part.
If you find this helpful and can build on it, great!
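One way the remaining part might be completed is to skip the row replication and reindex onto a full minute-by-minute index instead. The following is only a sketch under assumptions: the data covers a single trading day, the session starts at 09:15 (taken from the question), the end is the last observed row, and the column name strike is a guess because the question's header does not name the 11500 column.

```python
import pandas as pd

# Rows copied from the question; 'strike' is a hypothetical column name.
cols = ["date", "time", "option_type", "strike",
        "open", "high", "low", "close", "volume"]
rows = [
    ("9/27/2018", "09:17", "CE", 11500, 0.15, 0.15, 0.15, 0.15, 1500),
    ("9/27/2018", "15:19", "CE", 11500, 0.05, 0.05, 0.05, 0.05, 1500),
    ("9/27/2018", "15:22", "CE", 11500, 0.05, 0.05, 0.05, 0.05, 75),
    ("9/27/2018", "15:24", "CE", 11500, 0.05, 0.05, 0.05, 0.05, 225),
    ("9/27/2018", "15:25", "CE", 11500, 0.05, 0.05, 0.05, 0.05, 75),
    ("9/27/2018", "15:26", "CE", 11500, 0.05, 0.05, 0.05, 0.05, 600),
]
df = pd.DataFrame(rows, columns=cols)

# Build a real timestamp index from the date and time strings.
ts = pd.to_datetime(df["date"] + " " + df["time"], format="%m/%d/%Y %H:%M")
df = df.set_index(ts)

# Assumption: the session opens at 09:15; the end is the last observed minute.
start = ts.min().normalize() + pd.Timedelta(hours=9, minutes=15)
full_index = pd.date_range(start, ts.max(), freq="min")

# reindex creates all-NaN rows for the missing minutes. ffill covers gaps such
# as 15:20/15:21 from the previous row; bfill then covers the leading
# 09:15/09:16 from the next available row (09:17).
filled = df.reindex(full_index).ffill().bfill()

# The filled 'time' values are copies of neighbours, so rebuild them from the index.
filled["time"] = filled.index.strftime("%H:%M")

print(filled.loc[pd.Timestamp("2018-09-27 15:19"):pd.Timestamp("2018-09-27 15:22")])
```

Note that this also copies volume into the filled rows. If a filled minute should show zero volume instead, that column would need separate handling, and multi-day data would need the same steps applied per date.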
Upvotes: 1
Reputation: 13
I think you are looking for something like this:
df['time'] = df['time'].ffill()  # carry the last known value forward
df['time'] = df['time'].bfill()  # carry the next known value backward
df
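To illustrate the direction of each fill, here is a small sketch on a made-up series (the values are hypothetical, not the question's data). Keep in mind that both calls only touch existing NaN entries; in the question the rows are absent altogether, so they would have to be created first, for example by reindexing, before any fill has an effect.

```python
import numpy as np
import pandas as pd

# Hypothetical close prices with gaps at the start and in the middle.
close = pd.Series([np.nan, np.nan, 0.15, np.nan, 0.05])

# Forward fill: the middle gap takes 0.15, but the leading NaNs stay,
# because nothing precedes them.
forward = close.ffill()

# Backward fill afterwards: the leading NaNs take the next known value.
both = close.ffill().bfill()

print(forward.tolist())
print(both.tolist())  # [0.15, 0.15, 0.15, 0.15, 0.05]
```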
Upvotes: 0