How to populate missing rows with the previous or next row in pandas (Python)

Question

I have sample data like this.

      date       time   option_type         open  high  low   close  volume
6031  9/27/2018  09:17  CE           11500  0.15  0.15  0.15  0.15   1500

6131  9/27/2018  15:19  CE           11500  0.05  0.05  0.05  0.05   1500
6132  9/27/2018  15:22  CE           11500  0.05  0.05  0.05  0.05   75
6133  9/27/2018  15:24  CE           11500  0.05  0.05  0.05  0.05   225
6134  9/27/2018  15:25  CE           11500  0.05  0.05  0.05  0.05   75
6135  9/27/2018  15:26  CE           11500  0.05  0.05  0.05  0.05   600

Some of the rows are missing, for example 09:15 and 09:16, then 15:20 and 15:21, etc.

I want to populate the missing rows with the previous row's values in the case of 15:20/15:21, and with the next available row's values in the case of 09:15/09:16. That is, the 09:17 values will be used for 09:15/09:16, and the 15:19 values will be used for 15:20/15:21.

Could you please help me with this? Thanks in advance; I appreciate your effort and time.

Rahul Agarwal · Accepted Answer

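Step 1: Find the time difference between consecutive rows (this assumes the time column holds HH:MM strings):

df['deltaT'] = pd.to_datetime(df['time'], format='%H:%M').diff().dt.total_seconds().div(60).fillna(0).astype(int)

This gives you a new column holding the number of minutes between each row and the one before it (0 for the first row, which has no predecessor).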
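To make Step 1 concrete, here is a small sketch on a cut-down copy of the question's rows. It assumes the time column holds strings in HH:MM format and parses them with pd.to_datetime before differencing; the variable name parsed is only for illustration.

```python
import pandas as pd

# Cut-down copy of the question's rows (assumed: 'time' is an HH:MM string).
df = pd.DataFrame({
    "time": ["09:17", "15:19", "15:22", "15:24", "15:25", "15:26"],
    "close": [0.15, 0.05, 0.05, 0.05, 0.05, 0.05],
})

# Parse the strings, take the difference between consecutive rows, and
# express it in whole minutes. The first row has no predecessor, so use 0.
parsed = pd.to_datetime(df["time"], format="%H:%M")
df["deltaT"] = parsed.diff().dt.total_seconds().div(60).fillna(0).astype(int)

print(df)
```

Any deltaT greater than 1 marks a gap, so the 15:22 row (deltaT of 3) tells you that 15:20 and 15:21 are missing.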

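Step 2: Replicate rows based on the new deltaT column (the repeat counts must be integers, and the result has to be assigned back):

df = df.reindex(df.index.repeat(df['deltaT'].astype(int)))

Note that a row whose deltaT is 0, such as the first row, is repeated zero times and is therefore dropped.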
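Here is a small sketch of the replication step on three of the question's rows. The deltaT values are typed in by hand, and clipping the counts to at least 1 is an assumption added here so that a row with a count of 0 is kept rather than dropped.

```python
import pandas as pd

# Three rows from the question, with hand-entered minute gaps in 'deltaT'.
df = pd.DataFrame(
    {"time": ["15:19", "15:22", "15:24"], "deltaT": [0, 3, 2]},
    index=[6131, 6132, 6133],
)

# Index.repeat needs integer counts; a count of 0 would drop the row,
# so keep at least one copy of every row.
counts = df["deltaT"].clip(lower=1).astype(int).to_numpy()
expanded = df.loc[df.index.repeat(counts)]

print(expanded)
```

Each row now appears once per minute it has to cover, but every copy still carries the original time value, so the time column has to be rebuilt afterwards.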

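Step 3: Build the logic to increment the time column (pd.to_timedelta expects hh:mm:ss, so ':00' is appended on the assumption that time holds HH:MM strings):

df['time'] = pd.to_timedelta(df['time'] + ':00') + pd.to_timedelta(df['deltaT'], unit='m')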

I am still struggling with the last part.

If you find this helpful and can build on it, great!
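One possible way to finish the job, offered as a sketch of a different technique rather than a completion of the steps above, is to skip the row replication and reindex the data onto a full minute-by-minute range, then forward-fill and back-fill. It assumes the session starts at 09:15 (taken from the question's mention of the missing 09:15/09:16 rows), that date is in M/D/YYYY format, and that each timestamp occurs only once. The names stamps, full_range, and filled are illustrative.

```python
import pandas as pd

# Rows copied from the question (price columns trimmed to close and volume).
df = pd.DataFrame({
    "date": ["9/27/2018"] * 6,
    "time": ["09:17", "15:19", "15:22", "15:24", "15:25", "15:26"],
    "close": [0.15, 0.05, 0.05, 0.05, 0.05, 0.05],
    "volume": [1500, 1500, 75, 225, 75, 600],
})

# Build a proper timestamp index from the date and time strings.
stamps = pd.to_datetime(df["date"] + " " + df["time"], format="%m/%d/%Y %H:%M")
indexed = df.set_index(stamps)

# Every minute from the assumed session start (09:15) to the last row seen.
start = stamps.min().normalize() + pd.Timedelta(hours=9, minutes=15)
full_range = pd.date_range(start, stamps.max(), freq="min")

# reindex inserts the missing minutes as NaN rows; ffill copies the previous
# row forward, and bfill then covers the leading gap from the next row.
filled = indexed.reindex(full_range).ffill().bfill()

# Regenerate the time column so it matches the new minute-by-minute index.
filled["time"] = filled.index.strftime("%H:%M")

print(filled.loc["2018-09-27 15:19":"2018-09-27 15:23"])
```

With this sample, 09:15 and 09:16 take the 09:17 values, while 15:20 and 15:21 take the 15:19 values. Because the sample skips straight from 09:17 to 15:19, every minute in between is also filled from 09:17; with the full data set only the genuinely missing minutes would be filled. For several dates or several option contracts, the same steps would need to be applied per group.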
