Carlo Allocca

Reputation: 649

Resampling and adding missing rows

I have a dataframe that represents 1 second of data that is supposed to be sampled at 100 Hz.

I would like to 1) resample it at a rate of 10 milliseconds, averaging each column, and 2) add the missing rows by interpolation, as in the following:

DF_input:

ephoc_as_datatime         att1 att2
2000-01-01 11:22:37.130    0    4
2000-01-01 11:22:37.138    1    5
2000-01-01 11:22:37.149    2    6
2000-01-01 11:22:37.156    3    7
2000-01-01 11:22:37.165    4    8
2000-01-01 11:22:37.168    5    9
2000-01-01 11:22:37.169    3    7
2000-01-01 11:22:37.567    7    3
2000-01-01 11:22:38.120    8    4

DF_output:

ephoc_as_datatime         att1 att2
2000-01-01 11:22:37.130    0    4
2000-01-01 11:22:37.140    1    5
2000-01-01 11:22:37.150    2    6
2000-01-01 11:22:37.160    3    7
2000-01-01 11:22:37.170    4    8
....adding the missing one
2000-01-01 11:22:37.570    7    3
....adding the missing one
2000-01-01 11:22:38.120    8    4

I know that I should be using resample and interpolate. Any suggestions would be much appreciated.

Many thanks, best regards, Carlo

Upvotes: 1

Views: 45

Answers (1)

jezrael

Reputation: 862601

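I think you need to resample with a 10 ms frequency (the 10L alias, which newer pandas versions deprecate in favour of 10ms), take the mean of each bin, and then interpolate: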

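# If necessary, convert the column to datetimes first
# df['ephoc_as_datatime'] = pd.to_datetime(df['ephoc_as_datatime'])

# '10ms' is the current spelling of the older '10L' alias (10 milliseconds)
df_out = df.resample('10ms', on='ephoc_as_datatime').mean().interpolate()
print(df_out.head(20))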
                          att1   att2
ephoc_as_datatime                    
2000-01-01 11:22:37.130  0.500  4.500
2000-01-01 11:22:37.140  2.000  6.000
2000-01-01 11:22:37.150  3.000  7.000
2000-01-01 11:22:37.160  4.000  8.000
2000-01-01 11:22:37.170  4.075  7.875
2000-01-01 11:22:37.180  4.150  7.750
2000-01-01 11:22:37.190  4.225  7.625
2000-01-01 11:22:37.200  4.300  7.500
2000-01-01 11:22:37.210  4.375  7.375
2000-01-01 11:22:37.220  4.450  7.250
2000-01-01 11:22:37.230  4.525  7.125
2000-01-01 11:22:37.240  4.600  7.000
2000-01-01 11:22:37.250  4.675  6.875
2000-01-01 11:22:37.260  4.750  6.750
2000-01-01 11:22:37.270  4.825  6.625
2000-01-01 11:22:37.280  4.900  6.500
2000-01-01 11:22:37.290  4.975  6.375
2000-01-01 11:22:37.300  5.050  6.250
2000-01-01 11:22:37.310  5.125  6.125
2000-01-01 11:22:37.320  5.200  6.000

Detail (the per-bin means before interpolation; bins with no samples are NaN):

print(df.resample('10ms', on='ephoc_as_datatime').mean().head(20))
                         att1  att2
ephoc_as_datatime                  
2000-01-01 11:22:37.130   0.5   4.5
2000-01-01 11:22:37.140   2.0   6.0
2000-01-01 11:22:37.150   3.0   7.0
2000-01-01 11:22:37.160   4.0   8.0
2000-01-01 11:22:37.170   NaN   NaN
2000-01-01 11:22:37.180   NaN   NaN
2000-01-01 11:22:37.190   NaN   NaN
2000-01-01 11:22:37.200   NaN   NaN
2000-01-01 11:22:37.210   NaN   NaN
2000-01-01 11:22:37.220   NaN   NaN
2000-01-01 11:22:37.230   NaN   NaN
2000-01-01 11:22:37.240   NaN   NaN
2000-01-01 11:22:37.250   NaN   NaN
2000-01-01 11:22:37.260   NaN   NaN
2000-01-01 11:22:37.270   NaN   NaN
2000-01-01 11:22:37.280   NaN   NaN
2000-01-01 11:22:37.290   NaN   NaN
2000-01-01 11:22:37.300   NaN   NaN
2000-01-01 11:22:37.310   NaN   NaN
2000-01-01 11:22:37.320   NaN   NaN
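
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The data is copied from the question. The names binned and filled are only illustrative, and the frequency is spelled '10ms' on the assumption that a recent pandas is installed. Note that resample puts each timestamp into the 10 ms bin that starts at or before it, so 11:22:37.138 is averaged into the 11:22:37.130 bin instead of being rounded up to 11:22:37.140 as in the expected output in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ephoc_as_datatime": pd.to_datetime([
        "2000-01-01 11:22:37.130",
        "2000-01-01 11:22:37.138",
        "2000-01-01 11:22:37.149",
        "2000-01-01 11:22:37.156",
        "2000-01-01 11:22:37.165",
        "2000-01-01 11:22:37.168",
        "2000-01-01 11:22:37.169",
        "2000-01-01 11:22:37.567",
        "2000-01-01 11:22:38.120",
    ]),
    "att1": [0, 1, 2, 3, 4, 5, 3, 7, 8],
    "att2": [4, 5, 6, 7, 8, 9, 7, 3, 4],
})

# Step 1: average all samples that fall into the same 10 ms bin.
# Bins that contain no samples come out as NaN.
binned = df.resample("10ms", on="ephoc_as_datatime").mean()

# Step 2: fill the empty bins by linear interpolation between
# the nearest non-empty bins on either side.
filled = binned.interpolate()

print(binned.head(6))
print(filled.head(6))
```

The interpolation here is the default linear one, which is equivalent to interpolating in time because the resampled index is evenly spaced.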

Upvotes: 2
