Reputation: 605
We have a sensor that records 'x' whenever it is triggered by another sensor. This means the observations are made at random times and at a random frequency within each hour. Here is what the data looks like:
> df
date time x
1/1/2018 00:24:12 10
1/1/2018 00:47:17 14
1/1/2018 1:17:11 12
1/1/2018 1:34:34 17
1/1/2018 1:52:23 15
1/1/2018 2:10:59 12
and so on until 31/1/2018. To compare it with another dataset, I want to find the value recorded at the time nearest to each hour mark. For example:
date time x
1/1/2018 00 10
1/1/2018 01 14 (since 00:47:17 is 13 minutes before 01:00, whereas 1:17:11 is 17 minutes after it)
1/1/2018 02 15
Upvotes: 1
Views: 140
Reputation: 862406
First create a DatetimeIndex, then build an hourly date_range with Timestamp.floor, and finally use DataFrame.reindex with method='nearest':
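import pandas as pd

# The dates are day-first (they run to 31/1/2018), so parse them with dayfirst=True
df.index = pd.to_datetime(df['date'] + ' ' + df['time'], dayfirst=True)
# Hourly grid from the first to the last floored hour (pandas 2.2+ prefers 'h' over 'H')
rng = pd.date_range(df.index.min().floor('H'), df.index.max().floor('H'), freq='H')
df = df.reindex(rng, method='nearest')
print(df)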
date time x
2018-01-01 00:00:00 1/1/2018 00:24:12 10
2018-01-01 01:00:00 1/1/2018 00:47:17 14
2018-01-01 02:00:00 1/1/2018 1:52:23 15
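For anyone who wants to try this without the original file, here is a minimal self-contained sketch that rebuilds the six sample rows from the question and applies the same reindex step. The day-first parsing is an assumption based on the data running to 31/1/2018, and the variable name `nearest` is only illustrative.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    'date': ['1/1/2018'] * 6,
    'time': ['00:24:12', '00:47:17', '1:17:11', '1:34:34', '1:52:23', '2:10:59'],
    'x': [10, 14, 12, 17, 15, 12],
})

# Assumption: dates are day/month/year, since the data runs to 31/1/2018
df.index = pd.to_datetime(df['date'] + ' ' + df['time'], dayfirst=True)

# Hourly grid from the first to the last floored hour
# ('h' is the hour alias; uppercase 'H' is deprecated in pandas 2.2+)
rng = pd.date_range(df.index.min().floor('h'), df.index.max().floor('h'), freq='h')

# For each hour mark, pick the row whose timestamp is closest to it
nearest = df.reindex(rng, method='nearest')
print(nearest['x'].tolist())  # [10, 14, 15]
```

Note that reindex with method='nearest' needs a monotonic index, so sort the frame with df.sort_index() first if the readings are not already in time order.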
Finally, if necessary, remove the DatetimeIndex:
df = df.reindex(rng, method='nearest').reset_index(drop=True)
print (df)
date time x
0 1/1/2018 00:24:12 10
1 1/1/2018 00:47:17 14
2 1/1/2018 1:52:23 15
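If the goal is the layout shown in the question (date, hour, x), one possible sketch is to keep the hourly index and derive the date and hour columns from it instead of dropping it. The names `hourly` and `out` are illustrative, the day-first parsing is again an assumption, and strftime zero-pads the date, so it comes out as 01/01/2018 rather than 1/1/2018.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    'date': ['1/1/2018'] * 6,
    'time': ['00:24:12', '00:47:17', '1:17:11', '1:34:34', '1:52:23', '2:10:59'],
    'x': [10, 14, 12, 17, 15, 12],
})

# Assumption: dates are day/month/year
df.index = pd.to_datetime(df['date'] + ' ' + df['time'], dayfirst=True)
rng = pd.date_range(df.index.min().floor('h'), df.index.max().floor('h'), freq='h')
hourly = df.reindex(rng, method='nearest')

# Take date and hour from the hour marks, and x from the nearest reading
out = pd.DataFrame({
    'date': hourly.index.strftime('%d/%m/%Y'),
    'time': hourly.index.strftime('%H'),
    'x': hourly['x'].to_numpy(),
})
print(out)
```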
Upvotes: 1