Reputation: 2253
I have the following df
dates Final
2020-01-01 00:15:00 94.7
2020-01-01 00:30:00 94.1
2020-01-01 00:45:00 94.1
2020-01-01 01:00:00 95.0
2020-01-01 01:15:00 96.6
2020-01-01 01:30:00 98.4
2020-01-01 01:45:00 99.8
2020-01-01 02:00:00 99.8
2020-01-01 02:15:00 98.0
2020-01-01 02:30:00 95.1
2020-01-01 02:45:00 91.9
2020-01-01 03:00:00 89.5
The entire dataset is till 2021-01-01 00:00:00 95.6
with a gap of 15mins.
Since the freq is 15mins, I would like to change it to 1 hour and maybe drop the middle values
Expected output
dates Final
2020-01-01 01:00:00 95.0
2020-01-01 02:00:00 99.8
2020-01-01 03:00:00 89.5
With the last row being 2021-01-01 00:00:00 95.6
How can this be done?
Thanks
Upvotes: 3
Views: 658
Reputation: 328
If you're doing data analysis or data science I don't think dropping the middle values is a good approach at all! You should sum them I guess (I don't know about your use case but I know some stuff about Time Series data).
Upvotes: 0
Reputation: 30920
Use Series.dt.minute
to performance a boolean indexing
:
df_filtered = df.loc[df['dates'].dt.minute.eq(0)]
#if necessary
#df_filtered = df.loc[pd.to_datetime(df['dates']).dt.minute.eq(0)]
print(df_filtered)
dates Final
3 2020-01-01 01:00:00 95.0
7 2020-01-01 02:00:00 99.8
11 2020-01-01 03:00:00 89.5
Upvotes: 3