Reputation: 367
I have a table with minute data. Some rows are missing in this table.
import pandas as pd
df = pd.DataFrame([
("2019-02-15 17:56:00", 11, 10),
("2019-02-15 17:57:00", 22, 10),
("2019-02-15 18:00:00", 33, 20),
("2019-02-15 18:01:00", 44, 20),
("2019-02-15 18:02:00", 55, 20),
("2019-02-16 10:01:00", 66, 50),
("2019-02-16 10:02:00", 77, 50)],
columns=["date_time", "last_val", "sum_val"])
I need to get a table with hourly data, where the last_val
column holds the last value within each hour and the sum_val
column holds the sum of all values in that hour.
To make the resulting table look like this:
Upvotes: 1
Views: 1100
Reputation: 323386
Your group key is not obvious at first. You need to slice date_time
down to the hour and convert it back to a datetime object, then just
groupby it with agg.
# date_time holds strings in the sample frame, so parse it before using .dt
groupkey=pd.to_datetime(pd.to_datetime(df.date_time).dt.strftime('%Y-%m-%d %H'))
df.groupby(groupkey).agg({'last_val':'last','sum_val':'sum'})
Out[141]:
                     last_val  sum_val
date_time
2019-02-15 17:00:00        22       20
2019-02-15 18:00:00        55       60
2019-02-16 10:00:00        77      100
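For comparison, here is a minimal sketch of the same grouping that skips the string round trip. It is a swapped-in variant, not the code from the answer above: it assumes date_time starts out as strings (as in the question's sample frame) and builds the key with dt.floor instead of strftime. The names hour_key and hourly are only illustrative.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame([
    ("2019-02-15 17:56:00", 11, 10),
    ("2019-02-15 17:57:00", 22, 10),
    ("2019-02-15 18:00:00", 33, 20),
    ("2019-02-15 18:01:00", 44, 20),
    ("2019-02-15 18:02:00", 55, 20),
    ("2019-02-16 10:01:00", 66, 50),
    ("2019-02-16 10:02:00", 77, 50)],
    columns=["date_time", "last_val", "sum_val"])

# Assumption: date_time is stored as strings, so parse it first.
df["date_time"] = pd.to_datetime(df["date_time"])

# Truncate every timestamp to the start of its hour and use that as the key.
hour_key = df["date_time"].dt.floor("h")

# Last value per hour for last_val, total per hour for sum_val.
hourly = df.groupby(hour_key).agg({"last_val": "last", "sum_val": "sum"})
print(hourly)
```

Grouping on the floored key only produces rows for hours that actually appear in the data. df.resample("h", on="date_time") would also work, but it emits a row for every empty hour in between as well.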
Upvotes: 2